This project implements a simple sentiment analysis classifier using logistic regression, trained using manual gradient descent, without relying on any machine learning libraries like scikit-learn or TensorFlow. It demonstrates end-to-end model development from raw text to evaluation.

waqarmunawar7/-Sentiment-Analysis-from-Scratch-No-ML-Libraries-

🧠 Sentiment Analysis from Scratch (No ML Libraries)


📂 Project Structure


├── tweets.txt        # Your labeled tweet dataset
├── sentiment_analysis.py
└── README.md

✅ Features

  • 📦 Logistic Regression from scratch
  • 📉 Manual Gradient Descent for weight updates
  • 🧹 Text Preprocessing & Word Frequency Vectorization
  • 📊 Loss vs Epochs graph
  • 🔁 Parameter Convergence plots
  • 📋 Evaluation with Confusion Matrix & Custom Metrics

📝 Input Format

The dataset file tweets.txt should contain tweets in the following format:

I love this product || Positive  
This is the worst thing ever || Negative  

Each line contains a tweet and its sentiment label (Positive or Negative) separated by ||.
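
A minimal sketch of how such a file could be parsed, assuming labels are matched case-insensitively and mapped to 1 (Positive) and 0 (Negative). The names `parse_line` and `load_dataset` are illustrative and not necessarily those used in sentiment_analysis.py.

```python
def parse_line(line, sep="||"):
    """Split one 'tweet || label' line into (text, label).

    Assumption: Positive maps to 1 and Negative maps to 0.
    """
    # rpartition splits on the LAST separator, so a tweet that itself
    # contains "||" still keeps its full text.
    text, _, label = line.rpartition(sep)
    label = label.strip().lower()
    if not text.strip() or label not in ("positive", "negative"):
        raise ValueError(f"malformed line: {line!r}")
    return text.strip(), 1 if label == "positive" else 0


def load_dataset(lines):
    """Parse an iterable of lines, skipping blank ones."""
    return [parse_line(line) for line in lines if line.strip()]
```

With a real file this could be called as `load_dataset(open("tweets.txt", encoding="utf-8"))`; any iterable of strings works.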


🧮 Time Complexity

| Component | Complexity | Description |
|---|---|---|
| Vocabulary Build | O(N × L) | N = # of tweets, L = avg. words per tweet |
| Vectorization | O(N × V) | V = vocabulary size |
| Training Loop | O(E × N × V) | E = # of epochs (includes gradient computation) |
| Evaluation | O(N × V) | Same as vectorization for test set |
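
The vocabulary build and vectorization steps might look like the sketch below. It assumes the two features `pos_freq` and `neg_freq` are the summed counts of a tweet's words in positive and in negative training tweets; the tokenizer and the function names are illustrative choices, not taken from the script.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(dataset):
    """Count word occurrences per class; dataset is a list of (text, label)."""
    pos_counts, neg_counts = Counter(), Counter()
    for text, label in dataset:
        # label 1 = Positive, 0 = Negative (assumed encoding)
        (pos_counts if label == 1 else neg_counts).update(tokenize(text))
    return pos_counts, neg_counts


def vectorize(text, pos_counts, neg_counts):
    """Map a tweet to (pos_freq, neg_freq); unseen words contribute 0."""
    tokens = tokenize(text)
    pos_freq = sum(pos_counts[t] for t in tokens)
    neg_freq = sum(neg_counts[t] for t in tokens)
    return pos_freq, neg_freq
```

Because this sketch uses dictionary lookups, its per-tweet cost depends on the tweet length rather than on V. The O(N × V) figure in the table would apply to a dense vector with one entry per vocabulary word.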

🧠 Model Overview

We use a logistic regression model where:

$$\mathrm{sigmoid}(z) = \frac{1}{1 + \exp(-z)}, \qquad z = \mathit{bias} + w_1 \cdot \mathit{pos\_freq} + w_2 \cdot \mathit{neg\_freq}$$
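
As a rough illustration, the forward pass can be written in a few lines of plain Python. The overflow-safe form of the sigmoid and the 0.5 decision threshold are assumptions made for this sketch, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(pos_freq, neg_freq, w1, w2, bias):
    """Probability that a tweet is Positive under the model above."""
    return sigmoid(bias + w1 * pos_freq + w2 * neg_freq)


def predict(pos_freq, neg_freq, w1, w2, bias, threshold=0.5):
    """Hard label: 1 (Positive) if the probability reaches the threshold."""
    return 1 if predict_proba(pos_freq, neg_freq, w1, w2, bias) >= threshold else 0
```

For example, with w1=0.45, w2=-0.27, bias=0.62 and a hypothetical tweet with pos_freq=3 and neg_freq=1, z = 0.62 + 1.35 - 0.27 = 1.70, which gives a probability of about 0.85.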

Gradient Descent Weight Update:

error = predicted - actual
w1 -= learning_rate * error * pos_freq
w2 -= learning_rate * error * neg_freq
bias -= learning_rate * error
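
Put together, the update rule can be wrapped in a training loop along the following lines. This is a sketch under stated assumptions: weights start at zero, updates are applied per sample, and the recorded loss is the mean log loss per epoch. The default `learning_rate` and `epochs` values are placeholders, not the script's settings.

```python
import math


def sigmoid(z):
    """Overflow-safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(features, labels, learning_rate=0.01, epochs=100):
    """Per-sample gradient descent; returns (w1, w2, bias, losses)."""
    w1 = w2 = bias = 0.0
    losses = []
    eps = 1e-12  # keeps log() away from 0
    for _ in range(epochs):
        total = 0.0
        for (pos_freq, neg_freq), actual in zip(features, labels):
            predicted = sigmoid(bias + w1 * pos_freq + w2 * neg_freq)
            # Log loss for this sample, accumulated for the epoch average
            total -= (actual * math.log(predicted + eps)
                      + (1 - actual) * math.log(1 - predicted + eps))
            # Same update rule as above
            error = predicted - actual
            w1 -= learning_rate * error * pos_freq
            w2 -= learning_rate * error * neg_freq
            bias -= learning_rate * error
        losses.append(total / len(features))
    return w1, w2, bias, losses
```

The returned `losses` list holds one value per epoch, which is the series a Loss vs Epochs plot needs.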

📈 Output Graphs

  • 📉 Loss vs Epochs: Shows how training error decreases over time
  • 📐 Parameter Convergence: Plots w1, w2, and bias vs loss with circle markers for better interpretability
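
Both graphs could be produced with matplotlib roughly as follows. The `history` dictionary (one value per epoch under the keys `loss`, `w1`, `w2`, `bias`), the output file name, and the choice of putting parameter values on the x-axis are all assumptions of this sketch.

```python
def plot_training_curves(history, path="training_curves.png"):
    """Save a Loss vs Epochs plot and a parameter-vs-loss plot to `path`."""
    n = len(history["loss"])
    if any(len(history[key]) != n for key in ("w1", "w2", "bias")):
        raise ValueError("history lists must all have one entry per epoch")

    # Imported here so the check above works even without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display required
    import matplotlib.pyplot as plt

    fig, (ax_loss, ax_params) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(list(range(1, n + 1)), history["loss"])
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.set_title("Loss vs Epochs")

    for name in ("w1", "w2", "bias"):
        # Circle markers, one point per epoch
        ax_params.plot(history[name], history["loss"], marker="o", label=name)
    ax_params.set_xlabel("parameter value")
    ax_params.set_ylabel("loss")
    ax_params.set_title("Parameter Convergence")
    ax_params.legend()

    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path
```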

📊 Evaluation Metrics

Evaluation is done using a custom implementation, without any external libraries:

  • ✅ Accuracy
  • 🔍 Precision
  • 🎯 Recall
  • 🧮 F1 Score
  • 🧮 Confusion Matrix
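
These metrics need nothing beyond basic arithmetic. The sketch below assumes 0/1 labels and a matrix laid out as [[TN, FP], [FN, TP]] (rows = actual class, columns = predicted class); the function names are illustrative.

```python
def confusion_matrix(actual, predicted):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1  # row = actual class, column = predicted class
    return matrix


def compute_metrics(matrix):
    """Accuracy, precision, recall, and F1 for the Positive class."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Under this layout, the matrix [[7, 2], [1, 10]] gives accuracy 0.85, precision 0.83, recall 0.91, and F1 0.87 after rounding to two decimals.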

🛠️ Requirements

  • Python 3.6+
  • matplotlib (for plotting)
pip install matplotlib

🚀 How to Run

python sentiment_analysis.py

🧪 Example Output

Weights: w1=0.45, w2=-0.27, bias=0.62
Confusion Matrix:
[[7, 2],
 [1, 10]]
Accuracy: 0.85
Precision: 0.83
Recall: 0.91
F1 Score: 0.87

🙌 Credits

Created by Waqar
Inspired by hands-on ML principles and low-level learning

