📚 LibSenti: Library Review Sentiment Predictor & Analyst

LibSenti is an end-to-end, AI-powered application that leverages machine learning and natural language processing (NLP) to classify sentiment polarity (Positive, Neutral, or Negative) in student-submitted reviews of IIT and NIT libraries. It pairs a robust backend sentiment analysis engine with a visually rich, interactive Streamlit application for real-time review exploration, data insights, and institutional benchmarking.


🚀 Key Features

  • ✅ Real-time Review Prediction (📝 Sentiment Predictor Tab)
    Users can input custom reviews to receive live sentiment predictions with confidence probabilities and visual feedback.

  • ✅ Unigram WordCloud Visualization (🔤 Unigram WordClouds Tab)
    Generates wordclouds for individual institutions using the most frequent single-word terms found in reviews.

  • ✅ Bigram WordCloud Comparison (🔗 Bigram WordClouds Tab)
    Displays a two-institution comparison of the most common word pairs (bigrams) extracted from reviews.

  • ✅ Sentiment Pie Chart Comparison (🔍 Pie Chart Comparison Tab)
    Side-by-side sentiment distribution pie charts for any two selected institutions, including precise percentage labels.

  • ✅ IIT vs NIT Sentiment Analysis (📊 IIT vs NIT Chart Tab)
    Presents a consolidated sentiment comparison chart contrasting IITs and NITs at a glance.

  • ✅ Library Experience Highlights (🌟 Library Experiences Tab)
    Displays standout user-submitted reviews, both best and worst experiences, curated by sentiment and length.
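The unigram and bigram wordclouds are driven by n-gram frequency counts over the review text. A minimal sketch of that counting step (the regex tokenizer below is an illustrative assumption, not the project's actual preprocessing):

```python
import re
from collections import Counter

def top_ngrams(reviews, n=2, k=5):
    """Count the k most frequent n-grams across a list of review strings."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z']+", review.lower())  # crude word tokenizer
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts.most_common(k)

reviews = [
    "reading room is spacious and quiet",
    "the reading room closes too early",
]
print(top_ngrams(reviews, n=2, k=2))  # ('reading', 'room') tops the bigram list
```

The resulting frequency dictionary is exactly the kind of input the wordcloud library expects for rendering.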


🧠 Model Details

  • Architecture: BERTForSequenceClassification
  • Dataset: IIT & NIT library reviews (labeled Positive, Neutral, Negative)
  • Frameworks: PyTorch, Transformers (HuggingFace)
  • Accuracy: ~96% on test data
  • Label Distribution Handling: Threshold-based for confident classification
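The classifier head emits one logit per class; a softmax turns those into the confidence probabilities shown in the app, and the threshold step accepts the top class only when its probability is high enough. A sketch of that decision logic (the 0.5 threshold and the fall-back to Neutral are illustrative assumptions; the 0/1/2 label mapping matches the table below):

```python
import math

LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}

def predict_label(logits, threshold=0.5):
    """Softmax the raw logits, then return the top label only if its
    probability clears the confidence threshold (else fall back to Neutral)."""
    exps = [math.exp(x - max(logits)) for x in logits]  # numerically stable softmax
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return (LABELS[best] if probs[best] >= threshold else LABELS[1]), probs

label, probs = predict_label([-1.2, 0.3, 2.9])
print(label)  # → Positive
```

In the real app, the logits would come from the fine-tuned BERTForSequenceClassification forward pass rather than being hand-written.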

๐Ÿ“ Project Structure

LibSenti/
├── assets/
│   ├── wordclouds/                          # Wordcloud PNGs for each IIT/NIT
│   └── iit_vs_nit_sentiment_comparison.png
├── saved_model/                             # Trained BERT model and tokenizer
├── app.py                                   # Main Streamlit application
├── train_model.py                           # Script to train the BERT model
├── cleaned_iit+nit_library_reviews.csv
├── sentiment_iit_library_reviews.csv
└── README.md                                # This file


๐Ÿ” Sentiment Categories

Label    | Class
---------|------
Negative | 0
Neutral  | 1
Positive | 2

Class imbalance is addressed using weighted loss during training and probability thresholds during prediction.
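A common way to derive the loss weights is inverse-frequency scaling, so under-represented classes contribute more to the loss. A sketch with made-up label counts (not the dataset's actual distribution):

```python
def class_weights(label_counts):
    """Inverse-frequency weights: weight = total / (num_classes * count)."""
    total = sum(label_counts.values())
    n = len(label_counts)
    return {label: total / (n * count) for label, count in label_counts.items()}

counts = {"Negative": 100, "Neutral": 50, "Positive": 350}  # hypothetical counts
weights = class_weights(counts)
print(weights)  # the rarest class (Neutral) receives the largest weight
```

These weights are what would be handed to a weighted cross-entropy loss during fine-tuning.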


🧠 Model Training

  • Base Model: bert-base-uncased
  • Framework: Hugging Face Transformers + PyTorch
  • Strategy: Fine-tuned with weighted cross-entropy loss for class imbalance
  • Threshold logic for better handling of imbalanced classes
  • Trained using Trainer API with evaluation metrics and early stopping

Run training script:

python train_model.py

This script handles:

  • Preprocessing
  • Tokenization
  • Model fine-tuning
  • Class weight balancing
  • Model saving to ./saved_model/
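With the HuggingFace Trainer, weighted cross-entropy is typically wired in by overriding the loss computation, but the underlying arithmetic is just a per-class scaling of the negative log-likelihood. A minimal sketch with made-up numbers (the probabilities and weights below are illustrative, not from the trained model):

```python
import math

def weighted_cross_entropy(probs, labels, weights):
    """Mean of -w[y] * log p[y] over a batch of predicted probability rows."""
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

probs = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]  # softmax outputs, one row per review
labels = [2, 0]                              # true classes (0=Neg, 1=Neu, 2=Pos)
weights = [1.7, 3.3, 0.5]                    # hypothetical per-class weights
print(round(weighted_cross_entropy(probs, labels, weights), 4))  # → 0.5234
```

Misclassifying a rare class (large weight) is penalized more than misclassifying the dominant one, which is the point of the balancing step.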

🎨 Streamlit Application

Launch the Application:

streamlit run app.py

🧩 Components:

  • 📥 Review Classifier (📝 Sentiment Predictor Tab)
    Enter any library review and instantly receive a sentiment prediction (Positive, Neutral, or Negative).

  • 📈 Sentiment Probabilities (📝 Sentiment Predictor Tab)
    Visualize the confidence scores for each sentiment using interactive progress bars to assess prediction certainty.

  • ☁️ WordCloud Comparator

    • 🔤 (Unigram WordClouds Tab): Select and compare two institutions to explore the most frequent individual keywords.
    • 🔗 (Bigram WordClouds Tab): Compare the most common two-word combinations to find phrase patterns in reviews.

  • 📊 Sentiment Pie Chart Comparison (🔍 Pie Chart Comparison Tab)
    Instantly loads side-by-side sentiment distribution charts for selected institutions for intuitive visual analysis.

  • 🧮 IIT vs NIT Overall Chart (📊 IIT vs NIT Chart Tab)
    A comparative sentiment distribution chart to analyze trends across all IITs vs NITs.

  • 🌟 Library Experience Highlights (🌟 Library Experiences Tab)
    Shows handpicked positive and negative user reviews with institution tags and styled formatting.


💡 Future Improvements

  • Add LIME/SHAP Explainability for BERT
    Integrate model interpretation techniques to explain why a review was labeled positive/negative.

  • Include More Institutions
    Expand dataset to cover regional universities, IIITs, NLUs, and other public libraries for broader benchmarking.

  • Review Metadata Integration
    Include attributes like review date, source, device, or student/staff tag for richer context and filtering.

  • Clustering or Topic Modeling
    Apply LDA/BERT-topic to identify trending topics or issues discussed across institutions.

  • Sentiment Timeline Analysis
    Show how sentiment for a specific institution evolves over time (e.g., semester-wise or pre/post renovation).

  • User Feedback Module
    Allow users to correct or rate the model's prediction to improve performance and trust.

  • Multilingual Support
    Add language detection and support for Hindi, Tamil, etc., using multilingual BERT (e.g., xlm-roberta-base).


๐Ÿ‘จโ€๐Ÿ’ป Author

Aman Srivastava [amansri345@gmail.com]
