The Reddit Sentiment Dashboard is an interactive web application that runs sentiment and emotion analysis on live Reddit posts using four pre-trained sentiment models and one emotion model. Built with Python, Streamlit, PRAW, and HuggingFace Transformers, it fetches posts on request, aggregates the results, and renders interactive visualizations. The project demonstrates how to combine pre-trained machine learning models with data visualization and rate-limited API ingestion.
- ✅ Enhanced Security: Environment-based credential management (no more hardcoded secrets!)
- ✅ Rate Limiting: Intelligent API rate limiting to prevent throttling
- ✅ Performance Caching: Sentiment analysis results are cached for faster response
- ✅ Better Error Handling: Comprehensive error messages and validation
- ✅ Input Validation: Subreddit validation and better input sanitization
- ✅ Progress Indicators: Real-time progress bars for long operations
- ✅ Improved UX: Fetch button control, better visual feedback
- ✅ Code Consolidation: Removed duplicate files for easier maintenance
- Multi-Model Sentiment Analysis: Compare results from 4 different NLP models
  - Cardiff RoBERTa (3-class: negative/neutral/positive)
  - DistilBERT (binary: positive/negative)
  - NLPTown BERT (5-star rating system)
  - BERTweet (binary sentiment, Twitter-optimized)
- Emotion Detection: Identifies 7 distinct emotions
  - Joy, Sadness, Anger, Fear, Surprise, Disgust, Neutral
- Interactive Dashboard: 4-tab interface
  - 📋 Dashboard: Post-level sentiment and emotion analysis
  - 📈 Temporal Analysis: Trend visualization over time
  - 💬 User Behavior: Comment engagement metrics
  - 🧠 Model Insights: Model performance comparison
- Data Export: Download analysis results as CSV
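
The CSV export listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual export code: the column names (`title`, `sentiment`, `emotion`, `score`) and the helper `results_to_csv` are hypothetical. In a Streamlit app, the returned text could be passed to `st.download_button`.

```python
import csv
import io

# Hypothetical column layout; the real dashboard may export different fields.
FIELDS = ["title", "sentiment", "emotion", "score"]


def results_to_csv(rows):
    """Serialize a list of result dicts to CSV text, header row included."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not listed in FIELDS.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = [
    {"title": "Great launch, loved it", "sentiment": "positive", "emotion": "joy", "score": 0.97},
    {"title": "Servers down again", "sentiment": "negative", "emotion": "anger", "score": 0.91},
]
csv_text = results_to_csv(sample)
```

Titles that contain commas are quoted automatically by the `csv` module, so post titles do not need manual escaping.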
- Python 3.8 or higher
- Reddit API credentials (free)
- 4GB+ RAM recommended (for transformer models)
- Internet connection
git clone https://github.com/yourusername/Reddit-Sentiment-Dashboard.git
cd Reddit-Sentiment-Dashboard
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
pip install -r requirements.txt
Note: Initial installation may take 5-10 minutes as it downloads ~1.4GB of dependencies, including transformer models.
- Get Reddit API credentials (free):
  - Go to https://www.reddit.com/prefs/apps
  - Click "create app" or "create another app"
  - Fill in the form:
    - name: YourAppName
    - app type: Select "script"
    - description: (optional)
    - about url: (optional)
    - redirect uri: http://localhost:8080
  - Click "create app"
  - Note your `client_id` (under the app name) and `client_secret`
- Create `.env` file in the project root:
# Copy the example file
cp .env.example .env
- Edit `.env` and add your credentials:
REDDIT_CLIENT_ID=your_client_id_here
REDDIT_CLIENT_SECRET=your_client_secret_here
REDDIT_USER_AGENT=YourAppName:v1.0 (by /u/YourRedditUsername)
Never commit the .env file to version control. It's already in .gitignore.
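
As a rough sketch of how these three variables might be consumed, the code below validates them and builds a read-only PRAW client. It assumes the values from `.env` have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`; whether the project uses that loader is not stated here). The helper names `load_reddit_credentials` and `make_reddit_client` are hypothetical.

```python
import os

REQUIRED_VARS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def load_reddit_credentials(env=None):
    """Return the three Reddit settings, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Reddit API credentials not found: " + ", ".join(missing))
    return {name: env[name].strip() for name in REQUIRED_VARS}


def make_reddit_client(creds):
    """Build a read-only PRAW client from validated credentials."""
    import praw  # imported lazily so the validation helper works without PRAW

    return praw.Reddit(
        client_id=creds["REDDIT_CLIENT_ID"],
        client_secret=creds["REDDIT_CLIENT_SECRET"],
        user_agent=creds["REDDIT_USER_AGENT"],
    )
```

Failing early with the names of the missing variables makes a misconfigured `.env` file easier to diagnose than a later authentication error from the API.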
streamlit run main_code.py
The dashboard will open automatically in your default browser at http://localhost:8501
- Select Subreddit: Choose from popular subreddits or enter a custom one
- Configure Settings:
  - Number of posts (1-50)
  - Sentiment model (single or all 4 models)
- Click "Fetch & Analyze Posts": Start the analysis
- Explore Tabs:
  - View individual post analysis
  - Analyze trends over time
  - Check comment engagement
  - Compare model performance
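
The two inputs in the steps above (a subreddit name and a post count between 1 and 50) can be checked before any API call is made. The sketch below is illustrative only. The naming rule in `SUBREDDIT_RE` (3 to 21 letters, digits, or underscores, not starting with an underscore) is an assumption about Reddit's conventions, and both helper names are hypothetical. Passing this check does not confirm that the subreddit exists.

```python
import re

# Assumed naming rule, not taken from the project: 3-21 characters,
# letters/digits/underscores, and no leading underscore.
SUBREDDIT_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_]{2,20}$")
MIN_POSTS, MAX_POSTS = 1, 50  # range stated in the settings above


def clean_subreddit(raw):
    """Strip whitespace and an optional 'r/' or '/r/' prefix, then validate."""
    name = raw.strip()
    lowered = name.lower()
    if lowered.startswith("/r/"):
        name = name[3:]
    elif lowered.startswith("r/"):
        name = name[2:]
    if not SUBREDDIT_RE.match(name):
        raise ValueError(f"Invalid subreddit name: {raw!r}")
    return name


def clamp_post_count(value):
    """Force the requested post count into the supported 1-50 range."""
    return max(MIN_POSTS, min(MAX_POSTS, int(value)))
```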
Brand Monitoring:
Subreddit: technology
Posts: 25
Model: All
Goal: Track sentiment about your product
Market Research:
Subreddit: investing
Posts: 50
Model: Cardiff
Goal: Gauge market sentiment
Community Health:
Subreddit: mentalhealth
Posts: 20
Model: All
Goal: Analyze emotional trends
| Component | Technology | Purpose |
|---|---|---|
| Frontend | Streamlit | Interactive web interface |
| API Client | PRAW | Reddit data ingestion |
| ML Framework | PyTorch | Neural network backend |
| NLP Models | HuggingFace Transformers | Pre-trained models |
| Data Processing | Pandas | Data manipulation |
| Visualization | Plotly | Interactive charts |
Reddit-Sentiment-Dashboard/
├── main_code.py # Main application (v2.0 - enhanced)
├── config.py # Configuration (env-based)
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
├── .gitignore # Git ignore rules
├── README.md # This file
├── LICENSE # MIT License
└── app/
├── multi_model_sentiment.py # 4-model sentiment analysis
├── emotion_detector.py # Emotion detection
└── sentiment_analyzer.py # Legacy single model
User Input → Reddit API (PRAW) → Post Fetching
↓
Multi-Model Analysis
↓ ↓ ↓
Sentiment Emotion Caching
↓ ↓ ↓
Pandas DataFrame Processing
↓
Interactive Visualizations
(Plotly Charts & Tables)
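
To make the processing stage of this flow concrete, the sketch below groups analyzed posts by calendar day and counts sentiment labels, which is the kind of aggregate a trend chart would plot. It uses the standard library as a stand-in for the Pandas step. The dict keys `created_utc` and `sentiment` and the function name are assumptions for illustration (`created_utc` mirrors the Unix timestamp attribute PRAW exposes on submissions).

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def daily_sentiment_counts(posts):
    """Group posts by UTC calendar day and count sentiment labels per day."""
    by_day = defaultdict(Counter)
    for post in posts:
        day = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc).date().isoformat()
        by_day[day][post["sentiment"]] += 1
    # Sort by date so the result is ready for a time-series chart.
    return {day: dict(counts) for day, counts in sorted(by_day.items())}


posts = [
    {"created_utc": 1704067200, "sentiment": "positive"},  # 2024-01-01 00:00 UTC
    {"created_utc": 1704070800, "sentiment": "negative"},  # 2024-01-01 01:00 UTC
    {"created_utc": 1704153600, "sentiment": "positive"},  # 2024-01-02 00:00 UTC
]
trend = daily_sentiment_counts(posts)
```

With Pandas, the same result would typically come from a `groupby` on a date column followed by `value_counts`.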
| Model | Type | Classes | Training Data | Best For |
|---|---|---|---|---|
| Cardiff RoBERTa | 3-class | neg/neu/pos | | Nuanced sentiment |
| DistilBERT | Binary | pos/neg | SST-2 | Clear polarity |
| NLPTown BERT | 5-star | 1-5 stars | Reviews | Rating-style |
| BERTweet | Binary | pos/neg | | Social media |
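
Because the four models use different label schemes, comparing them requires mapping their outputs onto one scale first. The following is a minimal sketch under stated assumptions: the raw label spellings in `LABEL_MAP` are guesses at what each model emits and should be checked against the model cards. Treating 3 stars as neutral is a design choice made here, not something the project specifies. The `consensus` helper is hypothetical.

```python
from collections import Counter

# Assumed raw label formats; verify against each model card before relying on them.
LABEL_MAP = {
    "negative": "negative", "neutral": "neutral", "positive": "positive",
    "neg": "negative", "neu": "neutral", "pos": "positive",
    "1 star": "negative", "2 stars": "negative",
    "3 stars": "neutral",  # design choice: the middle rating counts as neutral
    "4 stars": "positive", "5 stars": "positive",
}


def normalize_label(raw):
    """Map a model-specific label onto negative/neutral/positive."""
    key = raw.strip().lower()
    if key not in LABEL_MAP:
        raise KeyError(f"Unknown sentiment label: {raw!r}")
    return LABEL_MAP[key]


def consensus(labels_by_model):
    """Majority vote across models; returns 'mixed' when the top spot is tied."""
    votes = Counter(normalize_label(label) for label in labels_by_model.values())
    (top, top_count), *rest = votes.most_common()
    if rest and rest[0][1] == top_count:
        return "mixed"
    return top
```

For example, `consensus({"cardiff": "positive", "nlptown": "4 stars", "distilbert": "NEGATIVE"})` yields `"positive"`, since two of the three normalized votes agree.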
- Model: j-hartmann/emotion-english-distilroberta-base
- Emotions: anger, disgust, fear, joy, neutral, sadness, surprise
- Training: Multi-domain emotion dataset
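
A hedged sketch of how the emotion model named above could be queried through the Transformers `pipeline` API follows. The helper names `top_emotion` and `detect_emotion` are hypothetical and do not come from the project's `emotion_detector.py`. The model is only loaded when no classifier is passed in, so the scoring logic can be exercised without downloading anything.

```python
def top_emotion(scores):
    """Pick the highest-scoring entry from a list of {'label', 'score'} dicts."""
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


def detect_emotion(text, classifier=None):
    """Return (label, score) for the strongest emotion in `text`."""
    if classifier is None:
        # Lazy import: loading the pipeline downloads the model on first use.
        from transformers import pipeline

        classifier = pipeline(
            "text-classification",
            model="j-hartmann/emotion-english-distilroberta-base",
        )
    # A list input with top_k=None yields one list of per-label scores per text.
    scores = classifier([text], top_k=None)[0]
    return top_emotion(scores)
```

In an application, the classifier would normally be created once and passed in on every call, rather than rebuilt per request as the default branch does here.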
- Sentiment and emotion analysis results are cached using Streamlit's `@st.cache_data`
- Reddit client initialization is cached with `@st.cache_resource`
- Significantly reduces computation time for repeated analyses
- Automatic rate limiting prevents Reddit API throttling
- Default: 30 calls per 60 seconds (conservative for safety)
- Visual warnings when approaching limits
- Models loaded once at startup
- Shared across all analysis requests
- ~1.5-2GB RAM footprint for all 5 models
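
The caching and load-once behaviour described in this list can be illustrated with a toy model. This is a sketch under assumptions, not the project's code: `load_model` and `analyze` are hypothetical names, the "model" is a trivial keyword check, and a `functools.lru_cache` fallback is included only so the example also runs where Streamlit is not installed.

```python
from functools import lru_cache

try:
    import streamlit as st

    cache_data = st.cache_data
    cache_resource = st.cache_resource
except ImportError:
    # Fallback so the sketch also runs outside a Streamlit environment.
    def cache_data(func):
        return lru_cache(maxsize=None)(func)

    cache_resource = cache_data

CALLS = {"load": 0, "analyze": 0}  # counters that make the caching visible


@cache_resource
def load_model():
    """Stand-in for an expensive model load; the body runs once per process."""
    CALLS["load"] += 1
    return lambda text: "positive" if "good" in text.lower() else "negative"


@cache_data
def analyze(text):
    """Cached per input text, so repeated posts skip the model call."""
    CALLS["analyze"] += 1
    return load_model()(text)


first = analyze("Good update")
second = analyze("Good update")  # served from cache; the function body is skipped
```

The split mirrors the two decorators: `cache_resource` suits shared, unserializable objects such as models or API clients, while `cache_data` suits return values that can be copied per caller.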
1. "Reddit API credentials not found"
Solution: Ensure .env file exists with correct credentials
Check: REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET are set
2. "Failed to initialize Reddit client"
Solution: Verify credentials are correct
Check: Reddit app type is "script" not "web app"
3. "Rate limit approaching"
Solution: Wait for the indicated time
Note: This is normal and protects your API quota
4. "Module not found"
Solution: Reinstall dependencies
Command: pip install -r requirements.txt --upgrade
5. Out of memory errors
Solution: Reduce number of posts or close other applications
Requirement: 4GB+ RAM recommended
Enable verbose logging:
streamlit run main_code.py --logger.level=debug
Reddit API limits:
- Authenticated: 60 requests per minute
- Burst: 600 requests per 10 minutes
The dashboard implements conservative rate limiting (30 calls/min) to ensure smooth operation.
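
A conservative limit such as 30 calls per 60 seconds can be enforced with a sliding-window counter. The class below is a minimal sketch of that technique, not the dashboard's actual limiter. The name `SlidingWindowLimiter` and its methods are hypothetical, and the injectable `clock` exists only to make the behaviour easy to test.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling `period` seconds."""

    def __init__(self, max_calls=30, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def _prune(self, now):
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        self._prune(now)
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def try_acquire(self):
        """Record a call and return True, or return False if over the limit."""
        if self.wait_time() > 0:
            return False
        self.calls.append(self.clock())
        return True
```

A caller would check `try_acquire()` before each Reddit request and, when it returns `False`, either sleep for `wait_time()` seconds or surface a warning to the user.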
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Planned features for v3.0:
- Database persistence (PostgreSQL/MongoDB)
- Historical trend analysis
- Sentiment forecasting (ARIMA/LSTM)
- Multi-subreddit comparison
- Email alerts for sentiment thresholds
- GPU acceleration support
- Docker containerization
- RESTful API endpoint
- User authentication
This project is licensed under the MIT License - see the LICENSE file for details.
- HuggingFace for providing pre-trained transformer models
- PRAW developers for the excellent Reddit API wrapper
- Streamlit team for the amazing web framework
- Reddit community for providing valuable data
If you use this project in your research or work, please cite:
@software{reddit_sentiment_dashboard,
author = {DharmpratapSingh Vaghela},
title = {Reddit Sentiment Dashboard: Multi-Model Sentiment Analysis Tool},
year = {2024},
url = {https://github.com/yourusername/Reddit-Sentiment-Dashboard},
version = {2.0}
}
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Enhanced security with environment variables
- Added rate limiting and error handling
- Implemented performance caching
- Improved user experience
- Code consolidation and cleanup
- Initial release
- Basic multi-model sentiment analysis
- 4-tab dashboard interface
- Temporal and engagement analysis
Made with ❤️ using Python, Streamlit, and Transformers
⭐ If you find this project useful, please consider giving it a star on GitHub!