Skip to content

AI-powered Streamlit app for summarizing Reddit posts and comparing the outputs of two transformer models: BART and T5-Large

Notifications You must be signed in to change notification settings

Gabya06/content_summarization

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI Transformer Model Comparison: Reddit Post Summarization

This is an AI-powered Streamlit app for summarizing Reddit posts and comparing the outputs of two transformer models: BART and T5-Large. It uses Google Gemini to generate a concise summary and highlight differences between the model-generated summaries.

  • To learn more about the Hugging Face pipeline, BART and T5-Large transfomer models, you can follow along on Medium.com.

  • The Reddit API was used to generate the data for this repo; you can read about it here.


Features

  • Upload CSVs with BART and T5-Large summaries for Reddit posts
  • AI-generated summary & comparison using Google Gemini
  • Visualize sentiment (with emoji) and number of comments (progress bar)
  • Interactive UI built with Streamlit
  • Batch processing of multiple posts
  • Docker-ready for easy deployment

Project Structure

content_summarization/
│
├── app.py                  # Streamlit app entry point
├── summarize.py            # Summarization logic and ContentSummary class
├── requirements.txt        # Python dependencies
├── Dockerfile              # Docker build instructions
├── .env.example            # Example environment variables
├── README.md               # Project documentation
│
├── data/                   # Sample data
│
└── tests/                  # (TODO) Unit tests

Setup

1. Clone the repository

git clone https://github.com/gabya06/content_summarization.git
cd content_summarization

2. Install dependencies

pip install -r requirements.txt

3. Set up environment variables

Copy .env.example to .env and add your Google Gemini API key:

GEMINI_API_KEY=your-key-here

Usage

Run locally

streamlit run app.py
  • Upload your BART and T5-Large CSV files in the sidebar.
  • Select the number of posts to summarize.
  • Click Summarize and Compare to view results.

CSV Format

Your CSVs should have at least these columns:

  • title
  • cleaned_text
  • summary_bart (for BART CSV)
  • summary_t5 (for T5-Large CSV)
  • sentiment
  • num_comments

Docker Deployment

Build and run with Docker

docker build -t content-summarization .
docker run -p 8501:8501 --env-file .env content-summarization

Then open http://localhost:8501 in your browser.


Google Cloud

Check out the app in Google Cloud!

Customization

  • Edit summarize.py to change prompt logic or add more models.
  • Tweak app.py for UI changes or new visualizations.

Acknowledgements


About

AI-powered Streamlit app for summarizing Reddit posts and comparing the outputs of two transformer models: BART and T5-Large

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published