🔍 Trace: Media Literacy Pipeline

Trace is a transparency tool for analyzing conflicts of interest in research, journalism, and public discourse. Our app takes any URL (a YouTube video, news article, blog post, or research piece) and automatically surfaces the context behind the message. For every speaker, author, publisher, or organization mentioned, we extract:

- Who they are
- What they said
- Who they’re affiliated with
- Who funds them
- What organizations or movements they’re connected to

The goal is simple: give people the context they need to understand whether they can trust the information they’re consuming.

Instead of taking statements at face value, users get a structured breakdown of the entities behind the content and the influences that may shape their perspectives.

🚀 What the System Does

Given a URL, the pipeline:

1. Fetches and parses the content
   - YouTube transcripts
   - Article text
2. Extracts entities and their opinions
3. Runs SERP searches for each entity
4. Builds a structured context for Gemini
5. Extracts affiliations using a strict JSON schema
6. Validates the output
7. Returns a clean JSON object combining opinions + affiliations

This is a fully automated, end‑to‑end extraction system.
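
The first step depends on telling YouTube links apart from ordinary articles. The sketch below shows one way that dispatch could look using only the standard library. The function names `classify_url` and `youtube_video_id` and the host list are assumptions made for illustration; the actual helpers in `urls.py` and `youtube.py` may be organized differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical host list; the real project may recognize more or fewer hosts.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_url(url: str) -> str:
    """Return "youtube" for YouTube links and "article" for everything else."""
    host = (urlparse(url).hostname or "").lower()
    return "youtube" if host in YOUTUBE_HOSTS else "article"


def youtube_video_id(url: str) -> Optional[str]:
    """Pull the video ID out of a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        # Watch links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

A caller would route `"youtube"` results to transcript extraction and `"article"` results to article text extraction.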

📁 Project Structure

backend/
  services/
    affiliations/
      context.py        # SERP → Gemini → affiliation extraction
      serp.py           # SERP API wrapper
      validate.py       # JSON schema validation
      __init__.py
    article.py          # Article text extraction
    opinion.py          # Entity + opinion extraction
    score.py            # Pipeline orchestrator (URL → final JSON)
    urls.py             # URL helpers
    youtube.py          # YouTube transcript extraction

🧠 Pipeline Flow

1. Opinion Extraction (`opinion.py`)

Takes a URL and returns:

{
  "url": "...",
  "entities": {
    "Speaker Name": {
      "isPerson": true,
      "opinions": ["...", "..."]
    },
    "Platform Name": {
      "isPerson": false,
      "opinions": []
    }
  }
}
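
For code that consumes this result, the shape above can be written down as type hints. This is only a sketch: the `EntityOpinions` and `OpinionResult` names and the `people_with_opinions` helper are hypothetical and do not come from `opinion.py`.

```python
from typing import Dict, List, TypedDict


class EntityOpinions(TypedDict):
    isPerson: bool
    opinions: List[str]


class OpinionResult(TypedDict):
    url: str
    entities: Dict[str, EntityOpinions]


def people_with_opinions(result: OpinionResult) -> List[str]:
    """List entities that are people and have at least one recorded opinion."""
    return [
        name
        for name, entity in result["entities"].items()
        if entity["isPerson"] and entity["opinions"]
    ]
```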

2. Affiliation Extraction (`affiliations/context.py`)

For each entity:

1. Performs a SERP search
2. Builds a structured context
3. Sends it to Gemini
4. Enforces a strict JSON schema
5. Validates the output

Returns:

{
  "personal_affiliations": [...],
  "political_affiliations": [...],
  "financial_affiliations": [...],
  "sources": {...}
}
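
As an illustration of the validation step, here is a minimal standard-library check of the top-level shape shown above. It assumes the three affiliation keys hold lists and `sources` holds an object; the element types are not specified here, so they are not checked. The `validate_affiliations` name is hypothetical and this is not the logic in `validate.py`.

```python
from typing import Any, List

# Keys taken from the documented return shape.
LIST_KEYS = (
    "personal_affiliations",
    "political_affiliations",
    "financial_affiliations",
)
OBJECT_KEYS = ("sources",)


def validate_affiliations(payload: Any) -> List[str]:
    """Return a list of problems; an empty list means the shape looks valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    errors: List[str] = []
    for key in LIST_KEYS:
        if key not in payload:
            errors.append(f"missing key: {key}")
        elif not isinstance(payload[key], list):
            errors.append(f"{key} must be a list")
    for key in OBJECT_KEYS:
        if key not in payload:
            errors.append(f"missing key: {key}")
        elif not isinstance(payload[key], dict):
            errors.append(f"{key} must be an object")

    # Assumption: "strict" means unknown keys are rejected.
    allowed = set(LIST_KEYS) | set(OBJECT_KEYS)
    for key in sorted(set(payload) - allowed):
        errors.append(f"unexpected key: {key}")
    return errors
```

Returning a list of messages instead of raising on the first problem makes it easier to log everything that was wrong with a model response in one pass.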

3. Pipeline Orchestration (`score.py`)

`score.py` ties everything together:

1. Takes a URL
2. Extracts opinions
3. Extracts affiliations for each entity
4. Validates all JSON
5. Returns the final combined result
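
The orchestration steps can be sketched as a single function. To keep the example runnable without network access, the three stages are passed in as callables; `run_pipeline` and its parameter names are hypothetical and are not the real interface of `score.py`.

```python
from typing import Any, Callable, Dict, List

OpinionFn = Callable[[str], Dict[str, Any]]       # URL -> opinion result
AffiliationFn = Callable[[str], Dict[str, Any]]   # entity name -> affiliations
ValidateFn = Callable[[Dict[str, Any]], List[str]]  # payload -> error messages


def run_pipeline(
    url: str,
    extract_opinions: OpinionFn,
    extract_affiliations: AffiliationFn,
    validate: ValidateFn,
) -> Dict[str, Any]:
    """Combine opinions and affiliations into one result keyed by entity."""
    opinions = extract_opinions(url)
    combined: Dict[str, Any] = {"url": opinions["url"], "entities": {}}

    for name, entity in opinions["entities"].items():
        affiliations = extract_affiliations(name)
        errors = validate(affiliations)
        if errors:
            # Fail loudly rather than return a partially valid result.
            raise ValueError(f"invalid affiliations for {name!r}: {errors}")
        combined["entities"][name] = {
            "isPerson": entity["isPerson"],
            "opinions": list(entity["opinions"]),
            "affiliations": affiliations,
        }
    return combined
```

Injecting the stages also makes the merge logic easy to test with stubs in place of the real SERP and Gemini calls.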

🧪 Example Output

{
  "url": "...",
  "entities": {
    "Speaker": {
      "isPerson": true,
      "opinions": ["..."],
      "affiliations": {
        "personal_affiliations": [...],
        "political_affiliations": [...],
        "financial_affiliations": [...],
        "sources": {...}
      }
    },
    "Platform": {
      "isPerson": false,
      "opinions": [],
      "affiliations": {...}
    }
  }
}

⚙️ Setup

Follow these steps to run the full project locally.

🖥️ Backend Setup (Python + FastAPI)

1. Navigate to the backend folder

cd backend

2. Create and activate a virtual environment

Windows:

python -m venv .venv
.venv\Scripts\activate

Mac/Linux:

python3 -m venv .venv
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Create a `.env` file

Inside `backend/`, create a file named `.env`:

GEMINI_API_KEY=your_key_here
SERP_API_KEY=your_key_here
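
A missing key is easier to diagnose at startup than in the middle of a request. The sketch below checks that both variables are present. It assumes the values from `.env` have already been loaded into the process environment by whatever loader the backend uses; the `load_api_keys` name is hypothetical.

```python
import os
from typing import Dict, Mapping

REQUIRED_KEYS = ("GEMINI_API_KEY", "SERP_API_KEY")


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required API keys, or raise if any is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {key: env[key] for key in REQUIRED_KEYS}
```

Accepting the environment as a parameter lets tests pass a plain dictionary instead of modifying `os.environ`.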

🌐 Frontend Setup (React)

1. Navigate to the frontend folder

cd frontend

2. Install dependencies

npm install

3. Start the development server

npm run dev

The frontend will be available at:

http://localhost:3000

Repository: XavierLED/calgary-hacks-2026