This project explores building a Retrieval-Augmented Generation (RAG) system on top of the FastAPI documentation, and then exposing it through a small FastAPI backend plus a minimal HTML/JavaScript frontend.
You can:
- index parts of the FastAPI docs into Chroma
- query them via a Groq-powered RAG pipeline
- use a simple web UI to ask questions and see answers with sources
All of the app-specific code lives inside this rag-learning/ directory:
- fast-api-rag.ipynb
  - Prototyping notebook where the RAG pipeline was first built:
    - loads the FastAPI docs from the fastapi/ repo
    - enriches documents with mkdocs navigation metadata
    - builds two Chroma collections:
      - fastapi_dense_docs under chroma_fastapi/dense
      - fastapi_faq_docs under chroma_fastapi/faq
    - defines the routing, multi-query reformulation, and final RAG chain
- backend/
  - __init__.py – backend package marker.
  - rag_service.py – production-ready RAG service that:
    - loads the persisted Chroma vector stores under chroma_fastapi/
    - recreates the same routing + multi-query retrieval logic as the notebook
    - defines answer_question(question: str) which:
      - generates multiple query variants
      - routes each variant to the correct corpus (dense_docs, faq_docs, or both)
      - deduplicates retrieved documents
      - calls the Groq model to generate a final answer
      - returns the answer text and a list of unique source metadata dicts
  - app.py – FastAPI application that:
    - exposes POST /ask – accepts {"question": "..."} and returns:
      - answer: str
      - sources: [{ source, corpus, section, category_path, ... }]
    - exposes GET /health – simple health check
    - mounts the frontend/ directory as static files at /
- frontend/
  - index.html – minimal UI:
    - textarea for the question
    - “Ask” button
    - sections for the generated answer and the list of sources
  - style.css – small, modern dark theme for the UI
  - main.js – calls the backend API and renders the response:
    - sends POST /ask with { question }
    - handles loading and error states
    - pretty-prints the answer (white-space: pre-wrap)
    - renders source paths and basic metadata
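The answer_question steps listed above for backend/rag_service.py (reformulate, route, retrieve, deduplicate, generate) can be sketched without any of the real dependencies. This is only an illustration under assumptions: reformulate, route, retrievers, and generate are hypothetical stand-ins for the Groq and Chroma calls, and deduplicating on the source and section metadata keys is an assumed choice, not necessarily what the service does.

```python
from typing import Any, Callable, Dict, List, Tuple

Doc = Dict[str, Any]  # hypothetical shape: {"text": str, "metadata": dict}


def dedupe_docs(docs: List[Doc]) -> List[Doc]:
    """Keep the first occurrence of each (source, section) pair, preserving order."""
    seen = set()
    unique = []
    for doc in docs:
        meta = doc["metadata"]
        key = (meta.get("source"), meta.get("section"))  # assumed dedupe key
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


def answer_question_sketch(
    question: str,
    reformulate: Callable[[str], List[str]],
    route: Callable[[str], List[str]],
    retrievers: Dict[str, Callable[[str], List[Doc]]],
    generate: Callable[[str, List[Doc]], str],
) -> Tuple[str, List[dict]]:
    """Mirror the described steps: variants, routing, retrieval, dedupe, answer."""
    retrieved: List[Doc] = []
    for variant in reformulate(question):
        # route() returns corpus names, e.g. ["dense_docs"], ["faq_docs"], or both
        for corpus in route(variant):
            retrieved.extend(retrievers[corpus](variant))
    docs = dedupe_docs(retrieved)
    answer = generate(question, docs)
    sources = [doc["metadata"] for doc in docs]
    return answer, sources
```

Passing the model and retrievers in as plain callables keeps the control flow testable on its own; the real service presumably wires in the Groq model and the two Chroma collections at those points.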
Before using the API or UI, you need to:
- Generate the Chroma vector stores by running the notebook fast-api-rag.ipynb:
  - Execute at least the cells that build and persist the vector stores
  - This will create chroma_fastapi/dense/ and chroma_fastapi/faq/ directories
  - Note: These directories are gitignored (they contain generated artifacts)
- Set up environment variables:
  - Create a .env file with your GROQ_API_KEY
  - Optionally add LangSmith variables (LANGSMITH_TRACING, LANGSMITH_API_KEY, etc.)
- Install dependencies:
  pip install -r requirements.txt
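Before starting the server, it may help to confirm that these prerequisites are in place. The following is a minimal sketch and not part of the project: check_prerequisites is a hypothetical helper, and it assumes the persisted stores live in chroma_fastapi/dense and chroma_fastapi/faq relative to the directory you pass in, with GROQ_API_KEY either exported in the environment or written as a GROQ_API_KEY=... line in .env.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def check_prerequisites(base_dir: Path, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems: List[str] = []

    # The notebook is expected to have persisted both vector stores.
    for sub in ("dense", "faq"):
        store = base_dir / "chroma_fastapi" / sub
        if not store.is_dir():
            problems.append(f"missing vector store directory: {store}")

    # Accept the key from the environment or from a .env file in base_dir.
    env_file = base_dir / ".env"
    in_env_file = env_file.is_file() and any(
        line.strip().startswith("GROQ_API_KEY=")
        for line in env_file.read_text(encoding="utf-8").splitlines()
    )
    if not env.get("GROQ_API_KEY") and not in_env_file:
        problems.append("GROQ_API_KEY is not set in the environment or in .env")

    return problems
```

Calling check_prerequisites(Path.cwd()) from the rag-learning/ directory and printing the returned list gives a quick readout of anything still missing.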
From the rag-learning/ directory, activate your virtual environment and start the backend:
source venv/bin/activate # On Windows: venv\Scripts\activate
uvicorn backend.app:app --reload
By default this will start the server at http://127.0.0.1:8000.
Environment variables such as GROQ_API_KEY, LANGSMITH_*, etc. are loaded
via python-dotenv in backend/rag_service.py, mirroring the notebook setup.
Once the FastAPI server is running, open:
http://127.0.0.1:8000/
You should see the FastAPI RAG Assistant page. Typical flow:
- Type a question about FastAPI docs, for example:
  - “What is dependency injection in FastAPI and how do I use it?”
- Click Ask (or press Cmd+Enter / Ctrl+Enter).
- The UI:
  - sends a POST /ask request to the backend
  - displays the generated answer
  - shows a list of source documents with basic metadata:
    - absolute path to the markdown file
    - corpus (dense_docs or faq_docs)
    - navigation metadata (category_path, top_level_category, etc.)
You can also open frontend/index.html directly from disk. In that case,
main.js falls back to calling http://localhost:8000/ask, so make sure
the backend is running on that host and port.
You don’t need the frontend to use the RAG service; you can call the API
directly, for example with curl:
curl -X POST http://127.0.0.1:8000/ask \
-H "Content-Type: application/json" \
-d '{"question": "How do I enable CORS in FastAPI?"}'
Example JSON response shape:
{
"answer": "...generated answer text...",
"sources": [
{
"source": "/abs/path/to/fastapi/docs/en/docs/tutorial/cors.md",
"corpus": "dense_docs",
"section": null,
"category_path": "Learn > Tutorial - User Guide",
"top_level_category": "Learn",
"subcategory": "Tutorial - User Guide"
}
]
}
In the parent directory of this app you have a full clone of the FastAPI
project. The notebook and RAG pipeline rely on that checkout to load the
documentation markdown files and mkdocs.yml.
This rag-learning/ app is intentionally kept separate:
- it reads the FastAPI docs from the sibling fastapi/ directory
- it stores vector indexes locally in chroma_fastapi/
- it serves a small, self-contained RAG API and frontend for exploration
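As an illustration of how the docs could be located in the sibling checkout, here is a hedged sketch. The docs/en/docs layout is inferred from the example source path in the JSON response shown earlier, the docs/en/mkdocs.yml location is an assumption about the FastAPI repository layout, and find_fastapi_docs is a hypothetical helper rather than a function from the notebook.

```python
from pathlib import Path
from typing import List, Tuple


def find_fastapi_docs(fastapi_root: Path) -> Tuple[Path, List[Path]]:
    """Return the assumed mkdocs.yml path and all markdown files under docs/en/docs."""
    docs_root = fastapi_root / "docs" / "en"
    mkdocs_file = docs_root / "mkdocs.yml"  # assumed location, not confirmed by this README
    markdown_files = sorted((docs_root / "docs").rglob("*.md"))
    return mkdocs_file, markdown_files
```

When run from rag-learning/, the checkout would be passed as Path("..") / "fastapi".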
You can iterate on the RAG logic either in the notebook or in
backend/rag_service.py, and immediately test changes through the /ask
endpoint or the web UI.
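For quick testing of the /ask endpoint from a script, a standard-library client might look like the sketch below. It assumes the server is running at http://127.0.0.1:8000 and that the response has the answer and sources keys described in this README; build_ask_request, ask, and summarize are hypothetical helper names, not part of the project.

```python
import json
import urllib.request
from typing import List

DEFAULT_BASE_URL = "http://127.0.0.1:8000"  # default uvicorn address from this README


def build_ask_request(question: str, base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build the POST /ask request with a JSON body of the form {"question": ...}."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = DEFAULT_BASE_URL, timeout: float = 60.0) -> dict:
    """Send the question to a running backend and return the decoded JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(payload: dict) -> List[str]:
    """Return the answer followed by one line per source (corpus and path)."""
    lines = [payload["answer"]]
    for src in payload.get("sources", []):
        lines.append(f"- [{src.get('corpus')}] {src.get('source')}")
    return lines
```

With the backend running, print("\n".join(summarize(ask("How do I enable CORS in FastAPI?")))) prints the answer and its sources after each change to the RAG logic.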