RAG Model - Document Q&A System

A sophisticated Retrieval-Augmented Generation (RAG) system built with FastAPI that enables intelligent document-based question answering. This application combines the power of OpenAI's language models with vector search capabilities to provide contextually accurate responses based on uploaded documents.

🚀 Features

Core Functionality

  • Document Upload & Processing: Upload PDF documents that are automatically processed and vectorized
  • Intelligent Q&A: Ask questions about your documents and get contextually relevant answers
  • Vector Search: Advanced semantic search using Qdrant vector database
  • Chat History: Persistent conversation history with context awareness
  • User Authentication: Secure JWT-based authentication system
  • Real-time Processing: Efficient document chunking and embedding generation

Technical Highlights

  • RAG Architecture: Combines retrieval and generation for accurate responses
  • Vector Embeddings: Uses OpenAI's text-embedding-3-large model (3072 dimensions)
  • Scalable Database: PostgreSQL for relational data, Qdrant for vector storage
  • Modern API: RESTful API with automatic OpenAPI documentation
  • Production Ready: Comprehensive logging, error handling, and database migrations

📋 Table of Contents

  • Features
  • Architecture
  • Installation
  • Configuration
  • Usage
  • API Documentation
  • Project Structure
  • Development
  • Contributing
  • License
  • Acknowledgments

πŸ—οΈ Architecture

System Overview

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   FastAPI App   │    │   PostgreSQL    │    │     Qdrant      │
│                 │    │                 │    │   Vector DB     │
│  ┌───────────┐  │    │  ┌───────────┐  │    │  ┌───────────┐  │
│  │ Auth      │  │◄──►│  │   Users   │  │    │  │ Embeddings│  │
│  │ Routes    │  │    │  │ Documents │  │    │  │ Vectors   │  │
│  └───────────┘  │    │  │ Messages  │  │    │  │ Metadata  │  │
│  ┌───────────┐  │    │  └───────────┘  │    │  └───────────┘  │
│  │ Chat      │  │    └─────────────────┘    └─────────────────┘
│  │ Routes    │  │              │                       │
│  └───────────┘  │              │                       │
│  ┌───────────┐  │              │                       │
│  │ Doc       │  │              │                       │
│  │ Routes    │  │              │                       │
│  └───────────┘  │              │                       │
└─────────────────┘              │                       │
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   OpenAI API    │    │   SQLAlchemy    │    │ Qdrant Client   │
│                 │    │      ORM        │    │                 │
│ ┌─────────────┐ │    │                 │    │                 │
│ │ GPT-4.1     │ │    │                 │    │                 │
│ │ Embeddings  │ │    │                 │    │                 │
│ └─────────────┘ │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘

RAG Pipeline

  1. Document Ingestion: PDF files are uploaded and processed
  2. Text Extraction: Content is extracted using LlamaIndex PDFReader
  3. Chunking: Text is split into manageable chunks (1,000 characters with a 100-character overlap; see the sketch after this list)
  4. Vectorization: Chunks are converted to embeddings using OpenAI
  5. Storage: Vectors stored in Qdrant with metadata
  6. Query Processing: User questions are vectorized and matched
  7. Context Retrieval: Relevant chunks are retrieved based on similarity
  8. Response Generation: OpenAI generates answers using retrieved context
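
A minimal sketch of steps 3–5, assuming a plain character-based splitter with the 1,000-character / 100-overlap settings above (the function name and loop are illustrative, not the project's actual helper):

# Illustrative chunking sketch, not the project's exact code.
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back by the overlap so neighbouring chunks share context
    return chunks

Each chunk would then be embedded and upserted into Qdrant together with its source metadata.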

Technology Stack

  • Backend Framework: FastAPI 0.116.1+
  • Language Model: OpenAI GPT-4.1
  • Embeddings: OpenAI text-embedding-3-large (3072D)
  • Vector Database: Qdrant 1.16.1
  • Relational Database: PostgreSQL with SQLAlchemy 2.0.44
  • Authentication: JWT with python-jose 3.5.0
  • Password Hashing: Argon2 via passlib 1.7.4
  • Document Processing: LlamaIndex for PDF parsing
  • Migration Management: Alembic 1.17.2
  • Environment Management: UV package manager

πŸ› οΈ Installation

Prerequisites

  • Python: 3.13+ (specified in pyproject.toml)
  • PostgreSQL: 12+ for relational data storage
  • Qdrant: Vector database (can run via Docker)
  • OpenAI API Key: For language model and embeddings

Quick Start

  1. Clone the Repository

    git clone <repository-url>
    cd Rag-Model
  2. Set Up Python Environment

    # Using UV (recommended)
    uv venv
    uv pip install -e .
    
    # Or using pip
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -e .
  3. Install Dependencies

    # Production dependencies
    uv pip install -r pyproject.toml
    
    # Development dependencies (optional)
    uv pip install -e .[dev]
  4. Set Up Databases

    # Start Qdrant (using Docker)
    docker run -p 6333:6333 qdrant/qdrant
    
    # Ensure PostgreSQL is running
    # Create database: rag_model
  5. Configure Environment

    cp .env.example .env
    # Edit .env with your configuration
  6. Run Database Migrations

    alembic upgrade head
  7. Start the Application

    uvicorn main:app --reload --host 0.0.0.0 --port 8000

βš™οΈ Configuration

Environment Variables

Create a .env file in the project root with the following variables:

# OpenAI Configuration
OPENAPI_API_KEY=<--value goes here-->               (required)

# Model Configuration
MODEL_NAME=<--value goes here-->                    (default: gpt-4.1)
EMBED_MODEL=<--value goes here-->                   (default: text-embedding-3-large)
EMBED_SIZE=<--value goes here-->                    (default: 3072)

# Database Configuration
DB_STRING=<--value goes here-->                     (required)
ECHO_SQL=<--value goes here-->                      (default: False)
DB_SCHEMA=<--value goes here-->                     (default: rag_model)

# Vector Database
VECTOR_DB_URL=<--value goes here-->                 (default: http://localhost:6333)
VECTOR_INCLUSION_THRESHOLD=<--value goes here-->    (default: 0.5)

# Authentication
JWT_SECRET_KEY=<--value goes here-->                (required)
JWT_ALGORITHM=<--value goes here-->                 (default: HS256)

# Application Settings
FALLBACK_MESSAGE=Sorry, Could not generate a message. Please try again later.
LOG_FILE=app.log
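
At startup these variables are read into an application settings object. A rough sketch of what that could look like with plain os.getenv (the real settings.py may be structured differently; the field names simply mirror the variables above):

# Sketch of a settings.py-style module (assumption: the real project may differ).
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class ProjectSettings:
    openai_api_key: str = os.getenv("OPENAPI_API_KEY", "")
    model_name: str = os.getenv("MODEL_NAME", "gpt-4.1")
    embed_model: str = os.getenv("EMBED_MODEL", "text-embedding-3-large")
    embed_size: int = int(os.getenv("EMBED_SIZE", "3072"))
    db_string: str = os.getenv("DB_STRING", "")
    vector_db_url: str = os.getenv("VECTOR_DB_URL", "http://localhost:6333")
    vector_inclusion_threshold: float = float(os.getenv("VECTOR_INCLUSION_THRESHOLD", "0.5"))
    jwt_secret_key: str = os.getenv("JWT_SECRET_KEY", "")
    jwt_algorithm: str = os.getenv("JWT_ALGORITHM", "HS256")

project_settings = ProjectSettings()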

Configuration Details

Model Settings

  • MODEL_NAME: OpenAI model for chat completions (default: gpt-4.1)
  • EMBED_MODEL: Embedding model (default: text-embedding-3-large)
  • EMBED_SIZE: Embedding dimensions (3072 for text-embedding-3-large)

Database Settings

  • DB_STRING: PostgreSQL connection string
  • VECTOR_DB_URL: Qdrant server URL
  • VECTOR_INCLUSION_THRESHOLD: Minimum similarity score for including documents (0.0-1.0)

Security Settings

  • JWT_SECRET_KEY: Secret key for JWT token signing (use a strong, random key)
  • JWT_ALGORITHM: JWT signing algorithm (HS256 recommended)
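
With python-jose, token creation and validation follow roughly the pattern below (a hedged sketch; the helper names are illustrative and may differ from authentication/utils.py):

# Sketch of JWT creation/validation with python-jose (names are illustrative).
from datetime import datetime, timedelta, timezone
from jose import jwt, JWTError

SECRET_KEY = "change-me"   # JWT_SECRET_KEY from the environment
ALGORITHM = "HS256"        # JWT_ALGORITHM

def create_access_token(data: dict, expires_minutes: int = 60) -> str:
    payload = data.copy()
    payload["exp"] = datetime.now(timezone.utc) + timedelta(minutes=expires_minutes)
    return jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)

def decode_access_token(token: str) -> dict | None:
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
    except JWTError:
        return None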

📖 Usage

Starting the Application

  1. Development Mode

    uvicorn main:app --reload --host 0.0.0.0 --port 8000
  2. Production Mode

    uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
  3. Access the Application

    The API is now available at http://localhost:8000 and serves automatically generated OpenAPI documentation.

Basic Workflow

  1. Register/Login

    # Register a new user
    curl -X POST "http://localhost:8000/auth/signup" \
         -H "Content-Type: application/json" \
         -d '{"name": "John Doe", "email": "john@example.com", "password": "securepassword123"}'
    
    # Login
    curl -X POST "http://localhost:8000/auth/login" \
         -H "Content-Type: application/json" \
         -d '{"email": "john@example.com", "password": "securepassword123"}'
  2. Upload Documents

    curl -X POST "http://localhost:8000/docs/upload-pdf-document" \
         -H "Authorization: Bearer YOUR_JWT_TOKEN" \
         -F "file=@document.pdf"
  3. Ask Questions

    curl -X POST "http://localhost:8000/chats/send-message" \
         -H "Authorization: Bearer YOUR_JWT_TOKEN" \
         -H "Content-Type: application/json" \
         -d '{"message": "What is the main topic of the document?"}'
  4. View Chat History

    curl -X GET "http://localhost:8000/chats/messages" \
         -H "Authorization: Bearer YOUR_JWT_TOKEN"

📚 API Documentation

Authentication Endpoints

POST /auth/signup

Register a new user account.

Request Body:

{
  "name": "John Doe",
  "email": "john@example.com",
  "password": "securepassword123"
}

Response:

{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
  "token_type": "bearer",
  "user_id": 1,
  "email": "john@example.com",
  "name": "John Doe"
}

POST /auth/login

Authenticate user and receive JWT token.

Request Body:

{
  "email": "john@example.com",
  "password": "securepassword123"
}

Response:

{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
  "token_type": "bearer",
  "user_id": 1,
  "email": "john@example.com",
  "name": "John Doe"
}

GET /auth/me

Get current authenticated user information.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN

Response:

{
  "user_id": 1,
  "name": "John Doe",
  "email": "john@example.com",
  "is_active": true,
  "is_verified": false,
  "created_at": "2024-12-09T08:25:00Z"
}

Document Management Endpoints

POST /docs/upload-pdf-document

Upload and process a PDF document.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN
Content-Type: multipart/form-data

Request Body:

file: (PDF file)

Response:

{
  "message": "Document 'example.pdf' uploaded and processed successfully",
  "doc_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "status": "success"
}

GET /docs/list-documents

List all uploaded documents for the current user.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN

Response:

{
  "documents": [
    {
      "doc_uuid": "550e8400-e29b-41d4-a716-446655440000",
      "file_url": "Not Available",
      "file_size": 1048576,
      "original_filename": "example.pdf",
      "mime_type": "application/pdf",
      "created_at": "2024-12-09T08:25:00Z",
      "total_chunks": 42
    }
  ],
  "count": 1,
  "status": "success"
}

DELETE /docs/delete-document/{doc_uuid}

Delete a document and its associated vectors.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN

Response:

{
  "message": "Document 'example.pdf' deleted successfully",
  "status": "success"
}

Chat Endpoints

POST /chats/send-message

Send a message and get an AI response based on uploaded documents.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN
Content-Type: application/json

Request Body:

{
  "message": "What are the main points discussed in the document?",
  "message_history_count": 20
}

Response:

{
  "status": "success",
  "total_input_vectors": 1536,
  "total_query_hits": 5,
  "total_output_tokens": 150,
  "query_hit_doc_uuids": ["550e8400-e29b-41d4-a716-446655440000"],
  "model_response": "Based on the uploaded document, the main points discussed are..."
}

GET /chats/messages

Retrieve chat history for the current user.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN

Response:

{
  "status": "success",
  "messages": [
    {
      "role": "user",
      "content": "What are the main points?",
      "model_used": "gpt-4.1",
      "tokens_used": 1536,
      "response_time_ms": 0,
      "ai_prompt": null,
      "context_document_uuid": null,
      "created_at": "2024-12-09T08:25:00Z"
    },
    {
      "role": "assistant",
      "content": "Based on the document...",
      "model_used": "gpt-4.1",
      "tokens_used": 150,
      "response_time_ms": 2500,
      "ai_prompt": "System: You are a helpful assistant...",
      "context_document_uuid": ["550e8400-e29b-41d4-a716-446655440000"],
      "created_at": "2024-12-09T08:25:02Z"
    }
  ]
}

DELETE /chats/clear-history

Clear all chat history for the current user.

Headers:

Authorization: Bearer YOUR_JWT_TOKEN

Response:

{
  "status": "success"
}

Health Check

GET /health

Check application health status.

Response:

{
  "status": "healthy",
  "message": "RAG API is running."
}

πŸ“ Project Structure

Rag-Model/
├── 📁 alembic/                   # Database migrations
│   ├── 📁 versions/              # Migration files
│   │   └── c938cf66d0f6_initial_setup.py
│   └── env.py                    # Alembic environment configuration
├── 📁 authentication/            # Authentication module
│   ├── __init__.py               # Module exports
│   ├── auth_models.py            # Authentication data models
│   └── utils.py                  # JWT and password utilities
├── 📁 database/                  # Database layer
│   ├── models.py                 # SQLAlchemy models
│   ├── postgres_db.py            # PostgreSQL connection
│   └── vector_db.py              # Qdrant vector database client
├── 📁 llm/                       # Language model integration
│   ├── models.py                 # LLM data models
│   └── openai_client.py          # OpenAI API client
├── 📁 log_config/                # Logging configuration
│   ├── __init__.py               # Logger factory
│   └── logging_config.py         # Logging setup
├── 📁 route_models/              # API request/response models
│   ├── auth_models.py            # Authentication models
│   ├── chat_models.py            # Chat endpoint models
│   └── doc_models.py             # Document endpoint models
├── 📁 routers/                   # FastAPI route handlers
│   ├── __init__.py               # Router exports
│   ├── auth_routes.py            # Authentication endpoints
│   ├── chat_routes.py            # Chat endpoints
│   └── doc_routes.py             # Document endpoints
├── 📁 utilities/                 # Utility functions
│   ├── __init__.py               # Utility exports
│   └── utility.py                # PDF processing and vectorization
├── 📁 logs/                      # Application logs (auto-created)
├── 📁 vector_db_storage/         # Qdrant data storage (auto-created)
├── .env                          # Environment variables
├── .gitignore                    # Git ignore rules
├── .python-version               # Python version specification
├── alembic.ini                   # Alembic configuration
├── main.py                       # FastAPI application entry point
├── pyproject.toml                # Project dependencies and metadata
├── settings.py                   # Application configuration
├── uv.lock                       # UV lock file
└── README.md                     # This file

Key Components

Core Application (main.py)

  • FastAPI application setup with CORS middleware
  • Application lifespan management
  • Database connection initialization
  • Router registration and API documentation
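
A stripped-down sketch of that wiring (illustrative only; the real main.py handles settings, logging, and database initialization in its lifespan hooks):

# Minimal main.py-style sketch: lifespan, CORS, and router registration.
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

@asynccontextmanager
async def lifespan(app: FastAPI):
    # startup: initialize database connections, Qdrant client, etc.
    yield
    # shutdown: dispose engines / close clients

app = FastAPI(title="RAG Model - Document Q&A System", lifespan=lifespan)
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])

# Routers as described in the routers/ package (names here are illustrative):
# from routers import auth_router, chat_router, doc_router
# app.include_router(auth_router, prefix="/auth")
# app.include_router(chat_router, prefix="/chats")
# app.include_router(doc_router, prefix="/docs")

@app.get("/health")
def health() -> dict:
    return {"status": "healthy", "message": "RAG API is running."}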

Authentication System (authentication/)

  • JWT Token Management: Secure token creation and validation
  • Password Security: Argon2 hashing for password storage
  • Role-Based Access: User role authorization system
  • Dependency Injection: FastAPI dependencies for route protection
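
Password hashing with passlib's Argon2 backend typically looks like the following (sketch only; helper names are illustrative):

# Sketch of Argon2 password hashing via passlib (requires the argon2-cffi package).
from passlib.context import CryptContext

pwd_context = CryptContext(schemes=["argon2"], deprecated="auto")

def hash_password(plain: str) -> str:
    return pwd_context.hash(plain)

def verify_password(plain: str, hashed: str) -> bool:
    return pwd_context.verify(plain, hashed)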

Database Layer (database/)

  • PostgreSQL Models: User, Document, and Message entities
  • Vector Database: Qdrant integration for embeddings storage
  • Connection Management: Session handling and connection pooling
  • Migration Support: Alembic for database schema management

Language Model Integration (llm/)

  • OpenAI Client: GPT-4.1 and embedding model integration
  • RAG Pipeline: Document processing and context retrieval
  • Response Generation: Contextual answer generation
  • Token Management: Usage tracking and optimization
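
Putting retrieval and generation together, the query path can be sketched roughly as below, assuming the openai (>=1.x) and qdrant-client packages; the collection name, models, and 0.5 threshold mirror the configuration in this README, while the helper itself is illustrative:

# Sketch of the retrieval + generation path (not the project's exact implementation).
from openai import OpenAI
from qdrant_client import QdrantClient

openai_client = OpenAI()   # pass api_key=... from your settings; OpenAI() otherwise reads OPENAI_API_KEY
qdrant = QdrantClient(url="http://localhost:6333")

def answer_question(question: str) -> str:
    # 1. Vectorize the question
    emb = openai_client.embeddings.create(model="text-embedding-3-large", input=question)
    query_vector = emb.data[0].embedding

    # 2. Retrieve relevant chunks above the inclusion threshold
    hits = qdrant.search(collection_name="rag", query_vector=query_vector,
                         limit=5, score_threshold=0.5)
    context = "\n\n".join(hit.payload["text"] for hit in hits)

    # 3. Generate an answer grounded in the retrieved context
    completion = openai_client.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content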

API Routes (routers/)

  • Authentication Routes: Login, signup, user management
  • Document Routes: Upload, list, delete PDF documents
  • Chat Routes: Message sending, history management
  • Error Handling: Comprehensive exception management

Utilities (utilities/)

  • PDF Processing: Document parsing and text extraction
  • Vectorization: Text-to-embedding conversion
  • Chunking: Intelligent text segmentation

Qdrant Vector Database

Collection: "rag"

  • Vector Dimension: 3072 (OpenAI text-embedding-3-large)
  • Distance Metric: Cosine similarity
  • Payload Schema:
    {
      "source": "document_filename.pdf",
      "text": "chunk_content_text",
      "uuid": "document_uuid"
    }
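
Creating the collection and storing one embedded chunk with this payload could look like the following sketch (qdrant-client API; the vector shown is a placeholder, not a real embedding):

# Sketch: create the "rag" collection and upsert one embedded chunk.
import uuid
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(url="http://localhost:6333")

# 3072-dimensional vectors with cosine distance, as described above
client.create_collection(
    collection_name="rag",
    vectors_config=VectorParams(size=3072, distance=Distance.COSINE),
)

client.upsert(
    collection_name="rag",
    points=[
        PointStruct(
            id=str(uuid.uuid4()),
            vector=[0.0] * 3072,  # replace with a real embedding
            payload={"source": "document_filename.pdf",
                     "text": "chunk_content_text",
                     "uuid": "document_uuid"},
        )
    ],
)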

Relationships

  • Users β†’ Documents: One-to-many (cascade delete)
  • Users β†’ Messages: One-to-many (cascade delete)
  • Documents β†’ Vector Embeddings: One-to-many (via UUID)
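
In SQLAlchemy 2.0 terms, the cascade behaviour above can be expressed roughly as follows (an illustrative subset of columns, not the full models.py):

# Sketch of the User → Document / Message relationships with cascade deletes.
from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    documents: Mapped[list["Document"]] = relationship(back_populates="owner",
                                                       cascade="all, delete-orphan")
    messages: Mapped[list["Message"]] = relationship(cascade="all, delete-orphan")

class Document(Base):
    __tablename__ = "documents"
    id: Mapped[int] = mapped_column(primary_key=True)
    doc_uuid: Mapped[str]  # also stored in the Qdrant payload, linking chunks to this row
    user_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
    owner: Mapped["User"] = relationship(back_populates="documents")

class Message(Base):
    __tablename__ = "messages"
    id: Mapped[int] = mapped_column(primary_key=True)
    user_id: Mapped[int] = mapped_column(ForeignKey("users.id"))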

🔧 Development

Setting Up Development Environment

  1. Install Development Dependencies

    uv pip install -e .[dev]
  2. Code Formatting

    black .

Development Tools

Available Dependencies

  • black: Code formatting (25.12.0+)
  • icecream: Enhanced debugging (2.1.8+)

Database Migrations

Create a New Migration

alembic revision --autogenerate -m "Description of changes"

Apply Migrations

alembic upgrade head

Rollback Migration

alembic downgrade -1

Logging

The application uses a comprehensive logging system:

  • File Logging: Rotating logs in logs/app.log
  • Console Logging: Real-time output during development
  • Log Levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
  • Structured Format: Timestamp, module, level, file:line, message
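
A condensed sketch of such a setup (illustrative; log_config/logging_config.py is the authoritative version):

# Sketch: rotating file + console logging in the format described above.
import logging
from logging.handlers import RotatingFileHandler

def get_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    if logger.handlers:          # avoid attaching duplicate handlers
        return logger
    logger.setLevel(logging.DEBUG)
    fmt = logging.Formatter(
        "%(asctime)s | %(name)s | %(levelname)s | %(filename)s:%(lineno)d | %(message)s"
    )
    file_handler = RotatingFileHandler("logs/app.log", maxBytes=5_000_000, backupCount=3)  # assumes logs/ exists
    console_handler = logging.StreamHandler()
    for handler in (file_handler, console_handler):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger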

Code Style Guidelines

  1. Import Organization

    # 1st party imports
    import os
    from typing import List
    
    # 3rd party imports
    from fastapi import FastAPI
    from sqlalchemy import create_engine
    
    # local imports
    from settings import project_settings
    from database.models import User
  2. Function Documentation

    def example_function(param1: str, param2: int = 10) -> bool:
        """
        Brief description of the function.
        
        Args:
            param1 (str): Description of param1.
            param2 (int): Description of param2. Defaults to 10.
        
        Returns:
            bool: Description of return value.
        """
  3. Error Handling

    try:
        # operation
        result = perform_operation()
        logger.info("Operation successful")
        return result
    except SpecificException as e:
        logger.error(f"Specific error: {str(e)}")
        raise HTTPException(status_code=400, detail="Specific error message")
    except Exception as e:
        logger.exception(f"Unexpected error: {str(e)}")
        raise HTTPException(status_code=500, detail="Internal server error")

🤝 Contributing

Getting Started

  1. Fork the Repository

  2. Create a Feature Branch

    git checkout -b feature/your-feature-name
  3. Make Changes

    • Follow code style guidelines
    • Add tests for new functionality
    • Update documentation as needed
  4. Test Your Changes

    # Format code
    black .
    
    # Run tests (when available)
    pytest
    
    # Test API endpoints
    curl -X GET "http://localhost:8000/health"
  5. Submit a Pull Request

    • Provide clear description of changes
    • Reference any related issues
    • Ensure all checks pass

Development Guidelines

  1. Code Quality

    • Follow PEP 8 style guidelines
    • Use type hints consistently
    • Write comprehensive docstrings
    • Handle errors gracefully
  2. Testing

    • Write unit tests for new functions
    • Test API endpoints thoroughly
    • Verify database operations
    • Test error conditions
  3. Documentation

    • Update README for new features
    • Document API changes
    • Add inline code comments
    • Update configuration examples

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenAI: For providing powerful language models and embeddings
  • Qdrant: For the excellent vector database solution
  • FastAPI: For the modern, fast web framework
  • LlamaIndex: For document processing capabilities
  • SQLAlchemy: For robust database ORM
  • Contributors: All developers who have contributed to this project

Built with ❤️ using FastAPI, OpenAI, and Qdrant
