A comprehensive framework for exploring Retrieval-Augmented Generation (RAG) with multiple retrieval algorithms
🚀 Quick Start • 📖 Documentation • 🛠️ Installation • 💡 Examples • 🤝 Contributing
Multi-RAG Chatbot is a powerful framework designed to compare and evaluate different Retrieval-Augmented Generation approaches on document-based question-answering tasks. This project enables researchers and developers to experiment with various RAG methodologies including Probabilistic RAG, Graph RAG, and BM25 retrieval algorithms.
- 🔍 Multiple Retrieval Algorithms - Compare Probabilistic RAG, Graph RAG, and BM25
- 📊 Performance Evaluation - Built-in metrics and comparison tools
- 🎛️ Flexible Configuration - Easy-to-customize parameters and models
- 📚 Document Processing - Support for multiple document formats
- 🌐 Interactive Interface - User-friendly chat interface
- ⚡ Optimized Performance - Efficient retrieval and generation pipeline
The following diagram illustrates the Multi-RAG Chatbot architecture and data flow:
```mermaid
graph TB
    A[User Query] --> B[Query Processor]
    B --> C{Select RAG Algorithm}
    C -->|Probabilistic| D[Embedding Model]
    C -->|Graph| E[Knowledge Graph]
    C -->|BM25| F[TF-IDF Indexer]
    D --> G[Vector Search]
    E --> H[Graph Traversal]
    F --> I[Keyword Matching]
    G --> J[Document Retrieval]
    H --> J
    I --> J
    J --> K[Context Assembly]
    K --> L[LLM Generation]
    L --> M[Response Ranking]
    M --> N[Final Answer]

    O[Document Corpus] --> P[Document Processor]
    P --> Q[Chunking & Embedding]
    Q --> D
    Q --> E
    Q --> F

    style A fill:#e1f5ff
    style N fill:#c8e6c9
    style L fill:#fff9c4
    style C fill:#ffccbc
```
- Query Processor: Analyzes and preprocesses user queries
- RAG Algorithms: Three parallel retrieval strategies
- Document Processor: Handles ingestion and preprocessing
- LLM Generation: Produces context-aware responses
- Response Ranking: Evaluates and selects best answers
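The dispatch step above — routing a query to one of several retrieval strategies — can be sketched in a few lines. This is an illustrative stand-in, not the project's actual code; the retriever functions here are placeholders:

```python
# Minimal sketch of the algorithm-selection step: a dispatcher maps an
# algorithm name to a retriever callable. These retrievers are stand-ins,
# not the framework's real implementations.

def probabilistic_retrieve(query, docs):
    # placeholder: a real version would rank docs by embedding similarity
    return sorted(docs, key=lambda d: -sum(w in d for w in query.split()))

def bm25_retrieve(query, docs):
    # placeholder: a real version would apply BM25 keyword scoring
    return [d for d in docs if any(w in d for w in query.split())]

RETRIEVERS = {
    'probabilistic': probabilistic_retrieve,
    'bm25': bm25_retrieve,
}

def retrieve(algorithm, query, docs):
    try:
        return RETRIEVERS[algorithm](query, docs)
    except KeyError:
        raise ValueError(f"Unknown algorithm: {algorithm!r}")

docs = ["solar power basics", "wind energy overview"]
print(retrieve('bm25', 'solar', docs))  # → ['solar power basics']
```

Registering algorithms in a dict like this is also what makes it easy to plug in a custom strategy later without touching the dispatch logic.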
| Algorithm | Description | Best For |
|---|---|---|
| Probabilistic RAG | Uses embedding similarity and probability scoring | General-purpose QA |
| Graph RAG | Leverages knowledge graphs for contextual retrieval | Complex, interconnected documents |
| BM25 | Traditional keyword-based retrieval with TF-IDF | Keyword-specific queries |
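For intuition about the keyword-based option, here is a compact, self-contained BM25 scorer using the standard `k1`/`b` parameters. It is a textbook sketch, not the project's implementation:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query tokens with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter()                      # document frequency per term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)                 # term frequency within this doc
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "solar panels convert sunlight to electricity".split(),
    "wind turbines convert wind to electricity".split(),
]
scores = bm25_scores("solar sunlight".split(), docs)
# doc 0 contains both query terms, doc 1 contains neither
assert scores[0] > scores[1] == 0.0
```

The length normalization term (`b`) is what keeps long documents from dominating purely by containing more words — one reason BM25 remains a strong baseline for keyword-specific queries.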
- ✅ Multi-format document ingestion (PDF, TXT, DOCX, MD)
- ✅ Real-time performance comparison
- ✅ Customizable embedding models
- ✅ Advanced chunking strategies
- ✅ Interactive evaluation dashboard
- ✅ Export results and metrics
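The simplest of the chunking strategies — fixed-size windows with overlap, matching the `chunk_size`/`overlap` values used in the configuration examples below — can be sketched like this. The helper is hypothetical (character-based, not the library's chunker):

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size character chunks, with `overlap`
    characters shared between neighbouring chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

text = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(text, chunk_size=1000, overlap=200)
# 2500 chars with step 800 → chunks starting at 0, 800, 1600
```

Overlap ensures that a sentence straddling a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated embedding work.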
- Python 3.8 or higher
- pip package manager
- Git
```bash
# Clone the repository
git clone https://github.com/sobhan2204/multi-rag-chatbot.git
cd multi-rag-chatbot

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

```python
from multi_rag_chatbot import MultiRAGChatbot

# Initialize the chatbot
chatbot = MultiRAGChatbot(
    algorithms=['probabilistic', 'graph', 'bm25']
)

# Load documents
chatbot.load_documents('path/to/your/documents/')

# Ask a question and compare results
question = "What are the main benefits of renewable energy?"
results = chatbot.compare_algorithms(question)

# View performance metrics
chatbot.display_metrics()
```

```python
from multi_rag_chatbot import DocumentProcessor, RAGComparator

# Process documents
processor = DocumentProcessor()
documents = processor.load_from_directory("./documents/")

# Compare algorithms
comparator = RAGComparator(documents)
results = comparator.evaluate_question("Explain the concept of machine learning")

print(f"Best performing algorithm: {results.best_algorithm}")
print(f"Confidence score: {results.confidence}")
```

```python
config = {
    'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2',
    'chunk_size': 1000,
    'overlap': 200,
    'top_k': 5
}

chatbot = MultiRAGChatbot(config=config)
chatbot.add_algorithm('custom_rag', CustomRAGImplementation())
```

The framework provides comprehensive evaluation metrics:
- Retrieval Accuracy: Measures relevance of retrieved documents
- Response Quality: BLEU, ROUGE scores for generated answers
- Latency: Response time for each algorithm
- Resource Usage: Memory and CPU consumption
- Coherence Score: Semantic consistency of responses
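As an illustration of the first metric, retrieval accuracy is commonly reported as precision@k — the fraction of the top-k retrieved documents that are actually relevant. A generic sketch, not tied to the framework's API:

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved_ids[:k]
    if not top_k:
        return 0.0
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / len(top_k)

retrieved = ["d3", "d1", "d7", "d2", "d9"]   # ranked output of a retriever
relevant = {"d1", "d2", "d4"}                # ground-truth relevant docs
print(precision_at_k(retrieved, relevant, k=5))  # → 0.4
```

Computing this per algorithm over a shared question set is the usual way to put Probabilistic RAG, Graph RAG, and BM25 on a common footing.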
```
multi-rag-chatbot/
├── 📂 src/
│   ├── 🧠 algorithms/
│   │   ├── probabilistic_rag.py
│   │   ├── graph_rag.py
│   │   └── bm25_rag.py
│   ├── 📊 evaluation/
│   │   ├── metrics.py
│   │   └── comparator.py
│   ├── 📄 document_processing/
│   │   ├── loader.py
│   │   └── chunker.py
│   └── 🌐 interface/
│       ├── streamlit_app.py
│       └── api.py
├── 📖 docs/
│   ├── api_reference.md
│   └── tutorials/
├── 🧪 tests/
├── 📊 examples/
├── 📋 requirements.txt
└── 📖 README.md
```
Create a `.env` file in the root directory:

```env
# OpenAI API (optional)
OPENAI_API_KEY=your_api_key_here

# Hugging Face Token (optional)
HUGGINGFACE_TOKEN=your_token_here

# Database Configuration
VECTOR_DB_PATH=./data/vectordb
GRAPH_DB_URL=bolt://localhost:7687

# Performance Settings
MAX_CONCURRENT_REQUESTS=10
CACHE_SIZE=1000
```

```yaml
# config.yaml
models:
  embedding: "sentence-transformers/all-MiniLM-L6-v2"
  llm: "gpt-3.5-turbo"

retrieval:
  chunk_size: 1000
  chunk_overlap: 200
  top_k: 5

algorithms:
  - probabilistic
  - graph
  - bm25
```

```bash
# Run all tests
python -m pytest tests/

# Run specific test categories
python -m pytest tests/test_algorithms.py -v
python -m pytest tests/test_evaluation.py -v

# Run with coverage
python -m pytest --cov=src tests/
```

We welcome contributions! Here's how you can help:
Reporting bugs:

- Use the issue tracker
- Include detailed reproduction steps
- Provide system information
- Check existing issues first

Suggesting features:

- Describe the use case clearly
- Consider submitting a pull request
```bash
# Fork and clone the repository
git clone https://github.com/your-username/multi-rag-chatbot.git

# Install development dependencies
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install

# Run tests before submitting
python -m pytest
```

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
⭐ If this project helped you, please consider giving it a star! ⭐