Enterprise-grade AI system for intelligent document querying and dynamic tool execution with Model Context Protocol (MCP) integration
BotForge RAG is a production-ready AI platform that seamlessly combines Retrieval-Augmented Generation (RAG) with external tool execution capabilities. The system features intelligent intent detection to automatically route queries between information retrieval and action execution pipelines, making it ideal for complex business integrations and AI-powered applications.
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ User Query │───▶│ Intent │───▶│ Response │
│ │ │ Detection │ │ Generation │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
▼
┌─────────────────┐
│ Route Query │
└─────────────────┘
│
┌────────────┼────────────┐
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ RAG Pipeline │ │ MCP Pipeline │
│ │ │ │
│ • Vector Query │ │ • LangChain │
│ • Context │ │ Agent │
│ Assembly │ │ • External │
│ • LLM Response │ │ Tool Exec │
└─────────────────┘ └─────────────────┘
│ │
▼ ▼
┌────────────────--─┐ ┌─────────────────┐
│ Vector Search │ │ External Tools │
│ Knowledge Base. │ │ Dynamic Exec │
└─────────────────--┘ └─────────────────┘
- Python 3.11+
- UV package manager installed
# Clone the repository
git clone <repository-url>
cd botforge-rag
# Start the application (installs dependencies automatically)
./scripts/start.sh

The application will be available at:
- API: http://localhost:8000
- Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
# Check if everything is working
./scripts/status.sh

All dependencies are managed through UV and automatically installed. The system includes:
- ✅ LangChain ecosystem for AI operations
- ✅ MCP (Model Context Protocol) client for tool integration
- ✅ FastAPI for REST API
- ✅ Vector database for document storage
- ✅ Comprehensive test suite and development tools
- 🧠 Intelligent Intent Detection - Advanced query classification with context awareness
- 🔄 Unified API Architecture - Single endpoint handles both RAG and tool execution
- 🛠️ Dynamic MCP Integration - Per-bot registration of external business tools
- ⚡ High-Performance Stack - Async processing, Redis caching, connection pooling
- 🎯 Production-Ready RAG - Vector similarity search with source attribution
- 🔧 Extensible Design - Plugin architecture for custom tools and capabilities
- 🔒 Enterprise Security - Bot-scoped access control and request validation
- 📊 Comprehensive Monitoring - Health checks, metrics, and error tracking
- 🚀 Scalable Infrastructure - Docker, Kubernetes, and cloud-ready deployment
- 📝 Professional Documentation - Complete API reference and integration guides
- Python 3.9+ with pip or uv package manager
- PostgreSQL 15+ for primary data storage
- Redis 7+ for caching and session management
- OpenAI API Key for LLM functionality
- Upstash Vector Database account for embeddings
# Clone the repository
git clone https://github.com/your-org/botforge-rag.git
cd botforge-rag
# Install dependencies using uv (recommended)
uv sync
# Alternative: Install with pip
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your configuration (see Configuration section)
# Initialize database schema
python -c "
import asyncio
import asyncpg
from src.botforge.core.config import settings

async def init_db():
    conn = await asyncpg.connect(settings.postgres_uri)
    with open('create.sql', 'r') as f:
        await conn.execute(f.read())
    await conn.close()
    print('Database initialized successfully')

asyncio.run(init_db())
"
# Start the development server
PYTHONPATH=./src uvicorn botforge.main:app --reload --port 8000

# Test information retrieval (RAG)
curl -X POST "http://localhost:8000/vector/query-dynamic" \
-H "Content-Type: application/json" \
-d '{
"user_id": "test-user",
"bot_id": "test-bot",
"client_id": "test-client",
"query": "What is machine learning?",
"model": "gpt-3.5-turbo"
}'
# Test tool execution (MCP)
curl -X POST "http://localhost:8000/vector/query-dynamic" \
-H "Content-Type: application/json" \
-d '{
"user_id": "test-user",
"bot_id": "test-bot",
"client_id": "test-client",
"query": "Calculate 25 * 17 + 100",
"model": "gpt-3.5-turbo"
}'

Create a .env file in the project root:
# Database Configuration
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=botforge
POSTGRES_PASSWORD=your_password
POSTGRES_DB=botforge
# Redis Configuration
REDIS_URI=redis://localhost:6379/0
# Vector Database (Upstash)
UPSTASH_URL=https://your-region-xxxxx.upstash.io
UPSTASH_TOKEN=your_upstash_token
# OpenAI Configuration
OPENAI_API_KEY=sk-your_openai_api_key
OPENAI_DEFAULT_MODEL=gpt-3.5-turbo
OPENAI_MAX_TOKENS=1000
OPENAI_TEMPERATURE=0.7
# Application Settings
UPLOAD_LOCATION=/path/to/upload/directory
DEFAULT_TOP_K=5
MAX_TOP_K=20
DEFAULT_HISTORY_SIZE=3

The system supports various configuration options through src/botforge/core/config.py:
- Vector Search: Configurable similarity thresholds and result limits
- MCP Integration: Timeout settings and retry policies
- Caching: TTL configuration for different cache types
- Performance: Connection pool sizes and async operation limits
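To illustrate how such options might be consumed, here is a minimal stdlib sketch of an environment-backed settings object. The variable names (`DEFAULT_TOP_K`, `MAX_TOP_K`) come from the .env example above, but the class layout is hypothetical; the real src/botforge/core/config.py may be structured differently (e.g. with pydantic settings).

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class VectorSearchConfig:
    # Hypothetical field names mirroring the .env keys above.
    default_top_k: int = int(os.environ.get("DEFAULT_TOP_K", "5"))
    max_top_k: int = int(os.environ.get("MAX_TOP_K", "20"))

    def clamp_top_k(self, requested: int) -> int:
        """Keep a caller-supplied top_k within the configured bounds."""
        return max(1, min(requested, self.max_top_k))
```

Clamping at the config layer keeps request handlers free of per-field bounds checks.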
import httpx
import asyncio

class BotForgeClient:
    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    async def query(self, user_id, bot_id, query, client_id="python-sdk"):
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}/vector/query-dynamic",
                json={
                    "user_id": user_id,
                    "bot_id": bot_id,
                    "client_id": client_id,
                    "query": query,
                    "model": "gpt-3.5-turbo"
                }
            )
            return response.json()

    async def register_mcp_server(self, bot_id, name, endpoint_url):
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}/mcp/register",
                json={
                    "bot_id": bot_id,
                    "name": name,
                    "endpoint_url": endpoint_url,
                    "description": f"External tools for {name}"
                }
            )
            return response.json()

# Usage example
async def main():
    client = BotForgeClient()

    # Information query (RAG)
    result = await client.query(
        user_id="user-123",
        bot_id="bot-456",
        query="What is our company return policy?"
    )
    print(f"RAG Response: {result['response']}")

    # Execution query (MCP)
    result = await client.query(
        user_id="user-123",
        bot_id="bot-456",
        query="Calculate compound interest for $1000 at 5% for 10 years"
    )
    print(f"MCP Response: {result['response']}")

asyncio.run(main())

class BotForgeClient {
    constructor(baseUrl = 'http://localhost:8000') {
        this.baseUrl = baseUrl;
    }

    async query(userId, botId, query, clientId = 'javascript-sdk') {
        const response = await fetch(`${this.baseUrl}/vector/query-dynamic`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                user_id: userId,
                bot_id: botId,
                client_id: clientId,
                query: query,
                model: 'gpt-3.5-turbo'
            })
        });
        return await response.json();
    }

    async registerMcpServer(botId, name, endpointUrl) {
        const response = await fetch(`${this.baseUrl}/mcp/register`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                bot_id: botId,
                name: name,
                endpoint_url: endpointUrl,
                description: `External tools for ${name}`
            })
        });
        return await response.json();
    }
}

// Usage
const client = new BotForgeClient();

// Information retrieval
client.query('user-123', 'bot-456', 'What are the system requirements?')
    .then(result => console.log('Info:', result.response));

// Tool execution
client.query('user-123', 'bot-456', 'Convert "hello world" to uppercase')
    .then(result => console.log('Tool:', result.response));

BotForge RAG now supports Model Context Protocol (MCP) for professional-grade tool integration, following the same patterns as Anthropic Claude and GitHub Copilot.
# Install MCP client dependencies
./scripts/dev/install_mcp_client.sh
# Test MCP integration (requires MCP server running)
./scripts/dev/test_mcp_tools.py

Option 1: Claude-Style MCP Service (Recommended)
from botforge.services.mcp_agent_service_new import MCPAgentService

async with MCPAgentService() as mcp_service:
    response = await mcp_service.query_with_mcp_agent(
        bot_id="my-bot",
        query="Use the weather tool to check temperature in Paris"
    )
    print(response)

Option 2: Refactored Original Service
from botforge.services.mcp_agent_service import MCPAgentService
service = MCPAgentService()
response = await service.query_with_mcp_agent("bot-id", "your query")

- 🔧 Proper MCP Protocol: Uses official MCP Python client
- 🚀 Dynamic Tool Discovery: Automatically detects tools from MCP servers
- 🧠 LLM-Driven Parameters: No hardcoded tool schemas required
- ⚡ Async Session Management: Efficient connection handling
- 📊 Database Integration: Server URLs fetched dynamically from DB
# Example: Weather tool with proper MCP protocol
from mcp.client.session import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def use_weather_tool():
    # streamablehttp_client yields (read_stream, write_stream, get_session_id)
    async with streamablehttp_client(url="http://localhost:3001") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List available tools
            tools = await session.list_tools()
            print(f"Available tools: {[t.name for t in tools.tools]}")

            # Execute the weather tool
            result = await session.call_tool("weather", {
                "location": "Paris",
                "units": "celsius"
            })
            return result.content[0].text

For detailed setup instructions, see docs/MCP_INTEGRATION_STATUS.md
curl -X POST "http://localhost:8000/mcp/register" \
-H "Content-Type: application/json" \
-d '{
"bot_id": "your-bot-id",
"name": "Business Tools",
"endpoint_url": "http://localhost:3001",
"description": "Customer management and notification tools"
}'

# The bot will now automatically use external tools for relevant queries
curl -X POST "http://localhost:8000/vector/query-dynamic" \
-H "Content-Type: application/json" \
-d '{
"user_id": "manager-123",
"bot_id": "your-bot-id",
"client_id": "business-app",
"query": "Look up customer information for ID 12345",
"model": "gpt-3.5-turbo"
}'

This project includes comprehensive documentation for enterprise-grade development and deployment:
- Architecture Guide - Complete system architecture, components, and data flows
- API Reference - Detailed endpoint documentation with request/response examples
- Deployment Guide - Docker, Kubernetes, and production deployment
- MCP Protocol Guide - External MCP server integration and examples
- External MCP Integration - Business tool integration patterns
- Unified Dynamic Query System - Intent-based routing details
- MCP Agent Implementation - LangChain agent development
- Intent-based Query Routing: Automatic classification between information retrieval and tool execution
- Per-bot MCP Registration: Isolated tool environments for different business contexts
- Async-first Design: High-performance async processing throughout the stack
- Professional Error Handling: Comprehensive error management with graceful degradation
- Enterprise Security: Bot-scoped access control and input validation
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ User Query │───▶│ Intent Detection│───▶│ Route Decision │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
┌───────────────────────────────┼───────────────────────────────┐
▼ ▼ │
┌─────────────────┐ ┌─────────────────┐ │
│ RAG Pipeline │ │ MCP Agent │ │
│ │ │ Pipeline │ │
│ • Vector Search │ │ │ │
│ • Context │ │ • LangChain │ │
│ • OpenAI LLM │ │ • External Tools│ │
└─────────────────┘ └─────────────────┘ │
│ │ │
▼ ▼ │
┌─────────────────┐ ┌─────────────────┐ │
│ Knowledge-based │ │ Action-based │ │
│ Response │ │ Response │ │
└─────────────────┘ └─────────────────┘ │
│
┌─────────────────┐ │
│ External MCP │◀───────────────────────---┘
│ Servers │
│ │
│ • Calculator │
│ • String Ops │
│ • Custom Tools │
└─────────────────┘
- Architecture Guide - Detailed technical architecture
- API Reference - Complete API documentation
- MCP Protocol - External server integration guide
- Deployment Guide - Production deployment instructions
Automatically classifies user queries:
- Information Retrieval: "What is X?", "How does Y work?"
- Execution: "Calculate X", "Convert Y to Z"
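The routing idea can be pictured as a simple classifier. The sketch below uses a hypothetical keyword heuristic purely for illustration; the actual intent detection is described above as context-aware and is not limited to keyword matching.

```python
# Illustrative only: a keyword heuristic standing in for the real,
# context-aware intent classifier. ACTION_HINTS is a hypothetical list.
ACTION_HINTS = ("calculate", "convert", "send", "create", "update", "look up")

def detect_intent(query: str) -> str:
    """Return 'execution' for action-like queries, 'retrieval' otherwise."""
    lowered = query.lower()
    return "execution" if any(hint in lowered for hint in ACTION_HINTS) else "retrieval"
```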
- Vector-based document search
- Contextual response generation
- Source attribution and relevance scoring
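At its core, the vector search step ranks stored document embeddings by similarity to the query embedding. A dependency-free sketch of that ranking follows; in production this work is delegated to the vector database rather than done in application code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, docs, k=5):
    """docs: list of (doc_id, embedding). Returns doc ids ranked by similarity."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

The returned ids are what source attribution is built from: each id maps back to the document chunk that contributed context.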
- LangChain-powered tool selection
- External MCP server integration
- Dynamic tool discovery and execution
- Per-bot tool registration
- Server health monitoring
- Execution logging and metrics
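Per-bot registration implies a registry keyed by bot id, so tools registered for one bot are invisible to others. Below is a minimal in-memory sketch; the real system persists registrations in PostgreSQL, and the field names here simply mirror the /mcp/register payload.

```python
from dataclasses import dataclass

@dataclass
class McpServerRecord:
    # Hypothetical record shape mirroring the /mcp/register request body.
    bot_id: str
    name: str
    endpoint_url: str
    description: str = ""

class McpRegistry:
    """Keeps MCP servers isolated per bot, so one bot never sees another's tools."""
    def __init__(self):
        self._servers: dict[str, list[McpServerRecord]] = {}

    def register(self, record: McpServerRecord) -> None:
        self._servers.setdefault(record.bot_id, []).append(record)

    def servers_for(self, bot_id: str) -> list[McpServerRecord]:
        return list(self._servers.get(bot_id, []))
```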
# User asks about concepts, definitions, explanations
query = "What are the benefits of microservices architecture?"
# → Routes to RAG pipeline
# → Returns knowledge-based response with sources

# User requests calculations, transformations, actions
query = "Convert 'hello world' to uppercase and count the words"
# → Routes to MCP agent
# → Executes external tools
# → Returns action results

# Complex queries that might need both
query = "What is the current price of Bitcoin and calculate 10% of it?"
# → Agent can use both information retrieval and calculation tools

# Database
DATABASE_URL=postgresql+asyncpg://user:pass@localhost/botforge
# Redis Cache
REDIS_URL=redis://localhost:6379
# OpenAI
OPENAI_API_KEY=sk-your-api-key-here
# Vector Model
VECTOR_MODEL_PATH=all-MiniLM-L6-v2
# Server
HOST=0.0.0.0
PORT=8000
DEBUG=false

curl -X POST "http://localhost:8000/api/mcp/register" \
-H "Content-Type: application/json" \
-d '{
"bot_id": "550e8400-e29b-41d4-a716-446655440000",
"name": "Calculator Server",
"endpoint_url": "http://localhost:3001",
"description": "Mathematical calculations and string operations"
}'

curl -X POST "http://localhost:8000/vector/query-dynamic" \
-H "Content-Type: application/json" \
-d '{
"user_id": "user-123",
"bot_id": "550e8400-e29b-41d4-a716-446655440000",
"client_id": "web-client",
"query": "What is the square root of 144?",
"model": "gpt-3.5-turbo"
}'

# System health
curl http://localhost:8000/health
# Component health
curl http://localhost:8000/health/detailed

- Query response times
- Intent detection accuracy
- Tool execution success rates
- Cache hit rates
- Database performance
- Horizontal Scaling: Load balance multiple app instances
- Database Optimization: Connection pooling and read replicas
- Caching Strategy: Multi-layer caching with Redis and CDN
- Vector Database: Upstash Vector auto-scaling capabilities
- API Authentication: JWT-based authentication (planned)
- Input Validation: Comprehensive request validation
- Rate Limiting: Per-user and per-bot rate limits
- MCP Security: Secure external server communication
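Per-user and per-bot rate limits are commonly implemented with a token bucket. The sketch below is a single-process illustration of the idea, not the project's actual implementation; a deployed system would likely back the counters with Redis so limits hold across app instances.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative sketch)."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; False means the request is throttled."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per (user_id, bot_id) pair gives the per-user and per-bot scoping listed above.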
- Health Checks: Multi-layer health monitoring
- Error Tracking: Structured logging with error aggregation
- Performance Metrics: Response times and throughput monitoring
- Business Metrics: Intent detection accuracy and tool usage analytics
- Database Replication: PostgreSQL streaming replication
- Redis Clustering: Redis Cluster for cache availability
- Circuit Breakers: Graceful degradation for external services
- Backup Strategy: Automated database and configuration backups
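A circuit breaker wraps calls to an external dependency (such as an MCP server) and fails fast once the dependency looks unhealthy, instead of letting every request wait out a timeout. A minimal sketch of the pattern, under the assumption of a simple failure-count threshold; the project's actual implementation may differ.

```python
import time

class CircuitBreaker:
    """Tiny circuit-breaker sketch: open after N consecutive failures,
    fail fast while open, retry after a cool-down period."""
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            # Half-open: allow one trial call through.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```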
We welcome contributions from the developer community! Please follow our contribution guidelines.
# Clone the repository
git clone https://github.com/your-org/botforge-rag.git
cd botforge-rag
# Install dependencies with development tools
uv sync --all-extras
# Set up pre-commit hooks for code quality
pre-commit install
# Run tests
pytest tests/ -v
# Start development server
PYTHONPATH=./src uvicorn botforge.main:app --reload --port 8000

- Python Style: Follow PEP 8 with Black formatting
- Type Hints: Full type annotations required
- Documentation: Docstrings for all public functions
- Testing: Unit tests for new features
- Error Handling: Comprehensive error handling
- Fork the repository and create a feature branch
- Implement changes with tests and documentation
- Run the test suite and ensure all checks pass
- Submit a pull request with detailed description
- Address review feedback and maintain clean commit history
- Use async/await for I/O operations
- Follow existing patterns for error handling
- Add comprehensive tests for new functionality
- Update documentation for API changes
- Consider performance implications
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: docs.botforge.ai
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@botforge.ai
Built with ❤️ by the BotForge team