DoctorG transforms a static prediction tool into an intelligent Medical AI Assistant with a fine-tuned LLM, RAG memory, and real-time streaming.
- Fine-tuned Medical LLM (Mistral-7B + LoRA)
- RAG Memory Engine (FAISS + PostgreSQL)
- Real-time SSE Streaming
- Subscription Logic (Free/Premium tiers)
- Modern Dark UI (ChatGPT-style)
- Feedback Learning System
- Production-Ready (Docker + GPU support)
```
User → Next.js Frontend (React + Zustand)
                 ↓
       FastAPI Backend (Python + Async)
                 ↓
Medical LLM (Mistral-7B-LoRA) + RAG (FAISS)
                 ↓
          PostgreSQL + Redis
```
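For orientation, here is a minimal sketch of the streaming path through that stack, assuming a FastAPI SSE endpoint. All names below are illustrative, not the repository's actual code; the real handler streams tokens from the fine-tuned LLM instead of a stub generator.

```python
# Illustrative SSE streaming endpoint; names and payloads are assumptions.
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def fake_llm_tokens(symptoms: list[str]):
    # Stub generator standing in for the fine-tuned Mistral-7B model.
    for token in ["Possible", " causes", " include", " ..."]:
        await asyncio.sleep(0.05)
        yield token

@app.post("/api/v1/chat/predict")
async def predict(payload: dict):
    async def event_stream():
        async for token in fake_llm_tokens(payload.get("symptoms", [])):
            yield f"data: {token}\n\n"  # one SSE frame per chunk
        yield "data: [DONE]\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```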
- Python 3.10+
- Node.js 18+
- Docker & Docker Compose
- NVIDIA GPU (for training, optional for inference)
- CUDA 11.8+ (for GPU training)
- 16GB+ RAM (32GB recommended)
- 50GB+ Disk Space
```bash
git clone <your-repo>
cd doctorg

# Copy environment file
cp .env.example .env

# Edit .env with your API keys
nano .env
```

Edit the `.env` file:
```env
# Required - Add your OpenAI API key
OPENAI_API_KEY=sk-proj-your_key_here

# Database (auto-configured in Docker)
POSTGRES_PASSWORD=your_secure_password_here
JWT_SECRET=your_jwt_secret_min_32_chars

# Optional - for dataset augmentation
GOOGLE_API_KEY=your_google_key_here
PUBMED_EMAIL=your_email@example.com
```
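The backend presumably reads these through a typed settings object; a minimal sketch using pydantic-settings (field names mirror the keys above; the actual config module may differ):

```python
# Sketch only: how the .env keys above could be loaded with pydantic-settings.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str
    postgres_password: str
    jwt_secret: str
    google_api_key: str | None = None  # optional augmentation keys
    pubmed_email: str | None = None

settings = Settings()  # fails fast if a required key is missing
```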
```bash
# Build and start all services
docker-compose up --build

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```
```bash
cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run database migrations
python -c "from app.db.database import init_db; init_db()"

# Start backend server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
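The migration one-liner above calls `init_db`; a hypothetical SQLAlchemy shape for it (the repo's actual `app/db/database.py` may differ):

```python
# Hypothetical sketch of app/db/database.py, assuming SQLAlchemy.
from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()
# Illustrative connection string; the real value would come from .env
engine = create_engine("postgresql://doctorg:password@localhost:5432/doctorg")

def init_db() -> None:
    # Create every table registered on Base.metadata (no-op if already present)
    Base.metadata.create_all(bind=engine)
```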
```bash
cd frontend

# Install dependencies
npm install

# Start development server
npm run dev

# Access at http://localhost:3000
```
```bash
cd backend

# Activate virtual environment
source venv/bin/activate

# Run data preparation (converts CSV to instruction format)
python scripts/prepare_training_data.py
```

This creates:

- `backend/data/training/train.jsonl` - Training data
- `backend/data/training/val.jsonl` - Validation data
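The authoritative schema is in `scripts/prepare_training_data.py`; purely for illustration, an instruction-format record could look like this (field names assumed):

```python
# Hypothetical instruction-format record; check scripts/prepare_training_data.py
# for the actual schema used in train.jsonl / val.jsonl.
import json

record = {
    "instruction": "Analyze the symptoms and provide a structured medical assessment.",
    "input": "Symptoms: headache, fever, fatigue",
    "output": '{"possible_conditions": ["viral infection"], "urgency": "low"}',
}

with open("backend/data/training/train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")  # JSONL: one JSON object per line
```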
```bash
# Fetch PubMed abstracts and Clinical QA datasets
python scripts/web_agent.py

# This downloads:
# - PubMed medical abstracts (1000+)
# - MedQA clinical questions
# - PubMedQA dataset
```

Requirements:
- NVIDIA GPU with 16GB+ VRAM (RTX 3090, A100, etc.)
- CUDA 11.8+ installed
- PyTorch with CUDA support
```bash
# Verify GPU is available
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

# Start fine-tuning (takes 2-6 hours depending on GPU)
python scripts/train_llm.py
```

Training Configuration:
- Base Model: Mistral-7B-v0.1
- Method: LoRA (Low-Rank Adaptation)
- Epochs: 3
- Batch Size: 4 (adjust based on VRAM)
- Learning Rate: 2e-4
- Quantization: 8-bit (reduces VRAM usage)
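A minimal sketch of that configuration with transformers and peft; the target modules are an assumption (see `scripts/train_llm.py` for the authoritative setup):

```python
# Sketch of the described setup: 8-bit base model + LoRA adapters.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit quantization
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # emits the "trainable params || all params" line
```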
Expected Output:
```
Loading model: mistralai/Mistral-7B-v0.1
Model loaded successfully
LoRA configuration created
trainable params: 4,194,304 || all params: 7,241,732,096 || trainable%: 0.0579
Starting training...
Epoch 1/3: 100%|██████████| 500/500 [1:23:45<00:00]
Saving model to backend/models/doctorg-medical-llm
Training completed successfully!
```
```bash
# Test inference
python -c "
from backend.scripts.train_llm import MedicalLLMTrainer

trainer = MedicalLLMTrainer()
prompt = '''You are a medical AI assistant. Analyze the symptoms and provide a structured medical assessment.

Symptoms: headache, fever, fatigue

Provide your response in JSON format:'''

response = trainer.test_inference(prompt)
print(response)
"
```

If you don't have a local GPU:
Google Colab (Free GPU):
```bash
# Upload your code to Google Drive
# Open a Google Colab notebook
# Mount Drive and run:
!pip install -r requirements.txt
!python scripts/prepare_training_data.py
!python scripts/train_llm.py
```

AWS/GCP/Azure:
- Launch a GPU instance (e.g., g4dn.xlarge on AWS, n1-standard-4 with a T4 on GCP)
- Clone the repository
- Run the training scripts
- Download the trained model
```bash
# Build for production
docker-compose -f docker-compose.yml up --build -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

# Stop and remove volumes (clean slate)
docker-compose down -v
```

Edit `docker-compose.yml` to enable GPU:
```yaml
services:
  backend:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Then run:
```bash
# Requires nvidia-docker2 installed
docker-compose up --build
```
```bash
# Development
docker-compose -f docker-compose.yml up

# Production with GPU
docker-compose -f docker-compose.prod.yml up

# Staging
docker-compose -f docker-compose.staging.yml up
```
-H "Content-Type: application/json" \
-d '{
"email": "user@example.com",
"password": "securepassword123",
"full_name": "John Doe"
}'curl -X POST http://localhost:8000/api/v1/auth/login \
-H "Content-Type: application/json" \
-d '{
"email": "user@example.com",
"password": "securepassword123"
}'Response:
```json
{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "bearer",
  "expires_in": 3600
}
```

Via Web UI:
- Open http://localhost:3000
- Log in with your credentials
- Describe your symptoms
- Get a real-time streaming response
Via API:
```bash
curl -X POST http://localhost:8000/api/v1/chat/predict \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "symptoms": ["headache", "fever", "fatigue"]
  }'
```
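The same call from Python, as a minimal sketch using requests (the endpoint paths are the ones shown above; the SSE `data:` framing of the stream is an assumption):

```python
# Sketch: log in, then consume the streaming prediction endpoint.
import requests

BASE = "http://localhost:8000/api/v1"

# 1. Log in and grab the bearer token
resp = requests.post(f"{BASE}/auth/login", json={
    "email": "user@example.com",
    "password": "securepassword123",
})
token = resp.json()["access_token"]

# 2. Stream the prediction, printing each SSE data chunk as it arrives
with requests.post(
    f"{BASE}/chat/predict",
    json={"symptoms": ["headache", "fever", "fatigue"]},
    headers={"Authorization": f"Bearer {token}"},
    stream=True,
) as r:
    for line in r.iter_lines(decode_unicode=True):
        if line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)
```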
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{
"session_id": "session-id-here",
"rating": 5,
"helpful": true,
"comments": "Very helpful diagnosis!"
}'Free Tier:
- 5 sessions per month
- No memory/history
- Basic medical insights
Premium Tier:
- Unlimited sessions
- Full RAG memory of past consultations (see the FAISS sketch after this list)
- Detailed follow-up questions
- Priority support
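A minimal sketch of the FAISS half of that memory, with a stub `embed()` standing in for a real embedding model (the dimension and storage details are assumptions; the actual engine also persists consultation text in PostgreSQL):

```python
# Sketch: index past consultations, retrieve the closest match for a new query.
import faiss
import numpy as np

DIM = 384  # embedding dimension; depends on the embedding model used

def embed(texts: list[str]) -> np.ndarray:
    # Stub: replace with a real sentence-embedding model.
    rng = np.random.default_rng(0)
    return rng.random((len(texts), DIM), dtype=np.float32)

past_consultations = ["headache and fever last week", "persistent dry cough"]
index = faiss.IndexFlatL2(DIM)
index.add(embed(past_consultations))

# Find the single most similar past consultation to the new complaint
distances, ids = index.search(embed(["fever and fatigue"]), 1)
print(past_consultations[ids[0][0]], distances[0][0])
```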
Edit `backend/app/core/constants.py`:
```python
class SubscriptionLimits:
    FREE_SESSION_LIMIT = 5        # Change to desired limit
    PREMIUM_SESSION_LIMIT = -1    # -1 = unlimited
```
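How these limits get applied is up to the request layer; a hypothetical sketch (`check_session_quota` is an invented name, not the repo's actual code):

```python
# Hypothetical enforcement sketch for SubscriptionLimits.
from fastapi import HTTPException
from app.core.constants import SubscriptionLimits  # the class shown above

def check_session_quota(tier: str, sessions_this_month: int) -> None:
    limit = (SubscriptionLimits.PREMIUM_SESSION_LIMIT if tier == "premium"
             else SubscriptionLimits.FREE_SESSION_LIMIT)
    if limit != -1 and sessions_this_month >= limit:
        raise HTTPException(status_code=402, detail="Monthly session limit reached")
```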
```bash
cd backend
pytest tests/ -v
```
```bash
cd frontend
npm test
```
```bash
# Start all services
docker-compose up -d

# Run E2E tests
npm run test:e2e
```

```bash
curl http://localhost:8000/health
```

Response:
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "timestamp": "2026-02-15T10:30:00",
  "services": {
    "database": "connected",
    "llm": "ready",
    "rag": "ready"
  }
}
```
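For reference, a stubbed handler returning that shape might look like this (illustrative; the real endpoint presumably probes the database, LLM, and RAG services rather than hardcoding statuses):

```python
# Stubbed /health handler matching the response shape above.
from datetime import datetime, timezone
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health():
    return {
        "status": "healthy",
        "version": "1.0.0",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "services": {"database": "connected", "llm": "ready", "rag": "ready"},
    }
```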
```bash
# Backend logs
docker-compose logs -f backend

# Frontend logs
docker-compose logs -f frontend

# Database logs
docker-compose logs -f postgres
```

- ✅ No hardcoded secrets (all in `.env`)
- ✅ Bcrypt password hashing
- ✅ JWT authentication with expiration (see the sketch after this list)
- ✅ SQL injection prevention (ORM)
- ✅ XSS protection (React escaping)
- ✅ CORS configured
- ✅ Security headers enabled
- ✅ Rate limiting implemented
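As a concrete example of the JWT item, a minimal token-issuance sketch assuming python-jose and HS256 (the helper name and claims are illustrative; the repo's auth module may differ):

```python
# Illustrative JWT issuance with an expiration claim (python-jose, HS256).
from datetime import datetime, timedelta, timezone
from jose import jwt

JWT_SECRET = "your_jwt_secret_min_32_chars"  # loaded from .env in practice

def create_access_token(email: str, expires_in: int = 3600) -> str:
    claims = {
        "sub": email,
        "exp": datetime.now(timezone.utc) + timedelta(seconds=expires_in),
    }
    return jwt.encode(claims, JWT_SECRET, algorithm="HS256")
```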
- Change default passwords in `.env`
- Use a strong JWT secret (min 32 characters)
- Enable HTTPS in production
- Update dependencies regularly: `pip list --outdated`
- Back up the database regularly
```bash
# Check CUDA installation
nvidia-smi

# Check PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"

# Reinstall PyTorch with CUDA
pip install torch --index-url https://download.pytorch.org/whl/cu118
```
```bash
# Clean rebuild
docker-compose down -v
docker-compose build --no-cache
docker-compose up

# Check container logs
docker-compose logs backend
```
```bash
# Reset database
docker-compose down -v
docker-compose up postgres -d
sleep 10
docker-compose up backend
```
```bash
# Find and kill the process on port 8000
# Windows:
netstat -ano | findstr :8000
taskkill /PID <PID> /F

# Linux/Mac:
lsof -ti:8000 | xargs kill -9
```

Interactive API docs available at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License.
- Developer: Abhishek Gupta
- GitHub: @cosmos-dx
- LinkedIn: abhishek-gupta
- Mistral AI for the base model
- Hugging Face for transformers library
- OpenAI for API integration
- FastAPI and Next.js communities
For issues and questions:
- GitHub Issues: Create an issue
- Email: support@doctorg.ai
- Discord: Join our community