An AI-powered electrical engineering homework solver that analyzes textbook PDFs, understands questions with images, and generates detailed step-by-step solutions with circuit diagrams using schemdraw.
- 📚 PDF Analysis: Extract relevant pages from textbooks using the ColPali vision-retrieval model
- 🔍 Multi-modal Question Analysis: Process textual questions and circuit images using Qwen2.5-VL
- ⚡ Circuit Diagram Generation: Generate schemdraw code with DeepSeek-Coder and render it to a PNG circuit diagram
- 🌐 Streamlit Web Interface: Tabbed interface with solution, generated code, and metadata
- 🔄 Cloud Processing: All AI models run on Modal cloud infrastructure, with vLLM serving the generative models for fast inference
- ⏱️ Performance Optimized: Warm containers, parallel processing, and comprehensive logging
The system uses Modal cloud functions with three AI models; the two generative models are served via vLLM for fast inference:
- ColPali (GPU: T4) - PDF page retrieval and ranking
- Qwen2.5-VL (GPU: A10, vLLM) - Vision-language understanding and solution generation
- DeepSeek-Coder (GPU: A10, vLLM) - Python code generation for schemdraw circuits
flowchart TD
A["🌐 Streamlit Frontend<br/>User uploads PDF + images + question"] --> B["📡 Modal Function<br/>solve_ee_problem()"]
B --> C["🖼️ Image Processing<br/>bytes to base64 for Qwen"]
%% Parallel Processing Block
C --> D["⚡ Parallel Execution"]
D --> E["📚 ColPali Model<br/>GPU: T4"]
D --> F["🚀 vLLM Servers Setup"]
E --> G["📄 PDF Processing<br/>index_pdf_from_bytes()"]
G --> H["🔍 Page Retrieval<br/>get_top_k_pages(k=3)"]
F --> I["🧠 Qwen Server<br/>serve_qwen() - A10"]
F --> J["💻 DeepSeek Server<br/>serve_deepseek() - A10"]
%% Sequential Processing
H --> K["📖 Qwen Analysis<br/>analyze_question_with_qwen_url()"]
I --> K
C --> K
K --> L["📝 Text Solution Generated<br/>Step-by-step explanation"]
L --> M["🔧 DeepSeek Code Gen<br/>generate_circuit_code()"]
J --> M
M --> N["🐍 Python Code<br/>schemdraw circuit code"]
N --> O["⚙️ Circuit Generator<br/>run_generated_code()"]
O --> P["🖼️ PNG Circuit Diagram<br/>Base64 encoded"]
P --> Q["📊 Final Response<br/>{textual_solution, circuit_diagram, metadata}"]
Q --> R["🌐 Streamlit Display<br/>Tabbed interface with LaTeX rendering"]
%% Styling
classDef frontend fill:#e1f5fe
classDef modal fill:#f3e5f5
classDef ai fill:#fff3e0
classDef processing fill:#e8f5e8
classDef output fill:#fce4ec
class A,R frontend
class B,C,D modal
class E,I,J,K,M ai
class G,H,O processing
class L,N,P,Q output
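
The parallel block in the flowchart (PDF retrieval running alongside vLLM server startup, followed by the sequential solution steps) can be sketched roughly as below. This is a minimal local illustration, not the project's `modal_app.py`: the helpers `index_and_retrieve` and `start_vllm_servers` are hypothetical stand-ins for the remote Modal calls, the server URLs are placeholders, and `ThreadPoolExecutor` is an assumed substitute for whatever concurrency mechanism the orchestrator actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def index_and_retrieve(pdf_bytes: bytes, question: str, k: int = 3) -> list[int]:
    """Hypothetical stand-in: index the PDF and return the top-k page numbers."""
    return list(range(1, k + 1))


def start_vllm_servers() -> dict[str, str]:
    """Hypothetical stand-in: start both vLLM servers and return their URLs."""
    return {"qwen": "http://qwen.example", "deepseek": "http://deepseek.example"}


def solve(pdf_bytes: bytes, question: str) -> dict:
    # Parallel block: page retrieval and server startup do not depend on each other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        pages_future = pool.submit(index_and_retrieve, pdf_bytes, question)
        servers_future = pool.submit(start_vllm_servers)
        pages = pages_future.result()
        servers = servers_future.result()

    # Sequential block: the text solution needs the pages, and the diagram
    # needs the text solution. Both are placeholders here.
    solution = f"Solution for {question!r} using pages {pages}"
    return {
        "textual_solution": solution,
        "circuit_diagram": None,  # a base64 PNG in the real pipeline
        "metadata": {"pages": pages, "servers": sorted(servers)},
    }
```

Calling `solve(b"%PDF", "Find the Thevenin equivalent")` returns a dictionary with the same three keys as the final response node in the flowchart.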
ee-tutor-system/
├── config/
│   ├── settings.py               # Model configurations and constants
│   ├── modal_config.py           # Modal cloud setup and shared image
│   ├── deepseek_prompt.txt       # DeepSeek prompt template with 6 examples
│   └── qwen_instructions.txt     # Qwen system instructions for EE tutoring
├── src/
│   ├── models/
│   │   ├── colpali_model.py      # ColPali PDF retrieval with T4 GPU
│   │   ├── qwen_model.py         # Qwen2.5-VL analysis with A10 GPU + vLLM
│   │   └── deepseek_model.py     # DeepSeek code generation with A10 GPU + vLLM
│   └── services/
│       └── circuit_generator.py  # Safe schemdraw code execution
├── frontend/
│   └── streamlit_app.py          # Web interface with tabbed results
├── modal_app.py                  # Main orchestrator function
├── demo_img.png                  # Demo circuit diagram
├── requirements.txt              # Python dependencies
└── README.md                     # This file
- Clone repository:
    git clone https://github.com/neha-nambiar/ee-homework-solver ee-tutor-system
    cd ee-tutor-system
- Install dependencies:
    pip install -r requirements.txt
- Set up Modal:
    modal token new
- Deploy to Modal:
    modal deploy modal_app.py
- Start Streamlit app:
    streamlit run frontend/streamlit_app.py
- Use the interface:
  - Upload a textbook PDF (required)
  - Optionally upload question images
  - Enter your electrical engineering question
  - Click "Generate Solution"
  - View results in three tabs:
    - Solution: LaTeX-formatted solution + circuit diagram
    - Generated Code: Python schemdraw code
    - Metadata: Processing details
- Warm Containers: `min_containers=1` keeps models loaded
- Parallel Processing: PDF indexing runs parallel to server startup
- Persistent Volumes: Cache models between deployments

- ColPali: `vidore/colpali-v1.3` on T4 GPU for PDF retrieval
- Qwen2.5-VL: `Qwen/Qwen2.5-VL-3B-Instruct` on A10 GPU for multimodal analysis
- DeepSeek: `deepseek-ai/deepseek-coder-1.3b-instruct` on A10 GPU for code generation
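
One way to keep these model choices in a single place is a small constants module. The sketch below is an assumption about how such a module could look, not the contents of `config/settings.py`: the `ModelConfig` class, the `MODELS` dictionary, the `use_vllm` flag, and `TOP_K_PAGES` are hypothetical names. Only the model IDs, GPU labels, and the top-3 page count come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for one model's deployment settings."""
    model_id: str   # Hugging Face model identifier
    gpu: str        # GPU label as written in this README
    use_vllm: bool  # whether the model is served through vLLM


MODELS = {
    "colpali": ModelConfig("vidore/colpali-v1.3", "T4", use_vllm=False),
    "qwen": ModelConfig("Qwen/Qwen2.5-VL-3B-Instruct", "A10", use_vllm=True),
    "deepseek": ModelConfig(
        "deepseek-ai/deepseek-coder-1.3b-instruct", "A10", use_vllm=True
    ),
}

# Number of textbook pages passed to the vision-language model as context.
TOP_K_PAGES = 3
```

Freezing the dataclass keeps the settings read-only at runtime, which makes accidental changes from other modules fail loudly.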
- Comprehensive logging with step-by-step timing
- Health checks for vLLM servers with retry logic
- Safe code execution environment for schemdraw
- Graceful fallbacks for JSON parsing and code extraction
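
Two of the items above, health checks with retry logic and a fallback for code extraction, can be illustrated with a short sketch. It assumes nothing about the project's real implementation: `wait_until_healthy`, `extract_code`, and the `probe` callable are hypothetical names, and the probe is left abstract so that any HTTP client (for example an `httpx` request against a server's health endpoint) could be plugged in.

```python
import re
import time
from typing import Callable

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def wait_until_healthy(
    probe: Callable[[], bool], retries: int = 5, delay_s: float = 1.0
) -> bool:
    """Call probe() until it returns True or the retries run out."""
    for attempt in range(retries):
        try:
            if probe():
                return True
        except Exception:
            pass  # treat errors such as a refused connection as "not ready yet"
        if attempt < retries - 1:
            time.sleep(delay_s)
    return False


_CODE_BLOCK = re.compile(FENCE + r"(?:python)?\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply: str) -> str:
    """Return the first fenced code block, or the whole reply as a fallback."""
    match = _CODE_BLOCK.search(reply)
    return (match.group(1) if match else reply).strip()
```

The fallback in `extract_code` covers the case where the model returns bare code without a fence; whether that is the right behavior depends on how strictly the downstream executor validates its input.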
- Image Processing: Convert user uploads to base64 for multimodal input
- PDF Indexing: ColPali converts PDF pages to embeddings
- Page Retrieval: Find top-3 relevant pages using semantic similarity
- Solution Generation: Qwen2.5-VL analyzes question + images + PDF context
- Code Generation: DeepSeek generates schemdraw code using few-shot examples
- Circuit Rendering: Execute Python code safely to generate PNG diagram
- Response Assembly: Combine text solution, circuit image, and metadata
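
The first and third steps of this flow are simple enough to sketch in a few lines. The code below is illustrative only: `to_data_url` and `top_k_pages` are hypothetical helper names, the data-URL form is an assumed way of passing base64 images to a vision-language model, and the scores are taken as an already-computed list of per-page similarity values rather than real ColPali output.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def top_k_pages(scores: list[float], k: int = 3) -> list[int]:
    """Return the indices of the k highest-scoring pages, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

For example, `top_k_pages([0.1, 0.9, 0.5, 0.7])` returns `[1, 3, 2]`: page index 1 scores highest, followed by 3 and 2.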
- `modal` - Serverless cloud compute platform
- `vllm==0.9.1` - High-performance LLM inference server
- `streamlit` - Interactive web application framework

- `colpali_engine` - Vision-based PDF retrieval
- `transformers` - Hugging Face model loading
- `qwen_vl_utils` - Qwen vision-language utilities
- `torch` - PyTorch deep learning framework

- `schemdraw` - Circuit diagram generation
- `matplotlib` - Plot rendering backend
- `pdf2image` - PDF to image conversion
- `Pillow` - Image processing
- `httpx` - HTTP client for API calls