(pronounced: ai·ris | aɪ.rɪs)
"AI That Opens Eyes"
Note
This project is under active development. The core software is complete and tested; the custom ESP32-CAM and its casing are designed, and optional hardware accessories are in progress.
Expected Completion: December 2025
AIris is a wearable AI assistant that helps visually impaired users find objects and understand their surroundings through real-time audio feedback. Unlike passive description tools, AIris provides active guidance: it doesn't just tell you what's there, it helps you reach it.
| Mode | What It Does | Status |
|---|---|---|
| "Find my water bottle" | Detects the object, tracks your hand, and guides you with audio until you touch it. | Working |
| Continuous awareness | Analyzes your environment and describes what's around you with safety alerts. | Testing |
```mermaid
graph TB
    subgraph "💻 Computer/Server"
        E[⚡ FastAPI<br/>Backend]
        F[🧠 AI Models<br/>YOLO26 • MediaPipe • BLIP]
        G[💬 Groq LLM<br/>Llama 3]
        H[🌐 React<br/>Frontend]
        I[📧 Email Service<br/>Guardian Alerts]
        J[📷 Built-in<br/>Webcam/Mic]
    end
    subgraph "🔌 Optional Accessories"
        A[📷 ESP32-CAM<br/>WiFi Camera]
        B[🎤 Bluetooth<br/>Microphone]
        C[🎧 Bluetooth<br/>Headphone]
    end
    A -.->|WiFi Optional| E
    B -.->|Bluetooth Optional| E
    E -.->|Bluetooth Optional| C
    J -->|Default| E
    E --> F
    F --> G
    E --> H
    E --> I
    style E fill:#009688,color:#fff
    style F fill:#4B4E9E,color:#fff
    style G fill:#C9AC78,color:#000
    style H fill:#61DAFB,color:#000
    style I fill:#C75050,color:#fff
    style J fill:#666,color:#fff
    style A fill:#E7352C,color:#fff,stroke-dasharray: 5 5
    style B fill:#00979D,color:#fff,stroke-dasharray: 5 5
    style C fill:#00979D,color:#fff,stroke-dasharray: 5 5
```
Note: Dashed lines indicate optional accessories. The system runs entirely on your computer with built-in webcam/mic by default.
```mermaid
graph LR
    A[📷 Camera] -->|Video| B[🎯 YOLO<br/>Detection]
    B -->|Objects| C[✋ MediaPipe<br/>Hand Track]
    C -->|Position| D[🧠 LLM<br/>Reasoning]
    D -->|Instructions| E[🔊 TTS<br/>Audio]
    E -->|Voice| F[👤 User]
    F -->|Speech| G[🎤 STT<br/>Whisper]
    G -->|Command| D
    style B fill:#4B4E9E,color:#fff
    style C fill:#4B4E9E,color:#fff
    style D fill:#C9AC78,color:#000
    style E fill:#009688,color:#fff
    style G fill:#009688,color:#fff
```
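The loop above (detect, track the hand, reason, speak) ultimately turns two positions into a spoken direction. A minimal sketch of that step, assuming normalized [0, 1] image coordinates such as MediaPipe reports for hand landmarks; the function and its wording are illustrative, not the project's actual implementation.

```python
def guidance_cue(object_xy, hand_xy, tolerance=0.08):
    """Turn object/hand centers into a spoken direction.

    Coordinates are normalized to [0, 1] (image x grows rightward,
    y grows downward). Within `tolerance` on both axes, the hand is
    considered on target. Illustrative logic only.
    """
    dx = object_xy[0] - hand_xy[0]
    dy = object_xy[1] - hand_xy[1]
    if abs(dx) <= tolerance and abs(dy) <= tolerance:
        return "You're there - reach forward."
    parts = []
    if abs(dx) > tolerance:
        parts.append("right" if dx > 0 else "left")
    if abs(dy) > tolerance:
        parts.append("down" if dy > 0 else "up")
    return "Move your hand " + " and ".join(parts) + "."


print(guidance_cue((0.7, 0.5), (0.3, 0.5)))  # Move your hand right.
```

In the real pipeline a cue like this would be fed to the TTS stage each frame, so the instruction updates continuously as the hand moves.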
Core Software: 100% Complete
Optional Hardware Accessories: In Progress
Note: We've designed a custom ESP32-CAM with protective casing (see the Hardware/ folder).
Note: The React frontend is a development interface. The system is fully usable by blind users through hands-free voice commands; no screen or physical buttons are required.
```mermaid
graph TD
    ROOT[📁 AIRIS] --> MAIN[⭐ AIris-System<br/>Main Application]
    ROOT --> HW[🔌 Hardware<br/>Custom ESP32-CAM]
    ROOT --> DOCS[📚 Documentation<br/>Project Docs]
    ROOT --> SW[📦 Archive<br/>Archived Experiments]
    MAIN --> BE[backend/<br/>FastAPI Server]
    MAIN --> FE[frontend/<br/>React GUI]
    SW --> EXP1[0-Inference-Experimental]
    SW --> EXP2[1-Inference-LLM]
    SW --> EXP3[2-Benchmarking]
    SW --> OLD[AIris-Final-App-Old]
    SW --> MORE[... more archives]
    style ROOT fill:#1a1a2e,color:#fff
    style MAIN fill:#C9AC78,color:#000
    style HW fill:#00979D,color:#fff
    style DOCS fill:#4B4E9E,color:#fff
    style SW fill:#333,color:#fff
```
| Folder | Purpose | Status |
|---|---|---|
| `AIris-System/` | ⭐ Main application. Start here! Contains the working FastAPI backend and React frontend | Active |
| `Hardware/` | Custom ESP32-CAM casing design & firmware | Optional |
| `Documentation/` | PRD, plans, technical docs, images | Reference |
| `Archive/` | Archived experiments and prototypes from our development journey | Archive |
📦 What's in Archive/?
These folders document our development journey: the experiments, prototypes, and iterations that led to the current implementation.
| Folder | What It Was |
|---|---|
| `0-Inference-Experimental` | Early BLIP experiments |
| `1-Inference-LLM` | First LLM integration tests |
| `2-Benchmarking` | Ollama/Raspberry Pi benchmarks |
| `3-Performance-Comparision` | Model comparison tests |
| `AIris-Core-System` | Previous core implementation |
| `AIris-Final-App-Old` | Previous app version |
| `Merged_System` | Integration experiments |
| `RSPB`, `RSPB-2` | Real-time system prototypes |
Preserved for reference and academic documentation.
```bash
# Clone the repository
git clone https://github.com/rajin-khan/AIRIS.git
cd AIRIS/AIris-System

# Follow the setup guide
cat QUICKSTART.md
```

- Python 3.10+ and Node.js 18+
- Groq API Key (free at console.groq.com)
- Camera access (laptop webcam for testing)
📖 Full setup: AIris-System/README.md
- Complete ESP32-CAM WiFi streaming
- Finalize Bluetooth mic/headphone integration (optional)
- ✅ Voice control complete (no physical buttons needed)
- Design wearable enclosure (3D print)
- Optimize Scene Description prompts
- Add guardian alert notifications
- Performance tuning for real-time streaming
- End-to-end wireless testing
- Field testing with visually impaired users
- Battery life and reliability testing
| Feature | Description |
|---|---|
| 🎯 Object Guidance | Speak an object name → get audio directions until you touch it |
| 🌍 Scene Understanding | Continuous environment awareness with fall detection |
| Fall Alerts | Automatic fall detection with guardian email notifications |
| 🎤 Hands-free Mode | Full voice control; no screen interaction required |
| 💬 Live Transcription | Real-time display of user and system speech in voice-only mode |
| 🔊 Audio Cues | Comprehensive audio feedback for all system states and actions |
| ⏰ Time-Aware Messages | Contextual greetings that adapt to time of day |
| 📧 Guardian Features | Daily/weekly summaries and configurable risk thresholds |
| 📡 ESP32 Camera Support | Support for both front-facing (webcam) and away-facing (chest-mounted ESP32) cameras |
| 🔒 Privacy First | All AI processing happens on your local server |
| Document | Description |
|---|---|
| PRD.md | Product Requirements Document |
| Idea.md | Project vision and concept |
| Plan.md | Development roadmap |
| Structure.md | Detailed project structure |
| UseCases.md | Core assistive scenarios |
| TechKnowledge.md | Technology stack details |
This project is developed by:
| Name | Institution | ID |
|---|---|---|
| Rajin Khan | North South University | 2212708042 |
| Saumik Saha Kabbya | North South University | 2211204042 |
Developed as part of CSE 499A/B at North South University, building upon the foundation of TapSense to advance accessibility technology.
