# AIris

*AIris Banner*

*(pronounced: ai·ris | aɪ.rɪs)*

**AI-Powered Vision Assistant for the Visually Impaired**

> "AI That Opens Eyes"


> **Note**
>
> This project is under active development. The core software is complete and tested, a custom ESP32-CAM with casing has been designed, and optional hardware accessories are in progress.
>
> **Expected completion:** December 2025


## ✨ What is AIris?

AIris is a wearable AI assistant that helps visually impaired users find objects and understand their surroundings through real-time audio feedback. Unlike passive description tools, AIris provides active guidance: it doesn't just tell you what's there, it helps you reach it.

### 🎯 Two Powerful Modes

#### Active Guidance ✅

*"Find my water bottle"*

Detects the object, tracks your hand, and guides you with audio until you touch it.

**Status:** Working

#### Scene Description 🔄

*Continuous awareness*

Analyzes your environment and describes what's around you with safety alerts.

**Status:** Testing
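The Active Guidance idea (compare the tracked hand to the detected object, speak a short correction) can be sketched in a few lines. The function name, normalized coordinates, and touch threshold below are illustrative assumptions, not AIris's actual API:

```python
# Hypothetical sketch of turning hand/object positions into short audio cues.
# Thresholds and wording are illustrative, not the real AIris implementation.

def guidance_cue(hand, target, touch_radius=0.05):
    """Return a spoken direction from normalized (x, y) hand and target centers."""
    dx = target[0] - hand[0]
    dy = target[1] - hand[1]
    if (dx * dx + dy * dy) ** 0.5 <= touch_radius:
        return "You are touching it"
    horizontal = "right" if dx > 0 else "left"
    vertical = "down" if dy > 0 else "up"
    # Speak only the dominant axis so cues stay short and easy to follow.
    if abs(dx) >= abs(dy):
        return f"Move {horizontal}"
    return f"Move {vertical}"

print(guidance_cue((0.2, 0.5), (0.8, 0.5)))  # target is to the right of the hand
```

In a real loop, each new frame would re-run detection and hand tracking, so the cue updates continuously until the touch condition fires.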


πŸ—οΈ System Architecture

System Architecture

graph TB
    subgraph "πŸ’» Computer/Server"
        E[⚑ FastAPI<br/>Backend]
        F[🧠 AI Models<br/>YOLO26 β€’ MediaPipe β€’ BLIP]
        G[πŸ’¬ Groq LLM<br/>Llama 3]
        H[🌐 React<br/>Frontend]
        I[πŸ“§ Email Service<br/>Guardian Alerts]
        J[πŸ“· Built-in<br/>Webcam/Mic]
    end
    
    subgraph "πŸ”Œ Optional Accessories"
        A[πŸ“· ESP32-CAM<br/>WiFi Camera]
        B[🎀 Bluetooth<br/>Microphone]
        C[🎧 Bluetooth<br/>Headphone]
    end
    
    A -.->|WiFi Optional| E
    B -.->|Bluetooth Optional| E
    E -.->|Bluetooth Optional| C
    J -->|Default| E
    E --> F
    F --> G
    E --> H
    E --> I
    
    style E fill:#009688,color:#fff
    style F fill:#4B4E9E,color:#fff
    style G fill:#C9AC78,color:#000
    style H fill:#61DAFB,color:#000
    style I fill:#C75050,color:#fff
    style J fill:#666,color:#fff
    style A fill:#E7352C,color:#fff,stroke-dasharray: 5 5
    style B fill:#00979D,color:#fff,stroke-dasharray: 5 5
    style C fill:#00979D,color:#fff,stroke-dasharray: 5 5
Loading

Note: Dashed lines indicate optional accessories. The system runs entirely on your computer with built-in webcam/mic by default.

### Data Flow

```mermaid
graph LR
    A[📷 Camera] -->|Video| B[🎯 YOLO<br/>Detection]
    B -->|Objects| C[✋ MediaPipe<br/>Hand Track]
    C -->|Position| D[🧠 LLM<br/>Reasoning]
    D -->|Instructions| E[🔊 TTS<br/>Audio]
    E -->|Voice| F[👤 User]
    F -->|Speech| G[🎤 STT<br/>Whisper]
    G -->|Command| D

    style B fill:#4B4E9E,color:#fff
    style C fill:#4B4E9E,color:#fff
    style D fill:#C9AC78,color:#000
    style E fill:#009688,color:#fff
    style G fill:#009688,color:#fff
```
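Per frame, this flow amounts to composing a few stage functions. A minimal sketch with stub stages (the real system calls YOLO, MediaPipe, the Groq API, and a TTS engine; every name and return value here is illustrative):

```python
# Illustrative data-flow sketch: each pipeline stage is a plain function and
# process_frame wires them together the way the diagram shows. Stage bodies
# are stubs standing in for the real model calls.

def detect_objects(frame):            # YOLO stand-in
    return [{"label": "bottle", "box": (100, 120, 60, 80)}]

def track_hand(frame):                # MediaPipe stand-in
    return {"x": 0.4, "y": 0.6}

def reason(objects, hand, command):   # LLM stand-in
    return f"Guiding you to the {command}"

def speak(text):                      # TTS stand-in; real code would play audio
    return text

def process_frame(frame, command):
    objects = detect_objects(frame)
    hand = track_hand(frame)
    instruction = reason(objects, hand, command)
    return speak(instruction)

print(process_frame(frame=None, command="bottle"))
```

Keeping stages as independent functions mirrors the diagram and makes it easy to swap one model for another without touching the rest of the loop.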

## 📊 Current Progress

| Component | Status | Progress |
|---|---|---|
| 🎯 Active Guidance Mode | ✅ Complete | 100% |
| 🔍 Scene Description Mode | ✅ Complete | 100% |
| 🎤 Handsfree Voice Mode | ✅ Complete | 100% |
| 📧 Guardian Email Alerts | ✅ Complete | 100% |
| 💬 Live Transcription | ✅ Complete | 100% |
| 🔊 Audio Cues System | ✅ Complete | 100% |
| ⏰ Time-Aware Messages | ✅ Complete | 100% |
| 📷 ESP32 Camera Support | ✅ Complete | 100% |
| ⚡ Backend API | ✅ Complete | 100% |
| 🌐 Frontend GUI | ✅ Complete | 100% |
| 🎧 Bluetooth Audio (Optional) | 🔄 In Progress | 30% |

**Core Software:** 100% complete
**Optional Hardware Accessories:** In progress


πŸ› οΈ Technology Stack

πŸ’» Software

Layer Technology
Backend FastAPI, Python 3.10+
Object Detection YOLO26s (Ultralytics)
Hand Tracking MediaPipe
Scene Analysis BLIP
LLM Reasoning Groq API (Llama 3)
Speech-to-Text Whisper (offline)
Text-to-Speech pyttsx3 (native)
Email Notifications aiosmtplib (Gmail SMTP)
Frontend React, TypeScript, Vite

πŸ”Œ Hardware

Component Technology Required?
Camera Built-in webcam (default) or Custom ESP32-CAM with casing ⭐ (recommended) No
Audio Input Built-in mic (default) or Bluetooth Microphone (optional) No
Audio Output Built-in speakers (default) or Bluetooth Headphone (optional) No
Controls Voice Commands (handsfree mode) Yes
Processing Computer/Server Yes

Note: We've designed a custom ESP32-CAM with protective casing (see Hardware/cam-casing/) β€” recommended for best handsfree experience. However, the system works perfectly with built-in hardware by default for maximum accessibility.
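The default-with-optional-accessory behaviour boils down to a simple source selection: use the ESP32-CAM stream when one is configured, otherwise the built-in webcam. The environment variable name below is a hypothetical placeholder, not AIris's actual configuration key:

```python
# Sketch of camera-source fallback. "AIRIS_ESP32_STREAM_URL" is an assumed
# variable name for illustration; check the project's own config for the real one.
import os

def pick_camera_source(env=None):
    """Return a capture source: the ESP32-CAM stream URL if configured, else device 0."""
    env = os.environ if env is None else env
    url = env.get("AIRIS_ESP32_STREAM_URL")
    # OpenCV's VideoCapture accepts either a device index or a stream URL,
    # so callers can pass this value straight through.
    return url if url else 0
```

With this shape, plugging in the optional camera is purely a configuration change; no code path needs to know which source is active.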

> **Note:** The React frontend is a development interface. The system is fully usable by blind users through handsfree voice commands; no screen or physical buttons are required.


πŸ“ Repository Structure

graph TD
    ROOT[πŸ“‚ AIRIS] --> MAIN[⭐ AIris-System<br/>Main Application]
    ROOT --> HW[πŸ”Œ Hardware<br/>Custom ESP32-CAM]
    ROOT --> DOCS[πŸ“š Documentation<br/>Project Docs]
    ROOT --> SW[πŸ“¦ Archive<br/>Archived Experiments]
    
    MAIN --> BE[backend/<br/>FastAPI Server]
    MAIN --> FE[frontend/<br/>React GUI]
    
    SW --> EXP1[0-Inference-Experimental]
    SW --> EXP2[1-Inference-LLM]
    SW --> EXP3[2-Benchmarking]
    SW --> OLD[AIris-Final-App-Old]
    SW --> MORE[... more archives]
    
    style ROOT fill:#1a1a2e,color:#fff
    style MAIN fill:#C9AC78,color:#000
    style HW fill:#00979D,color:#fff
    style DOCS fill:#4B4E9E,color:#fff
    style SW fill:#333,color:#fff
Loading

### 📂 Folder Guide

| Folder | Purpose | Status |
|---|---|---|
| AIris-System/ | ⭐ Main application (start here!): the working FastAPI backend and React frontend | Active |
| Hardware/ | Custom ESP32-CAM casing design & firmware | Optional |
| Documentation/ | PRD, plans, technical docs, images | Reference |
| Archive/ | Archived experiments and prototypes from our development journey | Archive |
### 📦 What's in Archive/?

These folders document our development journey: the experiments, prototypes, and iterations that led to the current implementation.

| Folder | What It Was |
|---|---|
| 0-Inference-Experimental | Early BLIP experiments |
| 1-Inference-LLM | First LLM integration tests |
| 2-Benchmarking | Ollama/Raspberry Pi benchmarks |
| 3-Performance-Comparision | Model comparison tests |
| AIris-Core-System | Previous core implementation |
| AIris-Final-App-Old | Previous app version |
| Merged_System | Integration experiments |
| RSPB, RSPB-2 | Real-time system prototypes |

Preserved for reference and academic documentation.


## 🚀 Quick Start

```bash
# Clone the repository
git clone https://github.com/rajin-khan/AIRIS.git
cd AIRIS/AIris-System

# Follow the setup guide
cat QUICKSTART.md
```

### Requirements

- Python 3.10+ and Node.js 18+
- Groq API key (free at console.groq.com)
- Camera access (a laptop webcam is fine for testing)
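The Groq API key is typically supplied through an environment variable. A hedged sketch of what that file might look like (the variable name and file location are assumptions; check AIris-System/README.md for the exact setup):

```bash
# AIris-System/backend/.env (exact filename/location: see the setup guide)
GROQ_API_KEY=your-key-here
```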

📖 **Full setup:** AIris-System/README.md


## 📋 What's Left To Do

### 🔌 Hardware Integration (Current Focus)

- Complete ESP32-CAM WiFi streaming
- Finalize Bluetooth mic/headphone integration (optional)
- ✅ Voice control complete (no physical buttons needed)
- Design wearable enclosure (3D print)

### 🔧 Software Refinement

- Optimize Scene Description prompts
- Add guardian alert notifications
- Performance tuning for real-time streaming

### ✅ Testing & Validation

- End-to-end wireless testing
- Field testing with visually impaired users
- Battery life and reliability testing

## 🌟 Key Features

| Feature | Description |
|---|---|
| 🎯 Object Guidance | Speak an object name → get audio directions until you touch it |
| 🔍 Scene Understanding | Continuous environment awareness with fall detection |
| ⚠️ Safety Alerts | Automatic fall detection with guardian email notifications |
| 🎤 Handsfree Mode | Full voice control; no screen interaction required |
| 💬 Live Transcription | Real-time display of user and system speech in voice-only mode |
| 🔊 Audio Cues | Comprehensive audio feedback for all system states and actions |
| ⏰ Time-Aware Messages | Contextual greetings that adapt to the time of day |
| 📧 Guardian Features | Daily/weekly summaries and configurable risk thresholds |
| 📡 ESP32 Camera Support | Works with both front-facing (webcam) and away-facing (chest-mounted ESP32) cameras |
| 🔒 Privacy First | All AI processing happens on your local server |
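As one example of the guardian features, an alert email can be assembled with the standard library before being handed to aiosmtplib for delivery. The addresses, subject, and wording below are illustrative placeholders, not the actual AIris templates:

```python
# Sketch of building a guardian alert message. AIris delivers mail via
# aiosmtplib over Gmail SMTP; this only shows constructing the message.
from email.message import EmailMessage

def build_alert(guardian_addr, event="Possible fall detected"):
    """Assemble a guardian alert email; sender and wording are illustrative."""
    msg = EmailMessage()
    msg["From"] = "airis-alerts@example.com"   # placeholder sender address
    msg["To"] = guardian_addr
    msg["Subject"] = f"AIris alert: {event}"
    msg.set_content(f"{event}. Please check in with the user.")
    return msg
```

Delivery would then be a single awaited call to aiosmtplib's `send` with the Gmail SMTP host and credentials.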

## 📚 Documentation

| Document | Description |
|---|---|
| PRD.md | Product Requirements Document |
| Idea.md | Project vision and concept |
| Plan.md | Development roadmap |
| Structure.md | Detailed project structure |
| UseCases.md | Core assistive scenarios |
| TechKnowledge.md | Technology stack details |

## 👥 Development Team

This project is developed by:

| Name | Institution | ID |
|---|---|---|
| Rajin Khan | North South University | 2212708042 |
| Saumik Saha Kabbya | North South University | 2211204042 |

Developed as part of CSE 499A/B at North South University, building upon the foundation of TapSense to advance accessibility technology.

