# AIris

*AIris Banner*

*(pronounced: ai·ris | aɪ.rɪs)*

**AI-Powered Vision Assistant for the Visually Impaired**

> "AI That Opens Eyes"


> **Note**
>
> This project is under active development. The core software is complete and tested, a custom ESP32-CAM with casing has been designed, and optional hardware accessories are in progress.
>
> **Expected completion:** December 2025


## ✨ What is AIris?

AIris is a wearable AI assistant that helps visually impaired users find objects and understand their surroundings through real-time audio feedback. Unlike passive description tools, AIris provides active guidance: it doesn't just tell you what's there, it helps you reach it.

### 🎯 Two Powerful Modes

#### Active Guidance ✅

*"Find my water bottle"*

Detects the object, tracks your hand, and guides you with audio until you touch it.

**Status:** Working

#### Scene Description 🔄

*Continuous awareness*

Analyzes your environment and describes what's around you with safety alerts.

**Status:** Testing
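The Active Guidance idea (compare the tracked hand to the detected object, speak a short correction) can be sketched in a few lines. The function name, normalized coordinates, and touch threshold below are illustrative assumptions, not AIris's actual API:

```python
# Hypothetical sketch of turning hand/object positions into short audio cues.
# Thresholds and wording are illustrative, not the real AIris implementation.

def guidance_cue(hand, target, touch_radius=0.05):
    """Return a spoken direction from normalized (x, y) hand and target centers."""
    dx = target[0] - hand[0]
    dy = target[1] - hand[1]
    if (dx * dx + dy * dy) ** 0.5 <= touch_radius:
        return "You are touching it"
    horizontal = "right" if dx > 0 else "left"
    vertical = "down" if dy > 0 else "up"
    # Speak only the dominant axis so cues stay short and easy to follow.
    if abs(dx) >= abs(dy):
        return f"Move {horizontal}"
    return f"Move {vertical}"

print(guidance_cue((0.2, 0.5), (0.8, 0.5)))  # target is to the right of the hand
```

In a real loop, each new frame would re-run detection and hand tracking, so the cue updates continuously until the touch condition fires.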


πŸ—οΈ System Architecture

System Architecture

graph TB
    subgraph "πŸ’» Computer/Server"
        E[⚑ FastAPI<br/>Backend]
        F[🧠 AI Models<br/>YOLO26 β€’ MediaPipe β€’ BLIP]
        G[πŸ’¬ Groq LLM<br/>Llama 3]
        H[🌐 React<br/>Frontend]
        I[πŸ“§ Email Service<br/>Guardian Alerts]
        J[πŸ“· Built-in<br/>Webcam/Mic]
    end
    
    subgraph "πŸ”Œ Optional Accessories"
        A[πŸ“· ESP32-CAM<br/>WiFi Camera]
        B[🎀 Bluetooth<br/>Microphone]
        C[🎧 Bluetooth<br/>Headphone]
    end
    
    A -.->|WiFi Optional| E
    B -.->|Bluetooth Optional| E
    E -.->|Bluetooth Optional| C
    J -->|Default| E
    E --> F
    F --> G
    E --> H
    E --> I
    
    style E fill:#009688,color:#fff
    style F fill:#4B4E9E,color:#fff
    style G fill:#C9AC78,color:#000
    style H fill:#61DAFB,color:#000
    style I fill:#C75050,color:#fff
    style J fill:#666,color:#fff
    style A fill:#E7352C,color:#fff,stroke-dasharray: 5 5
    style B fill:#00979D,color:#fff,stroke-dasharray: 5 5
    style C fill:#00979D,color:#fff,stroke-dasharray: 5 5
Loading

Note: Dashed lines indicate optional accessories. The system runs entirely on your computer with built-in webcam/mic by default.

### Data Flow

```mermaid
graph LR
    A[📷 Camera] -->|Video| B[🎯 YOLO<br/>Detection]
    B -->|Objects| C[✋ MediaPipe<br/>Hand Track]
    C -->|Position| D[🧠 LLM<br/>Reasoning]
    D -->|Instructions| E[🔊 TTS<br/>Audio]
    E -->|Voice| F[👤 User]
    F -->|Speech| G[🎤 STT<br/>Whisper]
    G -->|Command| D

    style B fill:#4B4E9E,color:#fff
    style C fill:#4B4E9E,color:#fff
    style D fill:#C9AC78,color:#000
    style E fill:#009688,color:#fff
    style G fill:#009688,color:#fff
```
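Per frame, this flow amounts to composing a few stage functions. A minimal sketch with stub stages (the real system calls YOLO, MediaPipe, the Groq API, and a TTS engine; every name and return value here is illustrative):

```python
# Illustrative data-flow sketch: each pipeline stage is a plain function and
# process_frame wires them together the way the diagram shows. Stage bodies
# are stubs standing in for the real model calls.

def detect_objects(frame):            # YOLO stand-in
    return [{"label": "bottle", "box": (100, 120, 60, 80)}]

def track_hand(frame):                # MediaPipe stand-in
    return {"x": 0.4, "y": 0.6}

def reason(objects, hand, command):   # LLM stand-in
    return f"Guiding you to the {command}"

def speak(text):                      # TTS stand-in; real code would play audio
    return text

def process_frame(frame, command):
    objects = detect_objects(frame)
    hand = track_hand(frame)
    instruction = reason(objects, hand, command)
    return speak(instruction)

print(process_frame(frame=None, command="bottle"))
```

Keeping stages as independent functions mirrors the diagram and makes it easy to swap one model for another without touching the rest of the loop.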

## 📊 Current Progress

| Component | Status | Progress |
|---|---|---|
| 🎯 Active Guidance Mode | ✅ Complete | 100% |
| 🔍 Scene Description Mode | ✅ Complete | 100% |
| 🎤 Handsfree Voice Mode | ✅ Complete | 100% |
| 📧 Guardian Email Alerts | ✅ Complete | 100% |
| 💬 Live Transcription | ✅ Complete | 100% |
| 🔊 Audio Cues System | ✅ Complete | 100% |
| ⏰ Time-Aware Messages | ✅ Complete | 100% |
| 📷 ESP32 Camera Support | ✅ Complete | 100% |
| ⚡ Backend API | ✅ Complete | 100% |
| 🌐 Frontend GUI | ✅ Complete | 100% |
| 🎧 Bluetooth Audio (Optional) | 🔄 In Progress | 30% |

**Core Software:** 100% complete
**Optional Hardware Accessories:** In progress


πŸ› οΈ Technology Stack

πŸ’» Software

Layer Technology
Backend FastAPI, Python 3.10+
Object Detection YOLO26s (Ultralytics)
Hand Tracking MediaPipe
Scene Analysis BLIP
LLM Reasoning Groq API (Llama 3)
Speech-to-Text Whisper (offline)
Text-to-Speech pyttsx3 (native)
Email Notifications aiosmtplib (Gmail SMTP)
Frontend React, TypeScript, Vite

πŸ”Œ Hardware

Component Technology Required?
Camera Built-in webcam (default) or Custom ESP32-CAM with casing ⭐ (recommended) No
Audio Input Built-in mic (default) or Bluetooth Microphone (optional) No
Audio Output Built-in speakers (default) or Bluetooth Headphone (optional) No
Controls Voice Commands (handsfree mode) Yes
Processing Computer/Server Yes

Note: We've designed a custom ESP32-CAM with protective casing (see Hardware/cam-casing/) β€” recommended for best handsfree experience. However, the system works perfectly with built-in hardware by default for maximum accessibility.
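The default-with-optional-accessory behaviour boils down to a simple source selection: use the ESP32-CAM stream when one is configured, otherwise the built-in webcam. The environment variable name below is a hypothetical placeholder, not AIris's actual configuration key:

```python
# Sketch of camera-source fallback. "AIRIS_ESP32_STREAM_URL" is an assumed
# variable name for illustration; check the project's own config for the real one.
import os

def pick_camera_source(env=None):
    """Return a capture source: the ESP32-CAM stream URL if configured, else device 0."""
    env = os.environ if env is None else env
    url = env.get("AIRIS_ESP32_STREAM_URL")
    # OpenCV's VideoCapture accepts either a device index or a stream URL,
    # so callers can pass this value straight through.
    return url if url else 0
```

With this shape, plugging in the optional camera is purely a configuration change; no code path needs to know which source is active.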

> **Note:** The React frontend is a development interface. The system is fully usable by blind users through handsfree voice commands; no screen or physical buttons are required.


πŸ“ Repository Structure

graph TD
    ROOT[πŸ“‚ AIRIS] --> MAIN[⭐ AIris-System<br/>Main Application]
    ROOT --> HW[πŸ”Œ Hardware<br/>Custom ESP32-CAM]
    ROOT --> DOCS[πŸ“š Documentation<br/>Project Docs]
    ROOT --> SW[πŸ“¦ Archive<br/>Archived Experiments]
    
    MAIN --> BE[backend/<br/>FastAPI Server]
    MAIN --> FE[frontend/<br/>React GUI]
    
    SW --> EXP1[0-Inference-Experimental]
    SW --> EXP2[1-Inference-LLM]
    SW --> EXP3[2-Benchmarking]
    SW --> OLD[AIris-Final-App-Old]
    SW --> MORE[... more archives]
    
    style ROOT fill:#1a1a2e,color:#fff
    style MAIN fill:#C9AC78,color:#000
    style HW fill:#00979D,color:#fff
    style DOCS fill:#4B4E9E,color:#fff
    style SW fill:#333,color:#fff
Loading

### 📂 Folder Guide

| Folder | Purpose | Status |
|---|---|---|
| AIris-System/ | ⭐ Main application (start here!): the working FastAPI backend and React frontend | Active |
| Hardware/ | Custom ESP32-CAM casing design & firmware | Optional |
| Documentation/ | PRD, plans, technical docs, images | Reference |
| Archive/ | Archived experiments and prototypes from our development journey | Archive |
### 📦 What's in Archive/?

These folders document our development journey: the experiments, prototypes, and iterations that led to the current implementation.

| Folder | What It Was |
|---|---|
| 0-Inference-Experimental | Early BLIP experiments |
| 1-Inference-LLM | First LLM integration tests |
| 2-Benchmarking | Ollama/Raspberry Pi benchmarks |
| 3-Performance-Comparision | Model comparison tests |
| AIris-Core-System | Previous core implementation |
| AIris-Final-App-Old | Previous app version |
| Merged_System | Integration experiments |
| RSPB, RSPB-2 | Real-time system prototypes |

Preserved for reference and academic documentation.


## 🚀 Quick Start

```bash
# Clone the repository
git clone https://github.com/rajin-khan/AIRIS.git
cd AIRIS/AIris-System

# Follow the setup guide
cat QUICKSTART.md
```

### Requirements

- Python 3.10+ and Node.js 18+
- Groq API key (free at console.groq.com)
- Camera access (a laptop webcam is fine for testing)
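The Groq API key is typically supplied through an environment variable. A hedged sketch of what that file might look like (the variable name and file location are assumptions; check AIris-System/README.md for the exact setup):

```bash
# AIris-System/backend/.env (exact filename/location: see the setup guide)
GROQ_API_KEY=your-key-here
```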

📖 **Full setup:** AIris-System/README.md


## 📋 What's Left To Do

### 🔌 Hardware Integration (Current Focus)

- Complete ESP32-CAM WiFi streaming
- Finalize Bluetooth mic/headphone integration (optional)
- ✅ Voice control complete (no physical buttons needed)
- Design wearable enclosure (3D print)

### 🔧 Software Refinement

- Optimize Scene Description prompts
- Add guardian alert notifications
- Performance tuning for real-time streaming

### ✅ Testing & Validation

- End-to-end wireless testing
- Field testing with visually impaired users
- Battery life and reliability testing

## 🌟 Key Features

| Feature | Description |
|---|---|
| 🎯 Object Guidance | Speak an object name → get audio directions until you touch it |
| 🔍 Scene Understanding | Continuous environment awareness with fall detection |
| ⚠️ Safety Alerts | Automatic fall detection with guardian email notifications |
| 🎤 Handsfree Mode | Full voice control; no screen interaction required |
| 💬 Live Transcription | Real-time display of user and system speech in voice-only mode |
| 🔊 Audio Cues | Comprehensive audio feedback for all system states and actions |
| ⏰ Time-Aware Messages | Contextual greetings that adapt to the time of day |
| 📧 Guardian Features | Daily/weekly summaries and configurable risk thresholds |
| 📡 ESP32 Camera Support | Works with both front-facing (webcam) and away-facing (chest-mounted ESP32) cameras |
| 🔒 Privacy First | All AI processing happens on your local server |
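As one example of the guardian features, an alert email can be assembled with the standard library before being handed to aiosmtplib for delivery. The addresses, subject, and wording below are illustrative placeholders, not the actual AIris templates:

```python
# Sketch of building a guardian alert message. AIris delivers mail via
# aiosmtplib over Gmail SMTP; this only shows constructing the message.
from email.message import EmailMessage

def build_alert(guardian_addr, event="Possible fall detected"):
    """Assemble a guardian alert email; sender and wording are illustrative."""
    msg = EmailMessage()
    msg["From"] = "airis-alerts@example.com"   # placeholder sender address
    msg["To"] = guardian_addr
    msg["Subject"] = f"AIris alert: {event}"
    msg.set_content(f"{event}. Please check in with the user.")
    return msg
```

Delivery would then be a single awaited call to aiosmtplib's `send` with the Gmail SMTP host and credentials.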

## 📚 Documentation

| Document | Description |
|---|---|
| PRD.md | Product Requirements Document |
| Idea.md | Project vision and concept |
| Plan.md | Development roadmap |
| Structure.md | Detailed project structure |
| UseCases.md | Core assistive scenarios |
| TechKnowledge.md | Technology stack details |

## 👥 Development Team

This project is developed by:

| Name | Institution | ID |
|---|---|---|
| Rajin Khan | North South University | 2212708042 |
| Saumik Saha Kabbya | North South University | 2211204042 |

Developed as part of CSE 499A/B at North South University, building upon the foundation of TapSense to advance accessibility technology.

