๐ŸŒŸ Lumiere: Multi-agent RAG system with semantic memory. Combines LangGraph, Qdrant vector search, and OpenAI for intelligent document Q&A, SQL data analysis, and context-aware conversations. Features long-term learning, critic validation, and full observability.


๐ŸŒŸ Lumiere โ€” Agentic RAG Knowledge Workspace

An intelligent multi-agent system combining RAG, SQL data analysis, and semantic memory for context-aware interactions with complete observability


Lumiere Architecture


๐ŸŽฏ Project Vision

Lumiere is an open-source, agentic RAG knowledge workspace that uses multi-agent reasoning, long- and short-term memory, Qdrant Cloud for vector storage, and complete observability via LangSmith.

Lumiere transforms traditional Q&A systems into an intelligent assistant that learns and adapts through semantic memory, supporting multiple interaction modes:

  • ๐Ÿ“š RAG Mode: Document-grounded responses with semantic search + reranking
  • ๐Ÿ“Š Data Analyst Mode: SQL queries with automated visualizations
  • ๐Ÿ’ฌ General Chat: Conversational AI with context awareness
  • ๐Ÿง  Semantic Memory: Long-term learning from past interactions
  • ๐Ÿ‘ค User Isolation: Complete data separation per user

โœจ Key Features

๐Ÿค– 9-Node Multi-Agent Architecture

  • Intent Node: Classifies queries, retrieves memories, and routes intelligently
  • Retrieve Node: Vector search with CrossEncoder reranking
  • Reason Node: Generates grounded RAG answers
  • General Reason Node: Fallback for general knowledge
  • SQL Execute Node: Generates and runs database queries
  • SQL Reason Node: Interprets SQL results
  • Visualize Node: Creates data visualizations (data_analyst mode)
  • Critic Node: Validates answer quality before storage
  • Memory Write Node: Stores conversations in semantic memory

๐Ÿง  Semantic Memory System

  • Long-term memory stored in Qdrant Cloud vector database
  • Automatic learning from successful interactions
  • Context-aware responses using past conversations
  • Quality filtering via critic node (only ACCEPT decisions stored)
  • Cross-session continuity for personalized experiences
  • User-specific collections for complete data isolation

๐Ÿ“Š Data Analysis & Visualization

  • Natural language to SQL query generation
  • Automated chart creation (bar, line, pie, scatter, table)
  • Interactive visualizations with Plotly
  • Multi-table support with user-specific SQLite databases
  • User isolation: each user has a separate database file
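The per-user database scheme above can be sketched with the standard library. The helper names here are illustrative, not Lumiere's actual API:

```python
import sqlite3
from pathlib import Path

def get_user_db_path(user_id: str, base_dir: str = ".") -> Path:
    # Matches the lumiere_user_{user_id}.db naming scheme described above.
    return Path(base_dir) / f"lumiere_user_{user_id}.db"

def run_user_query(user_id: str, sql: str, params: tuple = ()) -> list:
    # Every connection targets the caller's own file, so queries from
    # one user can never touch another user's tables.
    with sqlite3.connect(get_user_db_path(user_id)) as conn:
        return conn.execute(sql, params).fetchall()
```

Because isolation happens at the file level, the generated SQL needs no `WHERE user_id = ?` filtering.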

๐Ÿ” Advanced RAG

  • Hybrid chunking with semantic overlap
  • Vector similarity search with OpenAI text-embedding-3-small
  • CrossEncoder reranking (ms-marco-MiniLM-L-6-v2)
  • Metadata filtering for precise retrieval
  • Source attribution for transparency
  • Pronoun resolution for conversational context
  • User-specific document collections in Qdrant Cloud
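The retrieve-then-rerank step can be sketched generically. Here `score_fn` stands in for the CrossEncoder's predict call so the sketch stays dependency-free; in Lumiere the scorer would be ms-marco-MiniLM-L-6-v2:

```python
from typing import Callable, Sequence

def rerank(query: str,
           candidates: Sequence[str],
           score_fn: Callable[[str, str], float],
           top_k: int = 3) -> list:
    # Score each (query, passage) pair, then keep the top_k passages.
    scored = sorted(((score_fn(query, c), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

The same function works with any scorer, which makes the reranking stage easy to unit-test with a cheap stand-in before wiring in the real model.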

๐Ÿ“ˆ Complete Observability with LangSmith

  • Automatic tracing for all LangChain/LangGraph operations
  • Zero manual instrumentation required
  • Full trace replay for debugging
  • Performance metrics (latency, tokens, costs)
  • Session tracking via user_id/session_id
  • Error monitoring and alerting
  • Token usage tracking per operation

๐Ÿ‘ค User Data Isolation

  • Separate Qdrant collections per user: user_{user_id}_documents, user_{user_id}_memories
  • Separate SQLite databases per user: lumiere_user_{user_id}.db
  • Session-based user IDs (UUID per session)
  • Zero cross-user data leakage
  • Multi-tenant architecture ready for production
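The naming conventions above fit in one small helper (illustrative, not the project's actual module):

```python
def user_namespaces(user_id: str) -> dict:
    # Per-user resource names, following the conventions listed above.
    return {
        "documents_collection": f"user_{user_id}_documents",
        "memories_collection": f"user_{user_id}_memories",
        "sqlite_db": f"lumiere_user_{user_id}.db",
    }
```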

๐Ÿ—๏ธ Architecture

System Overview

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   User      โ”‚
โ”‚  (Streamlit)โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”˜
       โ”‚
       โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚         LangGraph Workflow (9 Nodes)         โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚  โ”‚  intent โ†’ [retrieve|sql_execute|     โ”‚   โ”‚
โ”‚  โ”‚           general_reason]             โ”‚   โ”‚
โ”‚  โ”‚     โ†“           โ†“           โ†“         โ”‚   โ”‚
โ”‚  โ”‚  reason    sql_reason  general_reasonโ”‚   โ”‚
โ”‚  โ”‚     โ†“           โ†“           โ†“         โ”‚   โ”‚
โ”‚  โ”‚  [visualize] โ†’ critic โ†’ memory_write โ”‚   โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
            โ”‚             โ”‚
    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”   โ”Œโ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
    โ”‚ Qdrant Cloudโ”‚   โ”‚  SQLite (per  โ”‚
    โ”‚ (per user)  โ”‚   โ”‚    user)      โ”‚
    โ”‚  - docs     โ”‚   โ”‚  - tables     โ”‚
    โ”‚  - memories โ”‚   โ”‚  - sessions   โ”‚
    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
           โ”‚
    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
    โ”‚   LangSmith    โ”‚
    โ”‚  (Automatic    โ”‚
    โ”‚   Tracing)     โ”‚
    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Workflow Paths

  1. RAG Query Path

    intent (needs_rag) โ†’ retrieve โ†’ reason โ†’ critic โ†’ memory_write โ†’ END
    
  2. SQL/Data Analysis Path

    intent (needs_sql) โ†’ sql_execute โ†’ sql_reason โ†’ [visualize] โ†’ critic โ†’ memory_write โ†’ END
    
  3. General Chat Path

    intent โ†’ general_reason โ†’ critic โ†’ memory_write โ†’ END
    

See GRAPH_ARCHITECTURE.md for detailed workflow documentation or view lumiere_graph.png for visual representation.
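The branching between these paths can be sketched as a plain routing function of the kind LangGraph conditional edges use (a simplification: Lumiere's real intent node classifies with an LLM):

```python
def route_from_intent(state: dict) -> str:
    # Returns the next node name for the three paths above.
    intent = state.get("intent", "general")
    if intent == "needs_rag":
        return "retrieve"        # path 1: retrieve -> reason -> critic
    if intent == "needs_sql":
        return "sql_execute"     # path 2: sql_execute -> sql_reason -> [visualize]
    return "general_reason"      # path 3: general chat
```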


๐Ÿš€ Quick Start

Prerequisites

  • Python 3.11+
  • Qdrant (running locally or cloud)
  • OpenAI API key
  • LangSmith account (optional, for observability)

Installation

  1. Clone the repository

    git clone https://github.com/kikomatchi/lumiere.git
    cd lumiere
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Set up environment variables

    Create a .env file in the project root:

    # OpenAI API
    OPENAI_API_KEY=your_openai_api_key_here
    
    # Qdrant Configuration (Cloud or Local)
    QDRANT_URL=https://your-cluster.qdrant.io  # Or http://localhost:6333
    QDRANT_API_KEY=your_qdrant_api_key  # Required for Qdrant Cloud
    
    # LangSmith Observability (Optional)
    LANGCHAIN_TRACING_V2=true
    LANGCHAIN_API_KEY=your_langsmith_api_key
    LANGCHAIN_PROJECT=Lumiere
    LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
  5. Start Qdrant (if running locally, skip if using Qdrant Cloud)

    docker run -p 6333:6333 -p 6334:6334 \
        -v $(pwd)/qdrant_storage:/qdrant/storage:z \
        qdrant/qdrant
  6. User collections auto-created

    • No manual initialization needed!
    • Collections created automatically on first upload/query per user
    • Format: user_{user_id}_documents, user_{user_id}_memories
  7. (Optional) Initialize the semantic memory system

    python scripts/init_semantic_memory.py
  8. Run the application

    streamlit run app.py
  9. Open your browser

    Navigate to http://localhost:8501


๐Ÿ“– Usage Guide

1. Ingesting Documents

Via Streamlit UI:

  1. Click "๐Ÿ“„ Document Ingestion" in sidebar
  2. Upload PDF, TXT, or MD files
  3. Click "Ingest Documents"
  4. Wait for confirmation

Via Script:

python -c "from rag.ingest import ingest_directory; ingest_directory('path/to/docs')"

2. Asking Questions

RAG Queries (Document-based)

"What is FFXIV?"
"Explain vector databases"
"How does semantic search work?"

Data Analysis Queries

"Show me the top 5 products by sales"
"How many hybrid cars are in the database?"
"What is the average price by manufacturer?"

General Chat

"Hello, how are you?"
"Can you help me with my project?"
"What can you do?"

3. Viewing Semantic Memory

In Streamlit:

  1. Expand "๐Ÿง  Semantic Memory" in sidebar
  2. View total memories and types
  3. Search memories by keyword
  4. See relevance scores and timestamps

Via Python:

from memory.semantic_memory import get_memory_stats, retrieve_memories

# Get statistics
stats = get_memory_stats()
print(stats)

# Search memories
memories = retrieve_memories(
    query="database queries",
    top_k=5,
    user_id="user123",
    min_score=0.7
)

4. Switching Modes

Use the sidebar to select:

  • All In: All features enabled (default)
  • Chat + RAG: Document Q&A only
  • Data Analyst: SQL queries + visualizations

๐Ÿ—‚๏ธ Project Structure

Lumiere/
โ”œโ”€โ”€ agents/                    # Agent implementations
โ”‚   โ”œโ”€โ”€ intent_agent.py       # Intent classification + memory retrieval
โ”‚   โ”œโ”€โ”€ reasoning_agent.py    # RAG reasoning
โ”‚   โ”œโ”€โ”€ sql_agent.py          # SQL generation & execution
โ”‚   โ”œโ”€โ”€ critic_agent.py       # Quality validation
โ”‚   โ””โ”€โ”€ viz_agent.py          # Visualization generation
โ”‚
โ”œโ”€โ”€ graph/                     # LangGraph workflow
โ”‚   โ”œโ”€โ”€ rag_graph.py          # Main graph definition
โ”‚   โ””โ”€โ”€ state.py              # State management
โ”‚
โ”œโ”€โ”€ memory/                    # Semantic memory system
โ”‚   โ””โ”€โ”€ semantic_memory.py    # Vector-based memory storage/retrieval
โ”‚
โ”œโ”€โ”€ rag/                       # RAG components
โ”‚   โ”œโ”€โ”€ chunking.py           # Document chunking strategies
โ”‚   โ”œโ”€โ”€ collections.py        # Qdrant collection management
โ”‚   โ”œโ”€โ”€ embeddings.py         # OpenAI embeddings wrapper
โ”‚   โ”œโ”€โ”€ ingest.py             # Document ingestion pipeline
โ”‚   โ”œโ”€โ”€ qdrant_client.py      # Qdrant client singleton
โ”‚   โ””โ”€โ”€ retriever.py          # Semantic search & filtering
โ”‚
โ”œโ”€โ”€ database/                  # Data storage
โ”‚   โ””โ”€โ”€ sqlite_client.py      # SQLite connection & queries
โ”‚
โ”œโ”€โ”€ config/                    # Configuration
โ”‚   โ””โ”€โ”€ settings.py           # Environment & settings
โ”‚
โ”œโ”€โ”€ scripts/                   # Utility scripts
โ”‚   โ”œโ”€โ”€ init_semantic_memory.py   # Initialize memory system
โ”‚   โ”œโ”€โ”€ ingest_test.py            # Test document ingestion
โ”‚   โ””โ”€โ”€ retrieval_test.py         # Test retrieval
โ”‚
โ”œโ”€โ”€ ui/                        # Streamlit components
โ”‚   โ””โ”€โ”€ (UI modules)
โ”‚
โ”œโ”€โ”€ app.py                     # Main Streamlit application
โ”œโ”€โ”€ requirements.txt           # Python dependencies
โ”œโ”€โ”€ graph_visualization.mmd    # Mermaid diagram
โ”œโ”€โ”€ graph_visualization.png    # Architecture diagram
โ”œโ”€โ”€ GRAPH_ARCHITECTURE.md      # Detailed architecture docs
โ”œโ”€โ”€ SEMANTIC_MEMORY.md         # Memory system documentation
โ””โ”€โ”€ README.md                  # This file

๐Ÿง  Semantic Memory System

How It Works

  1. Storage: Every accepted conversation is embedded and stored in Qdrant

    • Uses OpenAI text-embedding-3-small (1536 dimensions)
    • Includes query, response, mode, and metadata
    • Quality-filtered by critic agent (only ACCEPT decisions)
  2. Retrieval: Intent agent retrieves relevant memories before processing

    • Top-k semantic search with cosine similarity
    • Configurable threshold (default: 0.75)
    • Formatted context injected into agent prompts
  3. Benefits:

    • Personalization: Remembers user preferences
    • Context: Understands conversation history
    • Learning: Improves responses over time
    • Continuity: Works across sessions
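Conceptually, the thresholded top-k retrieval in step 2 reduces to cosine similarity plus a cutoff. Qdrant does this server-side; the pure-Python version below exists only to make the mechanics concrete:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve_memories_local(query_vec, memories, top_k=3, min_score=0.75):
    # memories: list of (vector, payload) pairs.
    # Keep only matches above the threshold, best first.
    scored = [(cosine(query_vec, v), p) for v, p in memories]
    scored = [sp for sp in scored if sp[0] >= min_score]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return scored[:top_k]
```

The `min_score=0.75` default mirrors the configurable threshold mentioned above; raising it trades recall for precision.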

Memory Types

  • conversation: General Q&A interactions
  • preference: User preferences (e.g., "I prefer bar charts")
  • fact: User-declared facts (e.g., "I'm working on X project")
  • pattern: Common query patterns
  • error_resolution: Problem-solving history

Example

First interaction:

User: "Show me sales data as a bar chart"
Assistant: [Generates bar chart]
๐Ÿ’พ Stores: User prefers bar charts for sales data

Later interaction:

User: "Show me revenue trends"
Assistant: [Retrieves memory about chart preference]
           [Automatically generates bar chart]

See SEMANTIC_MEMORY.md for detailed documentation.


๐Ÿ“Š Data Analyst Mode

Features

  • Natural language to SQL: Generate queries from plain English
  • Automated visualizations: Smart chart type selection
  • Interactive charts: Plotly-based visualizations
  • Result interpretation: Natural language summaries

Supported Chart Types

  • Bar Chart: Comparisons, rankings
  • Line Chart: Trends over time
  • Pie Chart: Proportions, distributions
  • Scatter Plot: Correlations, relationships
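A toy heuristic in the spirit of the smart chart selection above. Lumiere's actual viz agent decides with an LLM, and the column/dtype representation here is an assumption, so treat this as illustrative only:

```python
def pick_chart_type(columns: list, dtypes: list, query: str) -> str:
    # Map a SQL result's shape and the query's wording to a chart type.
    q = query.lower()
    if "over time" in q or "trend" in q or "date" in dtypes:
        return "line"     # trends over time
    if "distribution" in q or "proportion" in q or "share" in q:
        return "pie"      # proportions
    if len(columns) == 2 and dtypes.count("number") == 2:
        return "scatter"  # two numeric columns suggest a correlation
    return "bar"          # default: comparisons and rankings
```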

Example Queries

"Show me sales by region"
โ†’ SQL: SELECT region, SUM(sales) FROM sales GROUP BY region
โ†’ Chart: Bar chart with regions on x-axis

"How have prices changed over time?"
โ†’ SQL: SELECT date, AVG(price) FROM products GROUP BY date
โ†’ Chart: Line chart showing price trends

"What's the distribution of car types?"
โ†’ SQL: SELECT type, COUNT(*) FROM cars GROUP BY type
โ†’ Chart: Pie chart showing proportions

๐Ÿ” Advanced RAG Features

Chunking Strategies

  • Semantic chunking: Split by meaning, not just length
  • Overlap: Maintains context between chunks
  • Metadata preservation: Source, page numbers, timestamps
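The baseline behind these strategies is a fixed-size window with overlap, using the CHUNK_SIZE / CHUNK_OVERLAP values shown later under Configuration; Lumiere's semantic chunking splits on meaning boundaries instead, so this sketch shows only the simple case:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    # Slide a fixed-size window across the text; consecutive chunks
    # share `overlap` characters so context survives the split.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```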

Retrieval Options

  • Hybrid search: Combines semantic + keyword search
  • Metadata filtering: Filter by source, date, type
  • Reranking: Re-scores results for relevance
  • Source attribution: Shows where answers come from

Document Support

  • PDF: Automatic text extraction
  • TXT: Plain text files
  • Markdown: Preserves formatting
  • Batch ingestion: Process entire directories

๐ŸŽ›๏ธ Configuration

Key Settings (config/settings.py)

# Model Configuration
OPENAI_MODEL = "gpt-4o-mini"
EMBEDDING_MODEL = "text-embedding-3-small"
EMBEDDING_DIMENSIONS = 1536

# Retrieval Settings
TOP_K_RETRIEVAL = 3
MIN_SIMILARITY_SCORE = 0.7

# Memory Settings
MEMORY_TOP_K = 3
MEMORY_MIN_SCORE = 0.75

# Chunking
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 200

Environment Variables

See .env.example for all available configuration options.


๐Ÿ› Troubleshooting

Common Issues

1. Qdrant Connection Error

Error: Cannot connect to Qdrant

Solution: Ensure Qdrant is reachable at the address in QDRANT_URL (e.g. localhost:6333 for a local instance)

docker ps | grep qdrant  # Check if running

2. OpenAI API Error

Error: Invalid API key

Solution: Check .env file has correct OPENAI_API_KEY

3. No Memories Stored

Memory count does not increase after conversations

Solution:

  • Check critic is accepting answers (look for โœ… in terminal)
  • Ensure Qdrant collection exists
  • Verify semantic memory is enabled

4. Import Errors

ModuleNotFoundError: No module named 'X'

Solution: Reinstall dependencies

pip install -r requirements.txt

Debug Mode

Enable detailed logging:

# In config/settings.py
DEBUG_MODE = True

Look for these debug indicators in terminal:

  • ๐Ÿ’พ Memory Write Node
  • โœ… Stored semantic memory
  • โญ๏ธ Skipping memory storage
  • ๐Ÿ“ฆ Retrieval node
  • ๐Ÿ” Query analysis

๐Ÿ“ˆ Observability

LangSmith Integration

Lumiere integrates with LangSmith for comprehensive observability:

  1. Traces: Full request lifecycle tracking
  2. Token usage: Cost monitoring per operation
  3. Latency: Performance metrics
  4. Agent behavior: Decision tracking

Setup:

  1. Create an account at smith.langchain.com
  2. Add the LANGCHAIN_* keys to .env (see Installation)
  3. View traces in the LangSmith dashboard

Memory Statistics

View memory stats in terminal:

python -c "from memory.semantic_memory import get_memory_stats; import json; print(json.dumps(get_memory_stats(), indent=2))"

Example output:

{
  "total_memories": 15,
  "vector_size": 1536,
  "memory_types": {
    "conversation": 10,
    "preference": 3,
    "fact": 1,
    "pattern": 1
  }
}

๐Ÿค Contributing

We welcome contributions! Please see our contributing guidelines.

Development Setup

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

Code Style

  • Follow PEP 8
  • Use type hints
  • Add docstrings to functions
  • Keep functions focused and small

๐Ÿ“ Documentation

Detailed documentation lives alongside the code:

  • GRAPH_ARCHITECTURE.md: workflow and graph architecture
  • SEMANTIC_MEMORY.md: semantic memory system

๐Ÿงช Testing

Comprehensive test suite with 35 tests covering core functionality:

# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Run specific test file
pytest tests/test_semantic_memory.py

Test Coverage:

  • โœ… Semantic Memory (9 tests)
  • โœ… Intent Agent (6 tests)
  • โœ… Graph Workflow (10 tests)
  • โœ… RAG Components (10 tests)

See tests/README.md for complete testing guide and TEST_SETUP_SUMMARY.md for current status.


๐Ÿ—บ๏ธ Roadmap

Current Features โœ…

  • Multi-agent RAG system
  • Semantic memory integration
  • SQL data analysis
  • Automated visualizations
  • Critic-based quality control
  • LangSmith observability

Coming Soon ๐Ÿšง

  • Persistent multi-user accounts (beyond per-session user IDs)
  • Memory pruning and consolidation
  • Advanced query routing
  • Custom embedding models
  • API endpoints (REST/GraphQL)
  • Memory analytics dashboard
  • Feedback loop for memory refinement

Future Vision ๐Ÿ”ฎ

  • Multi-modal support (images, audio)
  • Agent collaboration framework
  • Distributed memory architecture
  • Real-time streaming responses
  • Plugin system for extensibility

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


๐Ÿ™ Acknowledgments

Built with:

  • LangChain & LangGraph: agent orchestration
  • Qdrant: vector storage
  • OpenAI: language models and embeddings
  • Streamlit: web UI
  • Plotly: interactive visualizations
  • LangSmith: observability

๐Ÿ“ง Contact

For questions, issues, or feedback:

  • Open an issue on GitHub
  • Check existing documentation
  • Review troubleshooting section

โญ Star History

If you find Lumiere useful, please consider giving it a star! โญ


Made with โค๏ธ for the AI community
