Multi-LLM routing, vector store, and agent framework #35

nshkrdotcom · 2025-12-21T10:36:22Z

Summary

This PR adds major new capabilities to the RAG library:

Multi-LLM Provider Routing: Support for Gemini, Claude, and Codex with configurable routing strategies (fallback, round-robin, specialist)
Vector Store: pgvector-based semantic search with hybrid search (RRF) combining vector similarity and full-text search
Agent Framework: Tool registry, session management, and built-in tools for code analysis and file operations
Embedding Service: GenServer-based embedding generation with batching support
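
As a rough illustration of the three routing strategies named above, here is a minimal language-agnostic sketch in Python. It is not the Rag.Router API (the library itself is Elixir). The `Provider` type, the function names, and the stand-in providers are all hypothetical, and health tracking is left out.

```python
from itertools import cycle
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical provider type: takes a prompt, returns text, raises on failure.
Provider = Callable[[str], str]


def fallback(providers: Sequence[Tuple[str, Provider]], prompt: str) -> Tuple[str, str]:
    """Try providers in order; return (name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


class RoundRobin:
    """Rotate through providers so consecutive requests hit different backends."""

    def __init__(self, providers: Sequence[Tuple[str, Provider]]):
        self._cycle = cycle(providers)

    def route(self, prompt: str) -> Tuple[str, str]:
        name, call = next(self._cycle)
        return name, call(prompt)


def specialist(providers: Dict[str, Provider], strengths: Dict[str, str],
               task: str, prompt: str) -> Tuple[str, str]:
    """Pick the provider registered as the preferred one for this task type."""
    name = strengths[task]
    return name, providers[name](prompt)


# Stand-in providers used only for this demonstration.
def flaky(prompt: str) -> str:
    raise TimeoutError("unavailable")


def echo_a(prompt: str) -> str:
    return f"A:{prompt}"


def echo_b(prompt: str) -> str:
    return f"B:{prompt}"


fallback_result = fallback([("flaky", flaky), ("a", echo_a)], "hi")
rr = RoundRobin([("a", echo_a), ("b", echo_b)])
rr_names = [rr.route("hi")[0] for _ in range(3)]
specialist_result = specialist({"a": echo_a, "b": echo_b}, {"code": "b"}, "code", "x")
```

In this sketch, fallback skips the failing provider, round-robin alternates between the two backends, and specialist looks up the provider by task type.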

Changes

New Modules

Rag.Router - Multi-provider routing with strategies
Rag.Ai.{Gemini,Claude,Codex} - LLM provider implementations
Rag.Ai.Capabilities - Provider capability registry
Rag.VectorStore - Semantic/hybrid search operations
Rag.Embedding.Service - Embedding generation service
Rag.Agent.{Agent,Registry,Session,Tool} - Agent framework
Built-in tools: AnalyzeCode, ReadFile, SearchRepos, GetRepoContext
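
To make the registry idea concrete, the following Python sketch shows one possible shape for tool registration and execution dispatch. It is an assumption-laden illustration, not the Rag.Agent.Registry interface: `ToolRegistry`, `register`, `execute`, and the `read_lines` tool (a stand-in loosely modelled on line-range extraction) are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToolRegistry:
    """Maps tool names to (description, handler) pairs and dispatches calls."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[str, Callable[..., Any]]] = {}

    def register(self, name: str, description: str, handler: Callable[..., Any]) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = (description, handler)

    def describe(self) -> Dict[str, str]:
        """Name-to-description map, e.g. for inclusion in an LLM prompt."""
        return {name: desc for name, (desc, _) in sorted(self._tools.items())}

    def execute(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        """Run a tool; failures come back as data so an agent loop can continue."""
        if name not in self._tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        _, handler = self._tools[name]
        try:
            return {"ok": True, "result": handler(**args)}
        except Exception as exc:
            return {"ok": False, "error": str(exc)}


def read_lines(text: str, first: int, last: int) -> List[str]:
    """Hypothetical tool: return lines first..last (1-based, inclusive)."""
    return text.splitlines()[first - 1:last]


registry = ToolRegistry()
registry.register("read_lines", "Return a 1-based inclusive line range of a text", read_lines)

ok_result = registry.execute("read_lines", {"text": "a\nb\nc", "first": 2, "last": 3})
missing_result = registry.execute("missing", {})
```

Returning errors as values instead of raising is one design option; it lets the calling agent feed the failure back to the model instead of aborting the run.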

Examples

examples/basic_chat.exs - Simple LLM interaction
examples/routing_strategies.exs - Multi-provider routing demo
examples/agent.exs - Agent framework with tools
examples/vector_store.exs - Semantic search demo
examples/rag_demo/ - Full Phoenix app demo with database

Other

Updated dependencies (Mimic 2.2, added pgvector)
Comprehensive test coverage (368 tests)
Removed legacy mix tasks (gen_eval, gen_rag_module, gen_servings, install)

Test plan

All 368 tests pass
Examples run successfully, each invoked individually (for example, mix run examples/basic_chat.exs)
Demo app works end-to-end with database

nshkrdotcom · 2025-12-21T21:52:47Z

Update: Additional Features Added

The latest commit adds significant new capabilities:

Modular RAG Architecture

Retriever Behaviour: Pluggable retrieval with Semantic, FullText, Hybrid, and Graph implementations
Reranker Behaviour: LLM-based and passthrough reranking strategies
Chunking Module: Multiple strategies (character, sentence, paragraph, recursive, semantic)
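
For readers unfamiliar with the chunking strategies listed above, here is a small Python sketch of three of them (character, sentence, paragraph). The function names and the regular expressions are assumptions made for illustration; the Chunking module's actual splitting rules are not described in this PR text.

```python
import re
from typing import List


def chunk_by_characters(text: str, size: int, overlap: int = 0) -> List[str]:
    """Fixed-size windows; each window repeats the last `overlap` characters
    of the previous one so context is preserved across chunk boundaries."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    chunks = []
    start = 0
    step = size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


def chunk_by_sentences(text: str) -> List[str]:
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_by_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty paragraphs."""
    parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


char_chunks = chunk_by_characters("abcdefghij", size=4, overlap=2)
sentence_chunks = chunk_by_sentences("Hello world. How are you? Fine!")
paragraph_chunks = chunk_by_paragraphs("First para.\n\nSecond para.\n\n\nThird.")
```

Recursive and semantic chunking are omitted here because they depend on choices (separator hierarchy, embedding similarity thresholds) that this thread does not specify.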

GraphRAG

Entity/Relationship Extraction: LLM-powered extraction with entity resolution
Graph Store: PostgreSQL-based storage with pgvector for entity embeddings
Community Detection: Label propagation algorithm with hierarchical clustering
Graph Retriever: Local, global, and hybrid search modes
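
The community detection step can be pictured with a minimal label propagation sketch in Python. This is a generic textbook variant written for illustration, not the library's implementation: nodes are visited in sorted order and ties are broken by taking the greatest label, both purely to keep the example deterministic. Hierarchical clustering is not shown.

```python
from collections import Counter
from typing import Dict, List


def label_propagation(graph: Dict[str, List[str]], max_iterations: int = 20) -> Dict[str, str]:
    """Each node repeatedly adopts the most common label among its neighbours
    until no label changes (or the iteration limit is reached)."""
    labels = {node: node for node in graph}  # every node starts in its own community
    for _ in range(max_iterations):
        changed = False
        for node in sorted(graph):  # fixed visiting order (assumption for determinism)
            neighbours = graph[node]
            if not neighbours:
                continue
            counts = Counter(labels[n] for n in neighbours)
            top = max(counts.values())
            # Deterministic tie-break: greatest label among the most frequent ones.
            best = max(label for label, count in counts.items() if count == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Two triangles (a-b-c and d-e-f) joined by the single edge c-d.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}
communities = label_propagation(graph)
```

On this graph the sketch settles on two communities, one per triangle. The result is sensitive to visiting order and tie-breaking: a different rule can merge both triangles into one community, which is why practical implementations usually randomize both.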

Pipeline System

Pipeline Executor: Multi-step workflow orchestration
Context Management: State passing between pipeline steps
Error Handling: Configurable retry and fallback strategies
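
The three pipeline pieces above (step execution, context passing, retry and fallback) fit together roughly as in the Python sketch below. `Step`, `run_pipeline`, and the demo steps are hypothetical names chosen for this illustration and do not reflect the library's pipeline interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Context = Dict[str, object]  # state handed from one step to the next


@dataclass
class Step:
    name: str
    run: Callable[[Context], Context]        # returns keys to merge into the context
    retries: int = 0                          # extra attempts after the first failure
    fallback: Optional[Callable[[Context], Context]] = None


def run_pipeline(steps: List[Step], context: Context) -> Context:
    context = dict(context)  # do not mutate the caller's dictionary
    for step in steps:
        last_error: Optional[Exception] = None
        for _attempt in range(step.retries + 1):
            try:
                context = {**context, **step.run(context)}
                last_error = None
                break
            except Exception as exc:
                last_error = exc
        if last_error is not None:
            if step.fallback is None:
                raise RuntimeError(f"step {step.name!r} failed") from last_error
            context = {**context, **step.fallback(context)}
    return context


# Demo steps: retrieval fails once and then succeeds; reranking always fails
# and falls back to passing the documents through unchanged.
attempts = {"retrieve": 0}


def retrieve(ctx: Context) -> Context:
    attempts["retrieve"] += 1
    if attempts["retrieve"] == 1:
        raise ConnectionError("transient failure")
    return {"documents": ["doc-2", "doc-1"]}


def rerank(ctx: Context) -> Context:
    raise TimeoutError("reranker unavailable")


def passthrough(ctx: Context) -> Context:
    return {"ranked": ctx["documents"]}


result = run_pipeline(
    [Step("retrieve", retrieve, retries=1), Step("rerank", rerank, fallback=passthrough)],
    {"query": "q"},
)
```

Each step only returns the keys it produces; merging them into a fresh dictionary keeps earlier state available to later steps without shared mutation.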

Examples

basic_rag.exs - Complete RAG workflow with database
hybrid_search.exs - Semantic + full-text with RRF fusion
graph_rag.exs - Entity extraction and graph-based retrieval
chunking_strategies.exs - All chunking strategies demonstrated
pipeline_example.exs - Multi-step pipeline orchestration

Test Coverage

567 tests passing (up from 368)

This release represents a major architectural evolution of the RAG library, transforming it from a single-provider system into a comprehensive multi-LLM orchestration platform with agentic capabilities.

MULTI-LLM PROVIDER SUPPORT

Add three new provider implementations under Rag.Ai namespace:
- Gemini: Full support for text generation and embeddings via gemini_ex
- Claude: Text generation via claude_agent_sdk with agentic workflows
- Codex: OpenAI-compatible API via codex_sdk with structured output

Each provider implements the Rag.Ai.Provider behaviour and exposes capability metadata through Rag.Ai.Capabilities for runtime introspection.

SMART ROUTER

Introduce Rag.Router module with pluggable routing strategies:
- Fallback: Sequential provider attempts with failure tracking and decay
- RoundRobin: Load distribution with optional weights and recovery
- Specialist: Task-based routing using provider strengths

The router automatically selects strategies based on available providers and handles retries transparently. Provider health is tracked per-strategy with configurable failure thresholds and cooldown periods.

VECTOR STORE WITH PGVECTOR

Add Rag.VectorStore module providing:
- Semantic search using L2 distance with pgvector
- Full-text search using PostgreSQL tsvector
- Hybrid search combining both approaches via Reciprocal Rank Fusion
- Text chunking with configurable overlap for context preservation
- Rag.VectorStore.Chunk Ecto schema for document storage

The vector store integrates with any Ecto repo provided by the consuming application, following library-first design principles.

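
For reference, Reciprocal Rank Fusion as used for hybrid search can be sketched in a few lines of Python. This is the generic formulation (each document scores the sum of 1 / (k + rank) over the lists that contain it), not code taken from Rag.VectorStore. The constant k = 60 is a commonly used default and is an assumption here, as are the function name and the document ids.

```python
from typing import Dict, List, Sequence, Tuple


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document ids into one scored ranking.

    A document's score is the sum of 1 / (k + rank) over every list in which
    it appears, with ranks starting at 1.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic_hits = ["d1", "d2", "d3"]   # hypothetical vector-similarity ranking
fulltext_hits = ["d3", "d1", "d4"]   # hypothetical full-text ranking

fused = reciprocal_rank_fusion([semantic_hits, fulltext_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because only ranks are used, the two result lists do not need comparable score scales, which is the usual reason for choosing RRF over a weighted sum of raw scores.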
EMBEDDING SERVICE

Add Rag.Embedding.Service GenServer for managed embedding operations:
- Automatic batching with configurable batch sizes
- Statistics tracking for monitoring
- Direct chunk embedding with prepare_for_insert helper
- Provider abstraction defaulting to Gemini

AGENT FRAMEWORK

Introduce complete agent system under Rag.Agent namespace:
- Agent: Core orchestrator with tool calling and iteration limits
- Session: Conversation memory with context management
- Registry: Tool registration and execution dispatch
- Tool: Behaviour definition for custom tool implementation

Built-in tools included:
- SearchRepos: Semantic search over indexed repositories
- ReadFile: File content retrieval with line range extraction
- GetRepoContext: Repository structure and metadata access
- AnalyzeCode: Elixir AST parsing for code structure analysis

EXAMPLES AND DOCUMENTATION

Add comprehensive examples directory with runnable scripts:
- basic_chat.exs: Simple LLM interaction patterns
- routing_strategies.exs: Multi-provider routing demonstration
- agent.exs: Tool registry and agent framework usage
- vector_store.exs: In-memory semantic search example

Include complete demo application in examples/rag_demo showing:
- Phoenix integration patterns
- Database setup with pgvector migrations
- End-to-end RAG pipeline implementation

BREAKING CHANGES

Remove igniter-based Mix tasks:
- rag.install
- rag.gen_rag_module
- rag.gen_servings
- rag.gen_eval

These tasks assumed a standalone application context. The library now follows a dependency-first design where consuming applications provide their own Ecto repo and run migrations as needed.

DEPENDENCY CHANGES

Added:
- gemini_ex for Gemini provider
- codex_sdk for OpenAI-compatible provider (optional)
- claude_agent_sdk for Claude provider (optional)
- pgvector, ecto_sql, postgrex for vector store

Temporarily disabled:
- torus: Elixir 1.18 compatibility issue with inflex
- igniter: Removed along with Mix tasks

TEST INFRASTRUCTURE

Add comprehensive test coverage for all new modules:
- Provider tests with integration tags for API testing
- Router strategy tests with state verification
- Agent and tool tests with Mimic for provider mocking
- Vector store tests for query building and RRF scoring

Configure test helper to exclude integration tests by default and skip provider-specific tests when credentials are unavailable.

VISUAL IDENTITY

Add project logo (assets/rag.svg) featuring hexagonal design with knowledge graph visualization representing the retrieval-augmented generation concept.

Introduce a complete set of documentation guides to cover the extensive functionality of the library. These guides provide detailed explanations, code examples, and architecture overviews for features ranging from basic setup to advanced concepts like GraphRAG and Agents.

New guides added:
- Getting Started: Installation, configuration, and basic usage patterns.
- LLM Providers: Setup for Gemini, Claude, Codex, and other backends.
- Smart Router: Strategies for multi-provider routing and failover.
- Vector Store: Pgvector integration, schema setup, and document management.
- Embeddings: Managed embedding service with automatic batching.
- Chunking: Strategies for text splitting including recursive and semantic.
- Retrievers: Semantic, full-text, hybrid, and graph retrieval methods.
- Rerankers: Implementation of LLM-based result reranking.
- Pipelines: Construction of composable, parallel RAG workflows.
- GraphRAG: Knowledge graph extraction, community detection, and retrieval.
- Agent Framework: Building tool-using agents with memory and registry.

Configuration updates:
- Update mix.exs to include the guides directory in the package files.
- Configure ex_doc extras to render the new markdown files in the docs.
- Organize module documentation into logical groups (Core, Providers, etc.).
- Update README.md to provide a clear index table of available guides.

nshkrdotcom force-pushed the feature/multi-llm-routing-agent-framework branch from 7d4a054 to 51b1057 Compare December 21, 2025 21:51

nshkrdotcom added 3 commits December 21, 2025 11:56

build: switch SDK dependencies from path to hex

c54e424

Replace local path-based SDK dependencies with versioned Hex packages for gemini_ex, codex_sdk, and claude_agent_sdk. This removes the requirement for a specific local directory structure during builds.

Release v0.3.0: Modular RAG Architecture, GraphRAG, and Pipelines

8ea974a

nshkrdotcom force-pushed the feature/multi-llm-routing-agent-framework branch from 51b1057 to 8ea974a Compare December 21, 2025 21:57

nshkrdotcom closed this Dec 22, 2025
