Skip to content

Conversation

@nshkrdotcom
Copy link

@nshkrdotcom nshkrdotcom commented Dec 21, 2025

Summary

This PR adds major new capabilities to the RAG library:

  • Multi-LLM Provider Routing: Support for Gemini, Claude, and Codex with configurable routing strategies (fallback, round-robin, specialist)
  • Vector Store: pgvector-based semantic search with hybrid search (RRF) combining vector similarity and full-text search
  • Agent Framework: Tool registry, session management, and built-in tools for code analysis and file operations
  • Embedding Service: GenServer-based embedding generation with batching support

Changes

New Modules

  • Rag.Router - Multi-provider routing with strategies
  • Rag.Ai.{Gemini,Claude,Codex} - LLM provider implementations
  • Rag.Ai.Capabilities - Provider capability registry
  • Rag.VectorStore - Semantic/hybrid search operations
  • Rag.Embedding.Service - Embedding generation service
  • Rag.Agent.{Agent,Registry,Session,Tool} - Agent framework
  • Built-in tools: AnalyzeCode, ReadFile, SearchRepos, GetRepoContext

Examples

  • examples/basic_chat.exs - Simple LLM interaction
  • examples/routing_strategies.exs - Multi-provider routing demo
  • examples/agent.exs - Agent framework with tools
  • examples/vector_store.exs - Semantic search demo
  • examples/rag_demo/ - Full Phoenix app demo with database

Other

  • Updated dependencies (Mimic 2.2, added pgvector)
  • Comprehensive test coverage (368 tests)
  • Removed legacy mix tasks (gen_eval, gen_rag_module, gen_servings, install)

Test plan

  • All 368 tests pass
  • Examples run successfully with mix run examples/*.exs
  • Demo app works end-to-end with database

@nshkrdotcom nshkrdotcom force-pushed the feature/multi-llm-routing-agent-framework branch from 7d4a054 to 51b1057 Compare December 21, 2025 21:51
@nshkrdotcom
Copy link
Author

Update: Additional Features Added

The latest commit adds significant new capabilities:

Modular RAG Architecture

  • Retriever Behaviour: Pluggable retrieval with Semantic, FullText, Hybrid, and Graph implementations
  • Reranker Behaviour: LLM-based and passthrough reranking strategies
  • Chunking Module: Multiple strategies (character, sentence, paragraph, recursive, semantic)

GraphRAG

  • Entity/Relationship Extraction: LLM-powered extraction with entity resolution
  • Graph Store: PostgreSQL-based storage with pgvector for entity embeddings
  • Community Detection: Label propagation algorithm with hierarchical clustering
  • Graph Retriever: Local, global, and hybrid search modes

Pipeline System

  • Pipeline Executor: Multi-step workflow orchestration
  • Context Management: State passing between pipeline steps
  • Error Handling: Configurable retry and fallback strategies

Examples

  • basic_rag.exs - Complete RAG workflow with database
  • hybrid_search.exs - Semantic + full-text with RRF fusion
  • graph_rag.exs - Entity extraction and graph-based retrieval
  • chunking_strategies.exs - All chunking strategies demonstrated
  • pipeline_example.exs - Multi-step pipeline orchestration

Test Coverage

  • 567 tests passing (up from 368)

This release represents a major architectural evolution of the RAG library,
transforming it from a single-provider system into a comprehensive multi-LLM
orchestration platform with agentic capabilities.

MULTI-LLM PROVIDER SUPPORT

Add three new provider implementations under Rag.Ai namespace:
- Gemini: Full support for text generation and embeddings via gemini_ex
- Claude: Text generation via claude_agent_sdk with agentic workflows
- Codex: OpenAI-compatible API via codex_sdk with structured output

Each provider implements the Rag.Ai.Provider behaviour and exposes capability
metadata through Rag.Ai.Capabilities for runtime introspection.

SMART ROUTER

Introduce Rag.Router module with pluggable routing strategies:
- Fallback: Sequential provider attempts with failure tracking and decay
- RoundRobin: Load distribution with optional weights and recovery
- Specialist: Task-based routing using provider strengths

The router automatically selects strategies based on available providers and
handles retries transparently. Provider health is tracked per-strategy with
configurable failure thresholds and cooldown periods.

VECTOR STORE WITH PGVECTOR

Add Rag.VectorStore module providing:
- Semantic search using L2 distance with pgvector
- Full-text search using PostgreSQL tsvector
- Hybrid search combining both approaches via Reciprocal Rank Fusion
- Text chunking with configurable overlap for context preservation
- Rag.VectorStore.Chunk Ecto schema for document storage

The vector store integrates with any Ecto repo provided by the consuming
application, following library-first design principles.

EMBEDDING SERVICE

Add Rag.Embedding.Service GenServer for managed embedding operations:
- Automatic batching with configurable batch sizes
- Statistics tracking for monitoring
- Direct chunk embedding with prepare_for_insert helper
- Provider abstraction defaulting to Gemini

AGENT FRAMEWORK

Introduce complete agent system under Rag.Agent namespace:
- Agent: Core orchestrator with tool calling and iteration limits
- Session: Conversation memory with context management
- Registry: Tool registration and execution dispatch
- Tool: Behaviour definition for custom tool implementation

Built-in tools included:
- SearchRepos: Semantic search over indexed repositories
- ReadFile: File content retrieval with line range extraction
- GetRepoContext: Repository structure and metadata access
- AnalyzeCode: Elixir AST parsing for code structure analysis

EXAMPLES AND DOCUMENTATION

Add comprehensive examples directory with runnable scripts:
- basic_chat.exs: Simple LLM interaction patterns
- routing_strategies.exs: Multi-provider routing demonstration
- agent.exs: Tool registry and agent framework usage
- vector_store.exs: In-memory semantic search example

Include complete demo application in examples/rag_demo showing:
- Phoenix integration patterns
- Database setup with pgvector migrations
- End-to-end RAG pipeline implementation

BREAKING CHANGES

Remove igniter-based Mix tasks:
- rag.install
- rag.gen_rag_module
- rag.gen_servings
- rag.gen_eval

These tasks assumed a standalone application context. The library now follows
a dependency-first design where consuming applications provide their own Ecto
repo and run migrations as needed.

DEPENDENCY CHANGES

Added:
- gemini_ex for Gemini provider
- codex_sdk for OpenAI-compatible provider (optional)
- claude_agent_sdk for Claude provider (optional)
- pgvector, ecto_sql, postgrex for vector store

Temporarily disabled:
- torus: Elixir 1.18 compatibility issue with inflex
- igniter: Removed along with Mix tasks

TEST INFRASTRUCTURE

Add comprehensive test coverage for all new modules:
- Provider tests with integration tags for API testing
- Router strategy tests with state verification
- Agent and tool tests with Mimic for provider mocking
- Vector store tests for query building and RRF scoring

Configure test helper to exclude integration tests by default and skip
provider-specific tests when credentials are unavailable.

VISUAL IDENTITY

Add project logo (assets/rag.svg) featuring hexagonal design with knowledge
graph visualization representing the retrieval-augmented generation concept.
Replace local path-based SDK dependencies with versioned Hex packages for
gemini_ex, codex_sdk, and claude_agent_sdk. This removes the requirement for a
specific local directory structure during builds.
@nshkrdotcom nshkrdotcom force-pushed the feature/multi-llm-routing-agent-framework branch from 51b1057 to 8ea974a Compare December 21, 2025 21:57
Introduce a complete set of documentation guides to cover the extensive
functionality of the library. These guides provide detailed explanations,
code examples, and architecture overviews for features ranging from basic
setup to advanced concepts like GraphRAG and Agents.

New guides added:
- Getting Started: Installation, configuration, and basic usage patterns.
- LLM Providers: Setup for Gemini, Claude, Codex, and other backends.
- Smart Router: Strategies for multi-provider routing and failover.
- Vector Store: Pgvector integration, schema setup, and document management.
- Embeddings: Managed embedding service with automatic batching.
- Chunking: Strategies for text splitting including recursive and semantic.
- Retrievers: Semantic, full-text, hybrid, and graph retrieval methods.
- Rerankers: Implementation of LLM-based result reranking.
- Pipelines: Construction of composable, parallel RAG workflows.
- GraphRAG: Knowledge graph extraction, community detection, and retrieval.
- Agent Framework: Building tool-using agents with memory and registry.

Configuration updates:
- Update mix.exs to include the guides directory in the package files.
- Configure ex_doc extras to render the new markdown files in the docs.
- Organize module documentation into logical groups (Core, Providers, etc.).
- Update README.md to provide a clear index table of available guides.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant