Multi-LLM routing, vector store, and agent framework #35
Closed
nshkrdotcom wants to merge 4 commits into bitcrowd:main from nshkrdotcom:feature/multi-llm-routing-agent-framework
7d4a054 to 51b1057
Update: Additional Features Added

The latest commit adds significant new capabilities:
- Modular RAG Architecture
- GraphRAG
- Pipeline System
- Examples
- Test Coverage
This release represents a major architectural evolution of the RAG library, transforming it from a single-provider system into a comprehensive multi-LLM orchestration platform with agentic capabilities.

MULTI-LLM PROVIDER SUPPORT

Add three new provider implementations under Rag.Ai namespace:
- Gemini: Full support for text generation and embeddings via gemini_ex
- Claude: Text generation via claude_agent_sdk with agentic workflows
- Codex: OpenAI-compatible API via codex_sdk with structured output

Each provider implements the Rag.Ai.Provider behaviour and exposes capability metadata through Rag.Ai.Capabilities for runtime introspection.

SMART ROUTER

Introduce Rag.Router module with pluggable routing strategies:
- Fallback: Sequential provider attempts with failure tracking and decay
- RoundRobin: Load distribution with optional weights and recovery
- Specialist: Task-based routing using provider strengths

The router automatically selects strategies based on available providers and handles retries transparently. Provider health is tracked per-strategy with configurable failure thresholds and cooldown periods.

VECTOR STORE WITH PGVECTOR

Add Rag.VectorStore module providing:
- Semantic search using L2 distance with pgvector
- Full-text search using PostgreSQL tsvector
- Hybrid search combining both approaches via Reciprocal Rank Fusion
- Text chunking with configurable overlap for context preservation
- Rag.VectorStore.Chunk Ecto schema for document storage

The vector store integrates with any Ecto repo provided by the consuming application, following library-first design principles.
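
To make the hybrid search item concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Elixir. It is not the Rag.VectorStore implementation: the module name RrfSketch, the function fuse/2, and the constant k = 60 (a value commonly used for RRF) are assumptions chosen for illustration. Each document scores the sum of 1 / (k + rank) over every ranking in which it appears, so documents ranked well by both semantic and full-text search rise to the top.

```elixir
defmodule RrfSketch do
  @moduledoc "Illustrative Reciprocal Rank Fusion over ranked lists of ids."

  # Assumed smoothing constant; the library may use a different value.
  @default_k 60

  @doc """
  Takes a list of rankings (each a list of ids, best first) and returns
  `{id, score}` tuples sorted by fused score, highest first.
  """
  def fuse(rankings, k \\ @default_k) do
    rankings
    |> Enum.flat_map(fn ranking ->
      ranking
      # Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      |> Enum.with_index(1)
      |> Enum.map(fn {id, rank} -> {id, 1.0 / (k + rank)} end)
    end)
    # Sum the contributions each id received across all rankings.
    |> Enum.reduce(%{}, fn {id, score}, acc ->
      Map.update(acc, id, score, &(&1 + score))
    end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end
end

semantic = ["a", "b", "c"]
fulltext = ["b", "d", "a"]

# ids come back in the order b, a, d, c
RrfSketch.fuse([semantic, fulltext])
|> Enum.map(fn {id, _score} -> id end)
|> IO.inspect()
```

Because only ranks are used, the two result lists need no score normalization before they are combined, which is the usual reason for choosing RRF when mixing distance-based and tsvector-based results.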
EMBEDDING SERVICE

Add Rag.Embedding.Service GenServer for managed embedding operations:
- Automatic batching with configurable batch sizes
- Statistics tracking for monitoring
- Direct chunk embedding with prepare_for_insert helper
- Provider abstraction defaulting to Gemini

AGENT FRAMEWORK

Introduce complete agent system under Rag.Agent namespace:
- Agent: Core orchestrator with tool calling and iteration limits
- Session: Conversation memory with context management
- Registry: Tool registration and execution dispatch
- Tool: Behaviour definition for custom tool implementation

Built-in tools included:
- SearchRepos: Semantic search over indexed repositories
- ReadFile: File content retrieval with line range extraction
- GetRepoContext: Repository structure and metadata access
- AnalyzeCode: Elixir AST parsing for code structure analysis

EXAMPLES AND DOCUMENTATION

Add comprehensive examples directory with runnable scripts:
- basic_chat.exs: Simple LLM interaction patterns
- routing_strategies.exs: Multi-provider routing demonstration
- agent.exs: Tool registry and agent framework usage
- vector_store.exs: In-memory semantic search example

Include complete demo application in examples/rag_demo showing:
- Phoenix integration patterns
- Database setup with pgvector migrations
- End-to-end RAG pipeline implementation

BREAKING CHANGES

Remove igniter-based Mix tasks:
- rag.install
- rag.gen_rag_module
- rag.gen_servings
- rag.gen_eval

These tasks assumed a standalone application context. The library now follows a dependency-first design where consuming applications provide their own Ecto repo and run migrations as needed.
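
Since migrations now live in the consuming application, a pgvector-backed chunk table might be created along these lines. This is a sketch under assumptions, not the schema the library expects: the module name, the table name rag_chunks, the column names, and the 768-dimension embedding are all hypothetical and should be matched to Rag.VectorStore.Chunk and the embedding model in use.

```elixir
defmodule MyApp.Repo.Migrations.CreateRagChunks do
  use Ecto.Migration

  def up do
    # pgvector must be enabled before :vector columns can be created.
    execute("CREATE EXTENSION IF NOT EXISTS vector")

    # Hypothetical table and columns; align with the library's Chunk schema.
    create table(:rag_chunks) do
      add :source, :text
      add :content, :text, null: false
      # Placeholder dimension; it must equal the embedding model's output size.
      add :embedding, :vector, size: 768
      timestamps()
    end

    # HNSW index for L2-distance queries (needs a pgvector version with HNSW support).
    create index(:rag_chunks, ["embedding vector_l2_ops"], using: :hnsw)
  end

  def down do
    drop table(:rag_chunks)
    execute("DROP EXTENSION IF EXISTS vector")
  end
end
```

The index uses the vector_l2_ops operator class because the vector store is described as searching by L2 distance; a cosine-based setup would use a different operator class.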
DEPENDENCY CHANGES

Added:
- gemini_ex for Gemini provider
- codex_sdk for OpenAI-compatible provider (optional)
- claude_agent_sdk for Claude provider (optional)
- pgvector, ecto_sql, postgrex for vector store

Temporarily disabled:
- torus: Elixir 1.18 compatibility issue with inflex
- igniter: Removed along with Mix tasks

TEST INFRASTRUCTURE

Add comprehensive test coverage for all new modules:
- Provider tests with integration tags for API testing
- Router strategy tests with state verification
- Agent and tool tests with Mimic for provider mocking
- Vector store tests for query building and RRF scoring

Configure test helper to exclude integration tests by default and skip provider-specific tests when credentials are unavailable.

VISUAL IDENTITY

Add project logo (assets/rag.svg) featuring hexagonal design with knowledge graph visualization representing the retrieval-augmented generation concept.
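
The test helper behaviour described under TEST INFRASTRUCTURE could look roughly like the following. The tag names (:integration, :gemini, :claude, :codex) and the environment variable names are assumptions made for this sketch; the actual helper in the PR may use different ones.

```elixir
# test/test_helper.exs (sketch)

# Hypothetical mapping from a provider tag to the credential it needs.
provider_env = [
  gemini: "GEMINI_API_KEY",
  claude: "ANTHROPIC_API_KEY",
  codex: "OPENAI_API_KEY"
]

# Collect the tags of providers whose credentials are not set.
missing =
  for {tag, var} <- provider_env,
      System.get_env(var) in [nil, ""],
      do: tag

# Integration tests are always excluded by default; provider-tagged tests
# are excluded only when their credential is missing.
ExUnit.start(exclude: [:integration | missing])
```

With this arrangement, running mix test --include integration opts back in to the tests tagged :integration when API access is available.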
Replace local path-based SDK dependencies with versioned Hex packages for gemini_ex, codex_sdk, and claude_agent_sdk. This removes the requirement for a specific local directory structure during builds.
51b1057 to 8ea974a
Introduce a complete set of documentation guides to cover the extensive functionality of the library. These guides provide detailed explanations, code examples, and architecture overviews for features ranging from basic setup to advanced concepts like GraphRAG and Agents.

New guides added:
- Getting Started: Installation, configuration, and basic usage patterns.
- LLM Providers: Setup for Gemini, Claude, Codex, and other backends.
- Smart Router: Strategies for multi-provider routing and failover.
- Vector Store: Pgvector integration, schema setup, and document management.
- Embeddings: Managed embedding service with automatic batching.
- Chunking: Strategies for text splitting including recursive and semantic.
- Retrievers: Semantic, full-text, hybrid, and graph retrieval methods.
- Rerankers: Implementation of LLM-based result reranking.
- Pipelines: Construction of composable, parallel RAG workflows.
- GraphRAG: Knowledge graph extraction, community detection, and retrieval.
- Agent Framework: Building tool-using agents with memory and registry.

Configuration updates:
- Update mix.exs to include the guides directory in the package files.
- Configure ex_doc extras to render the new markdown files in the docs.
- Organize module documentation into logical groups (Core, Providers, etc.).
- Update README.md to provide a clear index table of available guides.
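
The configuration updates above correspond to a handful of standard Hex and ex_doc options. The fragment below is a sketch of how they fit together, not the PR's actual mix.exs: the app name, version, and the assignment of modules to the Core and Providers groups are assumptions, and the guides are picked up by wildcard so that no file names are invented.

```elixir
defmodule Rag.MixProject do
  use Mix.Project

  def project do
    [
      app: :rag,
      # Placeholder version for the sketch.
      version: "0.0.0-sketch",
      deps: [{:ex_doc, ">= 0.0.0", only: :dev, runtime: false}],
      package: package(),
      docs: docs()
    ]
  end

  defp package do
    # Ship the guides directory with the Hex package.
    [files: ~w(lib guides mix.exs README.md)]
  end

  defp docs do
    guides = Path.wildcard("guides/*.md")

    [
      main: "readme",
      # Render the README and every guide as extra pages.
      extras: ["README.md" | guides],
      groups_for_extras: [Guides: guides],
      # Assumed grouping; only the group names Core and Providers come from the PR.
      groups_for_modules: [
        Core: [Rag.Router, Rag.VectorStore],
        Providers: [Rag.Ai.Gemini, Rag.Ai.Claude, Rag.Ai.Codex]
      ]
    ]
  end
end
```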
Summary
This PR adds major new capabilities to the RAG library:
Changes
New Modules
- Rag.Router - Multi-provider routing with strategies
- Rag.Ai.{Gemini,Claude,Codex} - LLM provider implementations
- Rag.Ai.Capabilities - Provider capability registry
- Rag.VectorStore - Semantic/hybrid search operations
- Rag.Embedding.Service - Embedding generation service
- Rag.Agent.{Agent,Registry,Session,Tool} - Agent framework
- Built-in tools: AnalyzeCode, ReadFile, SearchRepos, GetRepoContext
Examples
- examples/basic_chat.exs - Simple LLM interaction
- examples/routing_strategies.exs - Multi-provider routing demo
- examples/agent.exs - Agent framework with tools
- examples/vector_store.exs - Semantic search demo
- examples/rag_demo/ - Full Phoenix app demo with database
Other
Test plan
mix run examples/*.exs