refactor: streamline RAG query execution in Hivemind agent by replaci… #49
…ng deprecated components with new QueryDataSources class and enhancing error handling
Walkthrough
Updates CI workflows to use a new reusable template (ci2.yml), removes LangChain dependencies from requirements, refactors the Hivemind agent to call QueryDataSources directly instead of a LangChain-based agent, and makes the RAG tool’s LangChain decorator optional via runtime import with a no-op fallback.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Router as Router
participant Agent as AgenticHivemindFlow
participant QDS as QueryDataSources
participant DS as Data Sources
Router->>Agent: do_rag_query(state)
Note over Agent: Build QueryDataSources with context
Agent->>QDS: query(prompt, constraints) (async)
QDS->>DS: fetch/search/aggregate
DS-->>QDS: results or none
QDS-->>Agent: answer or null
alt answer available
Agent->>Agent: set state.last_answer, inc retry_count
else error/none
Agent->>Agent: log error, set "NONE", inc retry_count
end
Agent-->>Router: "stop"
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
tasks/hivemind/agent.py (1)
163-169: Prefer `logging.exception` for exception logging.
When logging within an exception handler, use `logging.exception` instead of `logging.error` to automatically include the stack trace. This provides better debugging context without needing `exc_info=True`.
Apply this diff:
         except Exception as e:
-            logging.error(f"RAG query execution failed: {e}")
+            logging.exception("RAG query execution failed")
             answer = "NONE"
Note: The broad `Exception` catch is acceptable here since any failure should fall back to "NONE" per the existing error handling strategy.
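To illustrate the difference this nitpick describes, here is a minimal sketch that logs the same failure both ways and captures the output. The logger names, the `run_and_capture` helper, and the simulated `ValueError` are assumptions made for the demo and are not part of the PR.

```python
import io
import logging


def run_and_capture(use_exception: bool) -> str:
    """Log a simulated failure from inside an except block and return the log text."""
    stream = io.StringIO()
    # Hypothetical logger name; a dedicated logger keeps the demo isolated.
    logger = logging.getLogger(f"rag_demo_{use_exception}")
    logger.setLevel(logging.ERROR)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        raise ValueError("data source unavailable")
    except Exception as e:
        if use_exception:
            # Logs at ERROR level and appends the current traceback.
            logger.exception("RAG query execution failed")
        else:
            # Logs only the formatted message, without a traceback.
            logger.error(f"RAG query execution failed: {e}")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


with_trace = run_and_capture(use_exception=True)
without_trace = run_and_capture(use_exception=False)
```

In this sketch `with_trace` contains the full traceback text, while `without_trace` contains only the one-line message.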
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.github/workflows/production.yml (1 hunks)
.github/workflows/start.staging.yml (1 hunks)
requirements.txt (1 hunks)
tasks/hivemind/agent.py (2 hunks)
tasks/hivemind/query_data_sources.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-07-01T11:14:19.992Z
Learnt from: amindadgar
PR: TogetherCrew/agents-workflow#44
File: tasks/hivemind/agent.py:164-164
Timestamp: 2025-07-01T11:14:19.992Z
Learning: In RAG architectures using LangChain agents with custom RAG tools, the agent should pass queries unchanged to the RAG tool rather than processing them. The RAG tool itself (like `make_rag_tool` in tasks/hivemind/agent.py) handles all query processing, retrieval, and answer generation, while the agent serves as a simple intermediary. The system instruction "Pass the incoming queries without changing it" is appropriate for this separation of concerns.
Applied to files:
tasks/hivemind/agent.py
📚 Learning: 2025-02-26T15:41:54.631Z
Learnt from: amindadgar
PR: TogetherCrew/agents-workflow#2
File: tasks/hivemind/query_data_sources.py:27-31
Timestamp: 2025-02-26T15:41:54.631Z
Learning: In the agents-workflow project, "HivemindWorkflow" and "AgenticHivemindTemporalWorkflow" are two distinct workflows. The QueryDataSources class in tasks/hivemind/query_data_sources.py is designed to interact with "HivemindWorkflow", which is separate from "AgenticHivemindTemporalWorkflow" defined in tasks/agent.py.
Applied to files:
tasks/hivemind/agent.py
🧬 Code graph analysis (1)
tasks/hivemind/agent.py (2)
tasks/hivemind/query_data_sources.py (2)
QueryDataSources (17-82)
query (23-71)
tasks/agent.py (1)
run (219-227)
🪛 Ruff (0.13.1)
tasks/hivemind/query_data_sources.py
100-100: Do not catch blind exception: Exception
(BLE001)
tasks/hivemind/agent.py
167-167: Do not catch blind exception: Exception
(BLE001)
168-168: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test / Test
🔇 Additional comments (5)
requirements.txt (1)
1-9: LGTM! Dependency cleanup aligns with the refactoring.
The removal of `langchain` and `langchain-openai` dependencies is consistent with the PR's objective to decouple from LangChain-based tooling and adopt the new `QueryDataSources` class approach.
tasks/hivemind/query_data_sources.py (1)
98-106: Well-designed runtime import pattern for optional LangChain support.
The try-except block with a no-op fallback decorator elegantly handles the case where LangChain is not installed, allowing the RAG functionality to work without the dependency. This aligns perfectly with the PR's goal of making LangChain optional.
Regarding the static analysis hint about catching broad `Exception` at line 100: this is appropriate here since import failures can raise various exception types (ImportError, ModuleNotFoundError, etc.), and the fallback behavior is safe and intentional.
tasks/hivemind/agent.py (1)
157-161: Cleaner refactoring that removes LangChain complexity.
The direct instantiation of `QueryDataSources` and synchronous wrapper with `asyncio.run` (line 164) simplifies the RAG query flow by removing the LangChain agent/executor intermediary. This aligns well with the PR objectives.
.github/workflows/start.staging.yml (1)
9-9: LGTM! Consistent workflow template update.
This change mirrors the production workflow update to `ci2.yml@main`, ensuring consistency across staging and production environments.
.github/workflows/production.yml (1)
12-12: Confirm reusable workflow compatibility
ci2.yml in TogetherCrew/operations/.github/workflows exists and declares `on: workflow_call`, so the updated workflow reference is valid and compatible.
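The runtime import pattern praised in the query_data_sources.py comment above can be sketched roughly as follows. This is not the PR's code: `load_optional_decorator` and the placeholder module name are hypothetical, and unlike the reviewed code (which catches broad `Exception`) this variant catches only `ImportError` and `AttributeError`. Note that `ModuleNotFoundError` is a subclass of `ImportError`, so the narrower catch still covers a missing package.

```python
import importlib
from typing import Any, Callable


def load_optional_decorator(module_name: str, attr: str) -> Callable[..., Any]:
    """Return `module_name.attr` if importable, else a no-op decorator (hypothetical helper)."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError):
        def passthrough(*d_args: Any, **d_kwargs: Any) -> Any:
            # Bare usage (@tool): the decorated function arrives as the only argument.
            if len(d_args) == 1 and callable(d_args[0]) and not d_kwargs:
                return d_args[0]

            # Parameterized usage (@tool("name")): return a decorator that changes nothing.
            def wrap(func: Callable[..., Any]) -> Callable[..., Any]:
                return func

            return wrap

        return passthrough


# Placeholder module name that is assumed not to be installed, to force the fallback.
tool = load_optional_decorator("not_an_installed_package_for_demo", "tool")


@tool
def rag_lookup(query: str) -> str:
    return f"answer for: {query}"


@tool("named_lookup")
def named_lookup(query: str) -> str:
    return f"named answer for: {query}"
```

With the fallback in place, both decorated functions remain plain callables, so code that calls them directly keeps working when the optional dependency is absent.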
+        query_data_sources = QueryDataSources(
+            community_id=self.community_id,
+            enable_answer_skipping=self.enable_answer_skipping,
+            workflow_id=self.workflow_id,
+        )
-        agent = create_openai_functions_agent(llm, tools, prompt)

-        # Run the agent
-        agent_executor = AgentExecutor(
-            agent=agent,
-            tools=tools,
-            verbose=True,
-            return_intermediate_steps=False,
-            max_iterations=3,
-        )
+        try:
+            answer = asyncio.run(query_data_sources.query(self.state.user_query))
+            if answer is None:
+                answer = "NONE"
+        except Exception as e:
+            logging.error(f"RAG query execution failed: {e}")
+            answer = "NONE"

-        result = agent_executor.invoke({"input": self.state.user_query})
-        self.state.last_answer = result["output"]
+        self.state.last_answer = answer
         self.state.retry_count += 1

         return "stop"
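The control flow in the hunk above (run the async query synchronously, fall back to "NONE" on an empty result or any failure) can be exercised in isolation with a sketch like the one below. `StubQueryDataSources` and `run_rag_query` are hypothetical stand-ins; the real `QueryDataSources` constructor arguments and behavior are not reproduced here.

```python
import asyncio
import logging
from typing import Optional


class StubQueryDataSources:
    """Hypothetical stand-in for QueryDataSources; it only mimics an async query()."""

    def __init__(self, result: Optional[str] = None, fail: bool = False) -> None:
        self.result = result
        self.fail = fail

    async def query(self, prompt: str) -> Optional[str]:
        await asyncio.sleep(0)  # yield control once, as a real I/O call would
        if self.fail:
            raise RuntimeError("data source unavailable")
        return self.result


def run_rag_query(source: StubQueryDataSources, user_query: str) -> str:
    """Run the async query to completion and map None or any failure to "NONE"."""
    try:
        answer = asyncio.run(source.query(user_query))
    except Exception:
        logging.exception("RAG query execution failed")
        return "NONE"
    return "NONE" if answer is None else answer


ok = run_rag_query(StubQueryDataSources(result="42"), "what is the answer?")
empty = run_rag_query(StubQueryDataSources(result=None), "unknown topic")
failed = run_rag_query(StubQueryDataSources(fail=True), "anything")
```

One caveat with this approach: `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop, so in that situation this sketch would also return "NONE".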
Type inconsistency: last_answer expects CrewOutput | None but receives str.
Line 171 assigns a string value (answer) to self.state.last_answer, but the field is typed as CrewOutput | None (line 19). This mismatch will cause type checking failures and potential runtime issues downstream where CrewOutput is expected.
Apply this diff to fix the type annotation:
 class AgenticFlowState(BaseModel):
     user_query: str = ""
     retry_count: int = 0
-    last_answer: CrewOutput | None = None
+    last_answer: CrewOutput | str | None = None
     state: str = "continue"
     chat_history: str | None = None
Alternatively, if `last_answer` should remain `CrewOutput | None`, wrap the string response in a `CrewOutput` object before assignment.
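If the widened annotation is adopted, downstream code has to handle both shapes of `last_answer`. A minimal sketch of that handling follows. `CrewOutputStandIn`, `FlowStateSketch`, and `answer_text` are hypothetical names, dataclasses are used instead of the project's pydantic model to keep the example self-contained, and the assumption that the answer text lives on a `raw` attribute should be checked against the actual `CrewOutput` class.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CrewOutputStandIn:
    """Hypothetical stand-in for CrewOutput; assumes the text is exposed as `raw`."""

    raw: str


@dataclass
class FlowStateSketch:
    """Simplified stand-in for AgenticFlowState with the widened annotation."""

    user_query: str = ""
    retry_count: int = 0
    last_answer: Union[CrewOutputStandIn, str, None] = None


def answer_text(last_answer: Union[CrewOutputStandIn, str, None]) -> Optional[str]:
    """Normalize the three allowed shapes of last_answer to an optional string."""
    if last_answer is None:
        return None
    if isinstance(last_answer, str):
        return last_answer
    return last_answer.raw


state = FlowStateSketch(user_query="q")
state.last_answer = "NONE"  # plain string, as assigned by the RAG path
from_string = answer_text(state.last_answer)
from_object = answer_text(CrewOutputStandIn(raw="crew answer"))
```

Funnelling every read through one normalizing helper keeps the string case from leaking into code that previously assumed a `CrewOutput`.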
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
+        query_data_sources = QueryDataSources(
+            community_id=self.community_id,
+            enable_answer_skipping=self.enable_answer_skipping,
+            workflow_id=self.workflow_id,
+        )
-        agent = create_openai_functions_agent(llm, tools, prompt)
-        # Run the agent
-        agent_executor = AgentExecutor(
-            agent=agent,
-            tools=tools,
-            verbose=True,
-            return_intermediate_steps=False,
-            max_iterations=3,
-        )
+        try:
+            answer = asyncio.run(query_data_sources.query(self.state.user_query))
+            if answer is None:
+                answer = "NONE"
+        except Exception as e:
+            logging.error(f"RAG query execution failed: {e}")
+            answer = "NONE"
-        result = agent_executor.invoke({"input": self.state.user_query})
-        self.state.last_answer = result["output"]
+        self.state.last_answer = answer
         self.state.retry_count += 1
         return "stop"
class AgenticFlowState(BaseModel):
    user_query: str = ""
    retry_count: int = 0
    last_answer: CrewOutput | str | None = None
    state: str = "continue"
    chat_history: str | None = None
🧰 Tools
🪛 Ruff (0.13.1)
167-167: Do not catch blind exception: Exception
(BLE001)
168-168: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
In tasks/hivemind/agent.py around lines 157 to 174, last_answer is being set to
a plain string but its type is declared as CrewOutput | None (line 19); update
the code so the assigned value matches the declared type by either (A) wrapping
the string into a CrewOutput instance before assigning to self.state.last_answer
(create a CrewOutput with the appropriate fields populated from the RAG response
and use that), or (B) if the design intent is to allow plain strings, change the
state type annotation for last_answer to str | CrewOutput | None (and update any
downstream usages to handle the string case). Ensure you choose one approach and
make corresponding downstream type/usage updates so type checking passes.
Summary by CodeRabbit
Refactor
Chores
Bug Fixes