-
Notifications
You must be signed in to change notification settings - Fork 0
Feat: persist AQC reasoning! #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe changes introduce MongoDB-based workflow state persistence and step tracking into the agent workflow system. This includes updating environment and dependency files, refactoring agent and workflow classes to support MongoDB logging, adding a new persistence module, and providing comprehensive documentation and integration tests for MongoDB persistence. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Agent (run_hivemind_agent_activity)
participant MongoPersistence
participant AgenticHivemindFlow
participant Redis
participant RAGPipelineTool
User->>Agent: Submit query (payload)
Agent->>MongoPersistence: Create workflow state
Agent->>Redis: (If chat_id) Get chat history
Agent->>MongoPersistence: Update step: chat history retrieval
Agent->>AgenticHivemindFlow: Initialize with workflow_id, mongo_persistence
Agent->>MongoPersistence: Update step: flow initialization
Agent->>AgenticHivemindFlow: Run flow (async)
AgenticHivemindFlow->>MongoPersistence: Update step: classification results (various)
AgenticHivemindFlow->>RAGPipelineTool: (If needed) Run RAG query with workflow_id
RAGPipelineTool->>QueryDataSources: Query with workflow_id
QueryDataSources->>MongoPersistence: (Optional) Update step
AgenticHivemindFlow-->>Agent: Return answer
Agent->>MongoPersistence: Update step: answer processed
Agent->>Redis: (If chat_id) Update chat memory
Agent->>MongoPersistence: Update step: memory updated
Agent->>MongoPersistence: Update response & status
Agent-->>User: Return answer
Suggested reviewers
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (7)
.env.example (1)
11-16: Keep dotenv keys ordered to satisfydotenv-linter.The linter warnings are harmless at runtime but will fail the pipeline if linting is enforced. Re-order the block:
-REDIS_PORT= -REDIS_PASSWORD= +REDIS_PASSWORD= +REDIS_PORT=-MONGODB_PORT= -MONGODB_USER= -MONGODB_PASS= +MONGODB_USER= +MONGODB_PASS= +MONGODB_PORT=tests/unit/test_mongo_persistence.py (1)
4-4: Remove unused importdatetime.Keeps the test file Ruff-clean.
-from datetime import datetimeREADME.md (1)
51-60: Minor wording & repetition fixes.-**Data**: Result from local transformer model -**Model**: `local_transformer` +**Data**: Result from local transformer model +**Model**: `local_transformer`Also consider removing the duplicate “Passed to” phrasing in the Workflow-ID list to improve readability.
tasks/hivemind/query_data_sources.py (1)
111-115: Nested event-loop viaasyncio.runcan dead-lock in some executors.Although
nest_asynciopatches help, callingasyncio.run()inside another running loop (e.g., when CrewAI executes inside Jupyter) is still discouraged. Prefer:response = asyncio.get_event_loop().run_until_complete( query_data_sources.query(query) )or make
_runasync and let CrewAI handle it if supported.tasks/mongo_persistence.py (2)
13-21: Docstring is out-of-sync with the signature
__init__(self, database_name: str = "hivemind", collection_name: str = "internal_messages")
but the docstring only documentscollection_name. Please adddatabase_name(and default) to keep the documentation accurate.
26-36: Method is getting large – consider a typed payload object
create_workflow_statetakes 9 positional/keyword arguments (Pylint R0913). A small@dataclass WorkflowInitor passing a single dict would cut parameter count, improve readability and make future schema evolution safer.tasks/agent.py (1)
180-183: Drop the redundantelseafterreturnAfter
return Nonetheelseis unnecessary. Simplyreturn final_answerdirectly:- if final_answer == "NONE": - return None - else: - return final_answer + if final_answer == "NONE": + return None + return final_answerThis matches Pylint R1705 and reduces nesting.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.env.example(1 hunks)README.md(1 hunks)requirements.txt(1 hunks)tasks/agent.py(2 hunks)tasks/hivemind/agent.py(5 hunks)tasks/hivemind/query_data_sources.py(5 hunks)tasks/mongo_persistence.py(1 hunks)tests/unit/test_mongo_persistence.py(1 hunks)
🧰 Additional context used
🪛 dotenv-linter (3.3.0)
.env.example
[warning] 11-11: [UnorderedKey] The REDIS_PASSWORD key should go before the REDIS_PORT key
[warning] 16-16: [UnorderedKey] The MONGODB_PASS key should go before the MONGODB_PORT key
🪛 Ruff (0.11.9)
tests/unit/test_mongo_persistence.py
4-4: datetime.datetime imported but unused
Remove unused import: datetime.datetime
(F401)
🪛 Pylint (3.3.7)
tasks/mongo_persistence.py
[refactor] 26-26: Too many arguments (9/5)
(R0913)
[refactor] 26-26: Too many positional arguments (9/5)
(R0917)
tasks/agent.py
[refactor] 180-183: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
🪛 LanguageTool
README.md
[duplication] ~51-~51: Possible typo: you repeated a word.
Context: ...Data: Result from local transformer model - Model: local_transformer ### Question Cl...
(ENGLISH_WORD_REPEAT_RULE)
[uncategorized] ~148-~148: Loose punctuation mark.
Context: ...``` ## Dependencies - pymongo==4.8.0: MongoDB driver - `redis==5.2.0`: Redis ...
(UNLIKELY_OPENING_PUNCTUATION)
[grammar] ~159-~159: This phrase is duplicated. You should probably use “Passed to” only once.
Context: ...ted in run_hivemind_agent_activity 2. Passed to AgenticHivemindFlow 3. Passed to RAGPipelineTool 4. Passed to `QueryDa...
(PHRASE_REPETITION)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: ci / test / Test
🔇 Additional comments (2)
requirements.txt (1)
5-5: Confirm compatibility oftc-temporal-backend==1.1.3.A minor-version bump occasionally introduces breaking API or serialization changes in Temporal clients. Please verify that:
- Generated stubs are still compatible.
- Any protobuf / payload schema changes don’t affect persisted workflows.
If this PR has already been run end-to-end on a long-running namespace you can disregard, otherwise consider pinning the exact SHA or running integration tests before merge.
tests/unit/test_mongo_persistence.py (1)
34-35: Patch target may miss the real import path.
patch("pymongo.MongoClient", …)only works iftasks.mongo_persistenceimports viaimport pymongoand thenpymongo.MongoClient.
If that module instead doesfrom pymongo import MongoClient, the patch point must be"tasks.mongo_persistence.MongoClient".Please double-check; otherwise the tests silently create a real client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
tests/unit/test_mongo_persistence.py (1)
4-4: Remove unuseddatetimeimport
datetimeis never referenced in this module – keep the import list lean to avoid misleading readers and satisfy Ruff’s F401 hint.-from datetime import datetimetasks/mongo_persistence.py (1)
13-20: Docstring omitsdatabase_nameparameter
__init__exposes bothdatabase_nameandcollection_name, but only the latter is documented. Add the missing entry for clarity & IDE assistance.tasks/agent.py (1)
180-183: Drop unnecessaryelseafter early returnThe
returninside theifalready exits the function – theelsewrapper adds indentation noise.- if final_answer == "NONE": - return None - else: - return final_answer + if final_answer == "NONE": + return None + return final_answer
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
tasks/agent.py(2 hunks)tasks/mongo_persistence.py(1 hunks)tests/unit/test_mongo_persistence.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/unit/test_mongo_persistence.py (1)
tasks/mongo_persistence.py (5)
MongoPersistence(10-200)create_workflow_state(26-89)update_workflow_step(91-138)update_response(140-177)get_workflow_state(179-200)
tasks/mongo_persistence.py (1)
tasks/hivemind/query_data_sources.py (1)
query(23-52)
🪛 Ruff (0.11.9)
tests/unit/test_mongo_persistence.py
4-4: datetime.datetime imported but unused
Remove unused import: datetime.datetime
(F401)
🪛 Pylint (3.3.7)
tasks/agent.py
[refactor] 180-183: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
tasks/mongo_persistence.py
[refactor] 26-26: Too many arguments (9/5)
(R0913)
[refactor] 26-26: Too many positional arguments (9/5)
(R0917)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: ci / test / Test
- GitHub Check: ci / lint / Lint
| if memory and final_answer != "NONE": | ||
| chat = f"User: {payload.query}\nAgent: {final_answer}" | ||
| memory.append_text(chat) | ||
|
|
||
| # Update step: Memory update | ||
| mongo_persistence.update_workflow_step( | ||
| workflow_id=workflow_id, | ||
| step_name="memory_update", | ||
| step_data={ | ||
| "memoryKey": f"conversation:{payload.chat_id}", | ||
| "chatEntryLength": len(chat), | ||
| } | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Guard against None answers before persisting to Redis
When enable_answer_skipping=True a valid pathway leaves final_answer as None.
The current condition (memory and final_answer != "NONE") treats None as valid and stores the literal string "None" in Redis.
-if memory and final_answer != "NONE":
+if memory and final_answer is not None and final_answer != "NONE":Prevents polluting conversation history with meaningless entries.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if memory and final_answer != "NONE": | |
| chat = f"User: {payload.query}\nAgent: {final_answer}" | |
| memory.append_text(chat) | |
| # Update step: Memory update | |
| mongo_persistence.update_workflow_step( | |
| workflow_id=workflow_id, | |
| step_name="memory_update", | |
| step_data={ | |
| "memoryKey": f"conversation:{payload.chat_id}", | |
| "chatEntryLength": len(chat), | |
| } | |
| ) | |
| if memory and final_answer is not None and final_answer != "NONE": | |
| chat = f"User: {payload.query}\nAgent: {final_answer}" | |
| memory.append_text(chat) | |
| # Update step: Memory update | |
| mongo_persistence.update_workflow_step( | |
| workflow_id=workflow_id, | |
| step_name="memory_update", | |
| step_data={ | |
| "memoryKey": f"conversation:{payload.chat_id}", | |
| "chatEntryLength": len(chat), | |
| } | |
| ) |
🤖 Prompt for AI Agents
In tasks/agent.py around lines 152 to 164, the condition checking final_answer
before appending to memory does not guard against final_answer being None,
causing the string "None" to be stored. Update the condition to explicitly check
that final_answer is not None and not equal to the string "NONE" before
appending to memory and updating the workflow step, preventing meaningless
entries in Redis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/unit/test_mongo_persistence.py (1)
14-22: Mock chain still broken –__getitem__paths are not patched
MongoPersistence.__init__resolves the DB/collection viaclient[db_name][collection_name].
Onlyget_database()/get_collection()are mocked here, so the attributes referenced inside the class resolve to freshMagicMockinstances and all subsequent assertions oninsert_one/update_onewill fail.- self.db_mock.get_collection.return_value = self.collection_mock - - self.client_mock = MagicMock() - self.client_mock.get_database.return_value = self.db_mock + # Patch dictionary-style access used in the implementation + self.db_mock.__getitem__.return_value = self.collection_mock + + self.client_mock = MagicMock() + self.client_mock.__getitem__.return_value = self.db_mock(You can keep the
get_*patches if you like, but the__getitem__hooks are essential.)
🧹 Nitpick comments (1)
tests/unit/test_mongo_persistence.py (1)
11-13: Avoid loading the real.envduring unit testsCalling
load_dotenv()pulls in the developer’s environment and may introduce flaky behaviour. Prefer replacing it with a controlledpatch.dict(os.environ, {...})setup or mockload_dotenvitself.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/unit/test_mongo_persistence.py(1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
tests/unit/test_mongo_persistence.py
[error] 30-30: Instance of 'TestMongoPersistence' has no 'env_patcher' member
(E1101)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: ci / test / Test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
tasks/mongo_persistence.py (3)
13-20: Docstring omitsdatabase_nameparameter and encourages hidden coupling
__init__accepts adatabase_nameargument but the docstring only documentscollection_name. Update the docstring so callers know the parameter exists and can override it for tests / staging.- Parameters - ---------- - collection_name : str - The MongoDB collection name to use for storing workflow states + Parameters + ---------- + database_name : str + Database to connect to (defaults to "hivemind"). + collection_name : str + Collection inside the database that stores workflow states.
26-36: Constructor-like signature is approaching the “too many params” smell
create_workflow_statenow takes 9 positional/keyword parameters (flagged by pylint R0913 / R0917). Consider:• bundling optional fields (
source,destination,filters,metadata,chat_id,enable_answer_skipping) into a small dataclass (Route,QuestionOptions, …)
• or accepting a single**kwargs/ config object.This will keep the interface stable when more attributes are inevitably added.
64-70: Avoid writing"destination": NonedocumentsWhen
destinationis omitted we still persist"route": { "source": "temporal", "destination": null }Storing
nullclutters the document and forces downstream code to handle both missing andnull. Consider excluding the key when it’s not provided:- "route": { - "source": source, - "destination": destination, - }, + "route": {"source": source, **({"destination": destination} if destination else {})},tests/integration/test_mongo_persistence.py (1)
31-38: Tests rely on private.collectionattribute
self.persistence.collectionreaches inside the class, violating encapsulation. A refactor ofMongoPersistencecould break every test. Prefer public helpers (e.g.,clear_collection()or a parameter to ctor that injects the collection) or, at minimum, access viapersistence._collectionto signal intentional private use.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
tasks/mongo_persistence.py(1 hunks)tests/integration/test_mongo_persistence.py(1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
tasks/mongo_persistence.py
[refactor] 26-26: Too many arguments (9/5)
(R0913)
[refactor] 26-26: Too many positional arguments (9/5)
(R0917)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: ci / lint / Lint
- GitHub Check: ci / test / Test
| @classmethod | ||
| def setUpClass(cls): | ||
| """Set up test environment once for all tests""" | ||
| load_dotenv() | ||
|
|
||
| # Use a test-specific collection name to avoid interfering with production data | ||
| cls.test_collection_name = f"test_internal_messages_{uuid.uuid4().hex[:8]}" | ||
| cls.persistence = MongoPersistence(collection_name=cls.test_collection_name) | ||
|
|
||
| # Verify MongoDB connection | ||
| try: | ||
| # Test the connection by trying to access the collection | ||
| cls.persistence.collection.find_one() | ||
| print(f"✅ MongoDB connection successful. Using test collection: {cls.test_collection_name}") | ||
| except Exception as e: | ||
| print(f"❌ MongoDB connection failed: {e}") | ||
| print("Make sure MongoDB is running and environment variables are set correctly") | ||
| raise | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Flaky CI risk – gracefully skip when MongoDB is not reachable
The integration suite assumes a live Mongo instance; most CI runners don’t provide one. Wrap the connection probe in unittest.SkipTest instead of raise, so pipelines without MongoDB succeed while still running when the service is available.
- except Exception as e:
- print(f"❌ MongoDB connection failed: {e}")
- print("Make sure MongoDB is running and environment variables are set correctly")
- raise
+ except Exception as e:
+ raise unittest.SkipTest(
+ f"MongoDB not available for integration tests: {e}"
+ ) from e📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @classmethod | |
| def setUpClass(cls): | |
| """Set up test environment once for all tests""" | |
| load_dotenv() | |
| # Use a test-specific collection name to avoid interfering with production data | |
| cls.test_collection_name = f"test_internal_messages_{uuid.uuid4().hex[:8]}" | |
| cls.persistence = MongoPersistence(collection_name=cls.test_collection_name) | |
| # Verify MongoDB connection | |
| try: | |
| # Test the connection by trying to access the collection | |
| cls.persistence.collection.find_one() | |
| print(f"✅ MongoDB connection successful. Using test collection: {cls.test_collection_name}") | |
| except Exception as e: | |
| print(f"❌ MongoDB connection failed: {e}") | |
| print("Make sure MongoDB is running and environment variables are set correctly") | |
| raise | |
| @classmethod | |
| def setUpClass(cls): | |
| """Set up test environment once for all tests""" | |
| load_dotenv() | |
| # Use a test-specific collection name to avoid interfering with production data | |
| cls.test_collection_name = f"test_internal_messages_{uuid.uuid4().hex[:8]}" | |
| cls.persistence = MongoPersistence(collection_name=cls.test_collection_name) | |
| # Verify MongoDB connection | |
| try: | |
| # Test the connection by trying to access the collection | |
| cls.persistence.collection.find_one() | |
| print(f"✅ MongoDB connection successful. Using test collection: {cls.test_collection_name}") | |
| except Exception as e: | |
| raise unittest.SkipTest( | |
| f"MongoDB not available for integration tests: {e}" | |
| ) from e |
🤖 Prompt for AI Agents
In tests/integration/test_mongo_persistence.py around lines 10 to 28, the
current setup raises an exception if MongoDB is not reachable, causing CI
failures. Modify the exception handling in setUpClass to catch connection errors
and raise unittest.SkipTest instead of re-raising the exception. This will
gracefully skip the tests when MongoDB is unavailable, preventing flaky CI
failures while allowing tests to run when the service is accessible.
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests
Chores