Test/verify suite template #51

Conversation
Walkthrough

Adds a new test module implementing a mock FAQ-adapter workflow: a data carrier, an iterable mock dataset, a mock agent, an adapter with prepare/execute, and a test that runs the flow, asserting successful outcomes and expected outputs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant Test as Test (pytest)
participant Dataset as MockDataset
participant Adapter as FAQAdapter
participant Agent as MockAgent
Note over Test,Adapter: Setup
Test->>Adapter: instantiate
Test->>Adapter: Adapter.prepare()
Adapter->>Agent: connect()
Agent-->>Adapter: connected
Note over Test,Dataset: Execution loop
Test->>Dataset: for item in dataset
Dataset-->>Test: FAQItem(question, answer)
Test->>Adapter: Adapter.execute(FAQItem)
Adapter->>Agent: answer(question)
Agent-->>Adapter: response
Adapter->>Test: AdapterOutcome(success, output)
    Test->>Test: assert outcome.success and output == expected
```
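Read as code, the diagram corresponds to roughly this driver loop — a sketch using the class names from the walkthrough; the exact signatures are assumptions, not confirmed framework API:

```python
# Sketch of the diagrammed flow; FAQAdapter/MockDataset are the PR's mocks (assumed shapes).
adapter = FAQAdapter()               # Test -> Adapter: instantiate
adapter.prepare()                    # Adapter -> Agent: connect()

for item in MockDataset():           # Dataset yields FAQItem(question, answer)
    outcome = adapter.execute(item)  # Adapter -> Agent: answer(question)
    assert outcome.success           # AdapterOutcome(success, output)
    assert outcome.output == item.answer
```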
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @tests/test_suite_template.py:
- Around line 37-44: The execute method signature is incompatible with
BaseAdapter: change the method signature of execute to match BaseAdapter (def
execute(self, case: DatasetCase, trace: TraceLog) -> AdapterOutcome), extract
the FAQItem from the provided case (e.g., faq = case.input or case.data) and use
faq.question/faq.answer for the agent call and comparison, properly handle or
forward the trace parameter (at minimum accept it; optionally record events on
trace if needed), return AdapterOutcome(success=success, output=response) with
the missing space after the comma fixed, and ensure references to
AdapterOutcome, DatasetCase, TraceLog, and FAQItem remain correct.
- Around line 46-61: The test incorrectly calls adapter.execute(item) with a
single argument while BaseAdapter.execute requires two parameters (case:
DatasetCase and trace: TraceLog), and it should use the agentunit framework test
runner; fix by replacing the manual loop with construction of a Scenario using
Scenario(name="demo", adapter=adapter, dataset=dataset) and call
run_suite([scenario]) to execute tests, ensuring any call sites or mocks conform
to BaseAdapter.execute(case, trace) signature (use a DatasetCase instance and a
TraceLog when invoking directly if needed).
🧹 Nitpick comments (1)
tests/test_suite_template.py (1)
4-4: Remove unused import. The Scenario class is imported but never used in this test file.

♻️ Remove unused import

```diff
-from agentunit import Scenario
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_suite_template.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_suite_template.py (1)
src/agentunit/adapters/base.py (2)
AdapterOutcome (18-25), BaseAdapter (28-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test (Python 3.12)
- GitHub Check: Test (Python 3.10)
🔇 Additional comments (3)
tests/test_suite_template.py (3)
7-10: LGTM! The FAQItem dataclass is well-defined and appropriate for modeling FAQ data.

12-16: LGTM! The MockDataset correctly implements the iterator protocol and provides appropriate test fixtures.

18-27: LGTM! The MockAgent provides a simple, deterministic mock for testing the adapter flow.
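The fixture code itself is not quoted in this review excerpt; below is a plausible reconstruction consistent with the three comments above. The field names come from the walkthrough, while the canned question/answer pair is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class FAQItem:
    question: str
    answer: str

class MockDataset:
    """Iterable fixture yielding FAQItem test cases."""
    def __iter__(self):
        yield FAQItem("What is AgentUnit?", "A testing framework for agents.")

class MockAgent:
    """Deterministic stand-in for a real agent."""
    def connect(self) -> None:
        self.connected = True

    def answer(self, question: str) -> str:
        # Always returns the canned answer so assertions stay deterministic
        return "A testing framework for agents."
```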
```python
    def execute(self, input_data: FAQItem) -> AdapterOutcome:
        # The core logic we want to test
        response = self.agent.answer(input_data.question)

        # Simple exact match check
        success = (response == input_data.answer)

        return AdapterOutcome(success=success,output=response)
```
Critical: execute signature doesn't match BaseAdapter interface.
The execute method signature is incompatible with BaseAdapter. According to the base class definition, execute must accept (self, case: DatasetCase, trace: TraceLog) -> AdapterOutcome, but this implementation only takes input_data: FAQItem. This adapter won't work with the agentunit framework and violates the Liskov Substitution Principle.
Additionally, Line 44 has a minor formatting issue (missing space after comma).
🔧 Proposed fix

```diff
-    def execute(self, input_data: FAQItem) -> AdapterOutcome:
+    def execute(self, case: FAQItem, trace: Any) -> AdapterOutcome:
         # The core logic we want to test
-        response = self.agent.answer(input_data.question)
+        response = self.agent.answer(case.question)

         # Simple exact match check
-        success = (response == input_data.answer)
+        success = (response == case.answer)

-        return AdapterOutcome(success=success,output=response)
+        return AdapterOutcome(success=success, output=response)
```

Note: You may need to properly handle the trace parameter if tracing is required for this adapter.
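If the adapter is ever invoked directly rather than through the runner, the call would have to supply both parameters. A minimal sketch, assuming DatasetCase wraps the payload and TraceLog has a no-argument constructor — the import path and both constructors are guesses, not verified against the codebase:

```python
# Hypothetical direct invocation; check src/agentunit for the real class locations.
from agentunit.core import DatasetCase, TraceLog  # assumed import path

case = DatasetCase(input=FAQItem("What is AgentUnit?", "A testing framework for agents."))
trace = TraceLog()

outcome = adapter.execute(case, trace)  # matches BaseAdapter's (case, trace) signature
assert outcome.success
```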
Run the following script to verify how other adapters in the codebase implement the BaseAdapter.execute method:

```bash
#!/bin/bash
# Description: Find all adapter implementations to verify the correct execute signature

# Search for classes that inherit from BaseAdapter
ast-grep --pattern $'class $_ (BaseAdapter):
  $$$
'

# Search for execute method implementations in adapter files
rg -n -A 5 'def execute' --type=py -g '*adapter*.py'
```
tests/test_suite_template.py (outdated)
```python
def test_suite_template_flow():
    """
    Verifies that the FAQAdapter logic from the documentation template
    works correctly with a mock agent.
    """
    # 1. Setup the scenario with our adapter and dataset
    dataset = MockDataset()
    adapter = FAQAdapter()
    adapter.prepare()

    for item in dataset:
        outcome = adapter.execute(item)

        # 3. Assertions
        assert outcome.success is True
        assert outcome.output == item.answer
```
🧩 Analysis chain

🏁 Scripts executed:

```bash
# Find test files using Scenario class
rg -n -B 2 -A 5 'Scenario' --type=py -g 'test_*.py' -g '*_test.py'

# Also check for Scenario class definition
fd -e py -x rg -l 'class Scenario'

# Check the BaseAdapter class and its execute method signature
rg -n -A 10 'class BaseAdapter' --type=py

# Check the current test file to understand its imports and structure
cat -n tests/test_suite_template.py

# Check the complete BaseAdapter execute method signature
rg -n -A 15 'def execute' src/agentunit/adapters/base.py

# Check how execute is called in the actual runner/framework code
rg -n 'execute(' src/agentunit/core/runner.py -B 2 -A 2

# Check how execute is called in the runner
rg -n 'adapter.execute' src/agentunit/core/ -B 2 -A 2

# Check the Scenario class definition and how it uses adapters
rg -n -A 20 'class Scenario' src/agentunit/core/scenario.py | head -60

# Look at test_runner.py to see how scenarios are properly used with adapters
head -100 tests/test_runner.py
```
Fix the execute() signature and use the agentunit framework's testing infrastructure.

The test calls adapter.execute(item) with a single argument, but BaseAdapter.execute() requires two parameters: case (DatasetCase) and trace (TraceLog). Additionally, the test should use the Scenario class (already imported) with the framework's run_suite() instead of manually iterating the dataset. This demonstrates proper framework integration, which is essential for a template that documents recommended patterns.

See tests/test_runner.py for the correct approach:

```python
scenario = Scenario(name="demo", adapter=adapter, dataset=dataset)
result = run_suite([scenario])
```
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Description
This PR adds a unit test (tests/test_suite_template.py) that implements the FAQAdapter pattern described in docs/templates/suite_template.py. It verifies that the documentation example correctly implements the BaseAdapter interface (using prepare and execute) and produces valid AdapterOutcome objects.

Changes Made

Added tests/test_suite_template.py, which implements FAQAdapter inheriting from BaseAdapter.

Testing

Ran the test locally (poetry run pytest tests/test_suite_template.py)