Fix ScenarioRun Duration Tracking in SwarmAdapter #63
Conversation
Signed-off-by: Jagriti-student <jagriti7989@gmail.com>
Walkthrough: The changes implement duration tracking in the SwarmAdapter by computing `duration_ms` from the session's `duration_seconds` metric instead of hardcoding it to `0.0`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ❌ Failed checks (3 warnings) | ✅ Passed checks (2)
📜 Recent review details: Configuration used: Path: .coderabbit.yaml | Review profile: CHILL | Plan: Pro
📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
Codecov Report: ❌ Patch coverage is
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)

pyproject.toml (1)

23-37: Confirm numpy is still available or remove direct imports. Direct `numpy` imports were found in the codebase: `tests/test_comparison.py:5` and `src/agentunit/production/__init___old.py:13` (appears to be a deprecated backup file). While `scipy` will bring in `numpy` as a transitive dependency, the test file has a direct import. Either add `numpy` back as an explicit dependency or remove the direct `numpy` imports and rely solely on scipy's numpy.

src/agentunit/adapters/swarm_adapter.py (1)

1-1: Pipeline failure: Ruff format check failed. The CI pipeline indicates that this file needs reformatting. Please run `poetry run ruff format src/agentunit/adapters/swarm_adapter.py` to fix the formatting issues before merging.
🤖 Fix all issues with AI agents
In @src/agentunit/reporting/results.py:
- Around lines 203-204: when `rows` is empty, the function returns `target` without creating the file. Change the early-return branch so that it creates the file (either an empty file or a headers-only file) before returning: for example, invoke `target.write_text("", encoding="utf-8")` (or write the header text) in the branch that currently checks `if not rows:` before the `return target`. Locate this in the results-handling code that uses the `rows` variable and `target`, so the file is always created.
🧹 Nitpick comments (3)
src/agentunit/reporting/results.py (2)

82-92: `_flatten_metrics` only handles single-level nesting. The helper only flattens one level of nested dictionaries. If `value` is a dict containing another dict, the inner dict will be serialized as a string in the CSV (via `DictWriter`'s default behavior), which may not be the intended outcome. Consider using recursion if deeper nesting is expected, or document this limitation.

♻️ Optional: Recursive flattening

```diff
 def _flatten_metrics(metrics: dict[str, Any], prefix: str = "metric") -> dict[str, Any]:
+    """Flatten nested metric dictionaries recursively."""
     flat: dict[str, Any] = {}
     for key, value in metrics.items():
+        full_key = f"{prefix}_{key}"
         if isinstance(value, dict):
-            for inner_key, inner_value in value.items():
-                flat[f"{prefix}_{key}_{inner_key}"] = inner_value
+            flat.update(_flatten_metrics(value, full_key))
         else:
-            flat[f"{prefix}_{key}"] = value
+            flat[full_key] = value
     return flat
```
206-211: Consider a more intuitive column ordering. Alphabetical sorting places columns like `case_id` before `scenario_name`. For better readability, consider placing key identifying columns first.

♻️ Optional: Fixed column order with dynamic metrics

```diff
-    fieldnames = sorted({key for row in rows for key in row})
+    # Fixed columns first, then sorted metric columns
+    fixed_cols = ["scenario_name", "case_id", "success", "duration_ms", "error"]
+    metric_cols = sorted({k for row in rows for k in row if k not in fixed_cols})
+    fieldnames = fixed_cols + metric_cols
```

src/agentunit/adapters/swarm_adapter.py (1)
353-369: Consider consolidating local imports. `ScenarioRun` and `ScenarioResult` are imported separately at lines 353 and 369. These could be combined into a single import statement for cleaner code.

♻️ Consolidate imports

```diff
     # Create trace log
     from agentunit.core.trace import TraceLog
+    from agentunit.reporting.results import ScenarioResult, ScenarioRun

     trace = TraceLog()
     trace.record(
         "session_complete",
         session_id=session_id,
         metrics=metrics,
     )

     # Create scenario run
-    from agentunit.reporting.results import ScenarioRun
-
     scenario_run = ScenarioRun(
         ...
     )

     # Create result
-    from agentunit.reporting.results import ScenarioResult
-
     result = ScenarioResult(
         ...
     )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (4)
- pyproject.toml
- src/agentunit/adapters/swarm_adapter.py
- src/agentunit/reporting/results.py
- tests/test_reporting.py
💤 Files with no reviewable changes (1)
- tests/test_reporting.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/agentunit/adapters/swarm_adapter.py (2)
- src/agentunit/core/trace.py (2)
  - TraceLog (26-78)
  - record (33-34)
- src/agentunit/reporting/results.py (1)
  - ScenarioRun (29-36)
🪛 GitHub Actions: CI
src/agentunit/adapters/swarm_adapter.py
[error] 1-1: Ruff format check failed. 1 file would be reformatted: src/agentunit/adapters/swarm_adapter.py. Command: poetry run ruff format --check src/ tests/.
🔇 Additional comments (7)
pyproject.toml (3)
27-27: LGTM on langchain constraint. The version constraint `>=0.0.353,<0.4.0` is reasonable and unchanged from before.
78-82: LGTM on pytest configuration. The `-ra` flag configuration is appropriate for showing test summary information. This appears to be a formatting-only change with no functional impact.
35-41: Extras configuration and dependency versions are correct. The `integration-tests` extra correctly references the optional `langgraph` dependency, and jsonschema version `4.25.1` is available on PyPI. The configuration aligns with the PR objectives.

src/agentunit/reporting/results.py (1)
100-120: LGTM! The addition of `started_at` and `finished_at` fields to `SuiteResult` with ISO formatting in `to_dict()` and `to_json()` is well-implemented. These timing metadata fields complement the duration tracking changes in the adapter.

src/agentunit/adapters/swarm_adapter.py (3)
17-25: LGTM! Good pattern for handling optional dependencies. The import guard with the `HAS_SWARM` flag combined with the check in `__init__` (lines 51-56) provides clear error messaging when the dependency is missing.
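The guard pattern this comment describes looks roughly like the following (the module name `swarm` and the error message wording are assumptions; only the shape of the pattern is taken from the review):

```python
# Optional-dependency guard: attempt the import at module load,
# fail with a clear message only when the adapter is actually used.
try:
    import swarm  # assumed optional third-party package

    HAS_SWARM = True
except ImportError:
    swarm = None
    HAS_SWARM = False


class SwarmAdapter:
    def __init__(self) -> None:
        if not HAS_SWARM:
            # Clear error message instead of a confusing NameError later.
            raise ImportError(
                "SwarmAdapter requires the optional 'swarm' package. "
                "Install it to use this adapter."
            )
```

This keeps the package importable even when the optional dependency is absent, while surfacing a useful error at construction time.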
331-337: LGTM - Correctly implements duration tracking. This change properly addresses the PR objective by calculating `duration_ms` from session timestamps. The conversion from seconds to milliseconds (`duration_seconds * 1000`) is correct, and the fallback to `0.0` provides defensive handling for edge cases.
355-366: LGTM! The `ScenarioRun` creation correctly uses the calculated `duration_ms` value (line 364), which is the core fix for this PR. The conditional logic for `scenario_name` is appropriately defensive.
aviralgarg05
left a comment
Fix the issue
Signed-off-by: Jagriti-student <jagriti7989@gmail.com>
I have solved the issue.
aviralgarg05
left a comment
LGTM!

Currently, `ScenarioRun` duration is hardcoded to `0.0` in `SwarmAdapter`, which does not reflect the actual runtime of the scenario.
This PR updates `SwarmAdapter.end_session` to:
- Use the already-calculated `duration_seconds` metric.
- Convert it to milliseconds.
- Populate the `duration_ms` field in `ScenarioRun`.
This ensures that scenario durations are accurately recorded, improving metrics tracking and reporting consistency.
Closes #54
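The core of the change can be sketched as a small helper (the function name is illustrative; in the PR the logic lives inline in `SwarmAdapter.end_session`):

```python
from typing import Any


def compute_duration_ms(metrics: dict[str, Any]) -> float:
    """Derive duration_ms from the already-computed duration_seconds metric."""
    duration_seconds = metrics.get("duration_seconds")
    if duration_seconds is None:
        # Defensive fallback for sessions without timing data.
        return 0.0
    return float(duration_seconds) * 1000
```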