
Conversation

@Jagriti-student
Contributor

@Jagriti-student Jagriti-student commented Jan 14, 2026

Description

This PR replaces hardcoded duration and average_response_time metrics in AG2Adapter with real-time measurements.

Changes

  • Track ScenarioRun duration using monotonic timestamps
  • Compute average response time from message timestamps
  • Remove hardcoded metric values

Fixes #55

Summary by CodeRabbit

  • Bug Fixes

    • Session duration now accurately computed from actual session times.
    • Metrics now correctly include calculated average response times.
    • Improved error handling for malformed CSV rows with detailed error messages.
  • New Features

    • CSV data loading now supports configurable delimiters for tools and context fields.


Signed-off-by: Jagriti-student <jagriti7989@gmail.com>
@continue

continue bot commented Jan 14, 2026

All Green - Keep your PRs mergeable


All Green is an AI agent that automatically:

✅ Addresses code review comments

✅ Fixes failing CI checks

✅ Resolves merge conflicts



@coderabbitai

coderabbitai bot commented Jan 14, 2026

Walkthrough

The pull request implements proper duration and response time tracking in the AG2 adapter by computing values from actual message timestamps instead of hardcoded placeholders, and enhances CSV dataset loading with configurable field delimiters and per-row error handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Metrics Computation Improvements: src/agentunit/adapters/autogen_ag2.py | Computes session duration from end_time - start_time instead of hardcoded 0.0; implements per-message response time calculation and derives average_response_time from computed values instead of placeholder. |
| CSV Dataset Parsing Enhancement: src/agentunit/datasets/base.py | Adds configurable tools_delimiter and context_delimiter parameters to load_local_csv(); introduces helper function to parse delimited CSV fields; adds per-row try/except error handling with contextual AgentUnitError. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • aviralgarg05
🚥 Pre-merge checks | ✅ 2 | ❌ 3
❌ Failed checks (2 warnings, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Out of Scope Changes check | ⚠️ Warning | The PR includes changes to base.py (CSV delimiter functionality) that appear unrelated to the AG2Adapter duration/response time tracking objectives stated in issue #55. | Remove changes to src/agentunit/datasets/base.py or create a separate PR for CSV delimiter improvements, as these are outside the scope of issue #55. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description is minimal but addresses the core changes. However, it lacks comprehensive detail expected by the template (missing Type of Change, detailed Testing section, Code Quality checklist, and other standard sections). | Expand the description to include Type of Change selection, Testing section with test results, Code Quality checklist items, and Documentation updates to match the repository template. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: implementing duration and response time tracking in the AG2Adapter, which is the primary focus of the PR. |
| Linked Issues check | ✅ Passed | The PR successfully addresses both requirements from issue #55: duration tracking via monotonic timestamps and average_response_time calculation from message timestamps, with hardcoded values removed. |




📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 48cfb0d and dda8a15.

📒 Files selected for processing (2)
  • src/agentunit/adapters/autogen_ag2.py
  • src/agentunit/datasets/base.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/agentunit/datasets/base.py (2)
src/agentunit/cli/__init__.py (1)
  • get (80-89)
src/agentunit/core/exceptions.py (1)
  • AgentUnitError (8-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test (Python 3.12)
  • GitHub Check: Test (Python 3.10)
🔇 Additional comments (6)
src/agentunit/adapters/autogen_ag2.py (3)

321-321: Duration calculation looks correct.

The computation properly converts the timedelta to milliseconds. Note that datetime.now() is used for timestamps rather than time.monotonic(). While this works for most cases, datetime.now() can be affected by system clock adjustments (e.g., NTP sync). If precise elapsed time measurement is critical, consider using monotonic time for duration tracking.
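
The monotonic alternative mentioned above could look roughly like the following. This is a minimal sketch, not the adapter's code: `run_with_duration` and its `task` argument are hypothetical names, and reporting milliseconds is assumed from the review note.

```python
import time


def run_with_duration(task):
    """Run task() and return (result, elapsed time in milliseconds).

    Hypothetical helper: the name and signature are illustrative only.
    """
    # time.monotonic() never goes backwards, so the difference between two
    # readings is not affected by NTP syncs or manual clock changes.
    start = time.monotonic()
    result = task()
    duration_ms = (time.monotonic() - start) * 1000.0
    return result, duration_ms
```

Wall-clock timestamps from datetime.now() remain useful for recording when a run happened; the monotonic reading is only meaningful as a difference.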


343-354: LGTM!

The response time calculation correctly computes the time difference between consecutive messages and handles the edge case of no interactions by returning 0.0.
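
For readers without the diff at hand, the calculation described here can be sketched as below. It assumes each message carries a datetime timestamp and that the result is reported in milliseconds; `average_response_time_ms` is a hypothetical name, not the function in autogen_ag2.py.

```python
from datetime import datetime, timedelta


def average_response_time_ms(timestamps):
    """Mean gap in milliseconds between consecutive message timestamps.

    Hypothetical helper. Returns 0.0 when there are fewer than two
    timestamps, since no gap can be measured.
    """
    gaps = [
        (later - earlier).total_seconds() * 1000.0
        for earlier, later in zip(timestamps, timestamps[1:])
    ]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Example: gaps of 1 s and 2 s between three messages.
t0 = datetime(2026, 1, 14, 12, 0, 0)
stamps = [t0, t0 + timedelta(seconds=1), t0 + timedelta(seconds=3)]
print(average_response_time_ms(stamps))  # 1500.0
```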


356-365: LGTM!

The metrics dictionary now properly includes the computed average_response_time, fulfilling the PR objective of replacing the hardcoded placeholder.

src/agentunit/datasets/base.py (3)

82-94: LGTM!

The helper function is well-designed with proper handling of edge cases: null/non-string inputs, empty delimiters, whitespace trimming, and empty results. The defensive check for empty delimiter prevents a potential ValueError from str.split("").
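
A helper with the edge-case handling listed above might be sketched like this. The name `split_field`, the ";" default, and returning None for unusable input are all assumptions made for illustration; the actual helper in base.py may differ.

```python
def split_field(value, delimiter=";"):
    """Split a delimited CSV cell into a list of trimmed, non-empty items.

    Hypothetical helper. Returns None for non-string input, an empty
    delimiter, or a cell that contains no items after trimming.
    """
    # Guard first: "abc".split("") raises ValueError("empty separator").
    if not isinstance(value, str) or not delimiter:
        return None
    items = [part.strip() for part in value.split(delimiter)]
    items = [part for part in items if part]
    return items or None


print(split_field("search; calculator ;browser"))  # ['search', 'calculator', 'browser']
```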


96-100: LGTM!

Good API design with configurable delimiters and sensible defaults. The change is backward compatible with existing callers.


110-132: LGTM!

The per-row error handling provides good context for debugging malformed CSV data. The exception chaining with from exc preserves the original error information while wrapping it in a domain-specific AgentUnitError.
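
The per-row pattern praised here can be illustrated with a small standalone sketch. `DatasetRowError` is a hypothetical stand-in for the project's AgentUnitError, and the `id` and `score` columns are invented for the example; the line number assumes a one-line header and no multi-line fields.

```python
import csv
import io


class DatasetRowError(Exception):
    """Hypothetical stand-in for the project's AgentUnitError."""


def load_rows(text):
    """Parse CSV text, wrapping any row-level failure with its line number."""
    rows = []
    # Data rows start at line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        try:
            rows.append({"id": row["id"], "score": float(row["score"])})
        except (KeyError, TypeError, ValueError) as exc:
            # "from exc" keeps the original error available as __cause__.
            raise DatasetRowError(
                f"Malformed CSV row at line {line_no}: {row!r}"
            ) from exc
    return rows
```

Chaining with from exc means the traceback shows both the domain-specific message and the underlying parsing error.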




@codecov-commenter

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 0% with 20 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/agentunit/datasets/base.py | 0.00% | 13 Missing ⚠️ |
| src/agentunit/adapters/autogen_ag2.py | 0.00% | 7 Missing ⚠️ |


Owner

@aviralgarg05 aviralgarg05 left a comment


LGTM!

@aviralgarg05 aviralgarg05 merged commit 3767aab into aviralgarg05:main Jan 14, 2026
12 checks passed


Development

Successfully merging this pull request may close these issues.

Implement Duration & Response Time Tracking in AG2Adapter

3 participants