refactor: extract payload builders and tracing into reusable modules #1038
Summary
- Extracts an `EvalTracingManager` class that encapsulates OpenTelemetry tracing logic for evaluation runs
- Extracts a `_payload_builders` package with an abstract base class and concrete implementations for coded and legacy evaluations

Why This Refactoring Was Needed
The `StudioWebProgressReporter` class had grown to over 1200 lines, with significant code duplication between coded and legacy evaluation handling. This refactoring:
- Improves maintainability: separates concerns into focused modules, with tracing logic in `_eval_tracing.py` and payload building in `_payload_builders/`
- Enables reusability: the new abstractions can be used independently
- Reduces duplication: shared utilities such as GUID conversion, usage extraction, and completion metrics building now live in a single base class
- Facilitates testing: smaller, focused classes are easier to unit test in isolation
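To illustrate the "single base class" point, here is a minimal sketch of how shared utilities and format-specific subclasses might fit together. Only the class names `BasePayloadBuilder`, `CodedPayloadBuilder`, and `LegacyPayloadBuilder` come from this PR; the method names (`to_guid`, `extract_usage`, `format_id`, `build_run_payload`), the span attribute keys, the payload shape, and the UUIDv5 fallback are assumptions made for illustration and may not match the actual implementation.

```python
import uuid
from abc import ABC, abstractmethod
from typing import Any


class BasePayloadBuilder(ABC):
    """Shared helpers live here so subclasses only handle format differences."""

    @staticmethod
    def to_guid(value: str) -> str:
        """Return a normalized GUID if `value` already is one, else derive one."""
        try:
            return str(uuid.UUID(value))
        except ValueError:
            # Assumption: a deterministic UUIDv5, so the same ID always
            # maps to the same GUID. The real conversion may differ.
            return str(uuid.uuid5(uuid.NAMESPACE_DNS, value))

    @staticmethod
    def extract_usage(spans: list[dict[str, Any]]) -> dict[str, int]:
        """Sum token counts across spans (hypothetical attribute names)."""
        totals = {"tokens": 0, "completionTokens": 0, "promptTokens": 0}
        for span in spans:
            attrs = span.get("attributes", {})
            for key in totals:
                totals[key] += int(attrs.get(key, 0))
        return totals

    @abstractmethod
    def format_id(self, raw_id: str) -> str:
        """Subclasses decide how IDs are represented in the payload."""

    def build_run_payload(
        self, evaluator_id: str, spans: list[dict[str, Any]]
    ) -> dict[str, Any]:
        """Assemble a payload using the shared helpers (hypothetical shape)."""
        return {
            "evaluatorId": self.format_id(evaluator_id),
            "usage": self.extract_usage(spans),
        }


class CodedPayloadBuilder(BasePayloadBuilder):
    def format_id(self, raw_id: str) -> str:
        return raw_id  # coded evaluations keep string IDs as-is


class LegacyPayloadBuilder(BasePayloadBuilder):
    def format_id(self, raw_id: str) -> str:
        return self.to_guid(raw_id)  # legacy evaluations use GUIDs
```

With this shape, each concrete builder overrides only the ID formatting, while usage extraction and GUID conversion are written once and can be unit tested without constructing a reporter.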
New Modules
`_eval_tracing.py`
- `EvalTracingManager`: manages OpenTelemetry tracing for evaluation runs, including parent trace creation, eval run traces, and evaluator span management

`_payload_builders/`
- `BasePayloadBuilder`: abstract base class with shared utilities for GUID conversion, usage extraction from spans, completion metrics, and request spec building
- `CodedPayloadBuilder`: handles coded agent evaluation payloads with string IDs and the `/coded/` endpoint suffix
- `LegacyPayloadBuilder`: handles legacy (low-code) agent payloads with GUID conversion and the `assertionRuns` format

Test plan
- Lint (`uv run just lint`)
- Type check (`uv run mypy src/uipath/_cli/_evals/`)
- Build (`uv run just build`)

🤖 Generated with Claude Code