feat: support 'same-as-agent' model option for legacy evaluators #1048
base: main
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Force-pushed from 171a24b to 48a0c5e
Add support for the 'same-as-agent' model configuration in legacy LLM-based evaluators. When an evaluator specifies 'same-as-agent' as its model, it now resolves to the actual model from agent.json settings instead of throwing an error.

Changes:
- Updated EvaluatorFactory to accept and pass agent_model parameter
- Added _get_agent_model() method to runtime to load model from agent.json
- Added logging for model resolution and evaluator creation
- Fixed error message in trajectory evaluator (was incorrectly saying "LLM evaluator")

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Force-pushed from 48a0c5e to 9e71de8
IMO it's better to do something like this:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class LLMAgentProtocol(Protocol):
    def get_agent_model(self) -> str:
        ...
```

And switch the implementation of `_get_agent_model` to:

```python
def _get_agent_model(self, runtime: UiPathRuntimeProtocol) -> str | None:
    if isinstance(runtime, LLMAgentProtocol):
        return runtime.get_agent_model()
    else:
        return None
```
That way, react agent can implement that method and it should work seamlessly.
Implements the Protocol-based approach for getting agent model:
- Adds LLMAgentFactoryProtocol with get_agent_model() method
- Updates _get_agent_model() to check if factory implements protocol
- Falls back to file-based approach if protocol not implemented

This allows runtime factories to provide agent model information directly, enabling cleaner 'same-as-agent' resolution for evaluators.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Good catch, implemented your Protocol-based pattern.
I don't think the factory should contain the model. It's a runtime concept. I.e., the factory can create different runtimes with different LLMs based on the entrypoint, so I'd rather keep it in runtime. Also, please create a unit test to test the change :)
```python
    cls,
    data: dict[str, Any],
    evaluators_dir: Path | None = None,
    agent_model: str | None = None,
```
nit: I would suggest passing in an agent_model_settings object instead, so that @mathurk and @AAgnihotry do not have to do double work :)
```python
        return result

    def _get_agent_model(self) -> str | None:
```
What happens when we specify custom model settings? cc @mathurk @AAgnihotry
mathurk left a comment
lgtm
Summary
- Adds support for the `same-as-agent` model configuration in legacy LLM-based evaluators
- When an evaluator specifies `same-as-agent` as its model, it now resolves to the actual model from `agent.json` settings

Changes
- Updated `EvaluatorFactory.create_evaluator()` to accept an `agent_model` parameter
- Updated `_create_legacy_evaluator_internal()` to pass `agent_model` to LLM-based evaluators
- Updated `_create_legacy_llm_as_judge_evaluator()` to resolve `same-as-agent` to the actual model
- Updated `_create_legacy_trajectory_evaluator()` to resolve `same-as-agent` to the actual model
- Added `_get_agent_model()` method to runtime to load the model from `agent.json`

Test plan
- Tested with a `calculator_same_as_agent` example containing evaluators with `"model": "same-as-agent"`

🤖 Generated with Claude Code