
[FEATURE] Add expected_* into trace-based evaluators. #117

@poshinchen

Description

Problem Statement

Similar to the output-based evaluators (OutputEvaluator, TrajectoryEvaluator), the trace-based evaluators should be able to take expected_* parameters into consideration during evaluation.

Trace-based evaluators:

  1. faithfulness_evaluator.py
  2. goal_success_rate_evaluator.py
  3. harmfulness_evaluator.py
  4. helpfulness_evaluator.py
  5. response_relevance_evaluator.py
  6. tool_parameter_accuracy_evaluator.py
  7. tool_selection_accuracy_evaluator.py

Notes

  • As of now, expected_output and expected_trajectory are treated as "expected similar" outputs rather than exact matches; the criteria are further tuned in the rubric. We need to decide what the corresponding behavior should be for trace-based evaluators.
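To make the request concrete, here is a minimal sketch of what the change could look like. All names below (EvaluationInput, FaithfulnessEvaluator, the evaluate signature) are hypothetical illustrations, not the library's actual API:

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical sketch only -- class and field names are illustrative,
# not the library's actual API.

@dataclass
class EvaluationInput:
    trace: list[dict[str, Any]]                   # recorded agent trace
    expected_output: str | None = None            # "expected similar" output, not exact
    expected_trajectory: list[str] | None = None  # "expected similar" tool sequence


class FaithfulnessEvaluator:
    def evaluate(self, inputs: EvaluationInput) -> float:
        # Today (per the note above): the score is derived from the trace alone.
        # Proposed: when expected_* fields are provided, fold them into the
        # rubric as soft similarity targets rather than exact-match criteria.
        if inputs.expected_output is not None:
            ...  # compare the trace-derived answer against expected_output
        return 0.0  # placeholder score
```

The open design question flagged in the note above is whether expected_* should act as a soft similarity target for trace-based evaluators (as it does for the output-based ones) or as something stricter.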

Proposed Solution

No response

Use Case

N/A

Alternative Solutions

No response

Additional Context

No response

Metadata

    Labels

    enhancement (New feature or request)

    Status

    Intake
