feat: Optional Case specific Goal for GoalSuccessRateEvaluator #75
dbermuehler wants to merge 1 commit into strands-agents:main
…ccessRateEvaluator
strands-agent left a comment
Review: ✅ LGTM with Minor Suggestions
Great feature addition! This allows users to provide case-specific goals for the GoalSuccessRateEvaluator, which is very useful for more targeted evaluations.
What's Good:
- Clean Implementation: The goal is read from `evaluation_case.metadata.get("goal")` - elegant and backward-compatible
- New Prompt Template (v1): Good versioning approach with a new v1 template that explicitly documents the optional goal
- Sensible Default: Changed default version to "v1" - users get the new feature by default
- Refactored `_format_prompt`: Now takes the full evaluation case instead of just session_input, allowing access to metadata
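The lookup praised above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SketchCase` and `format_prompt_sketch` are hypothetical stand-ins, and the prompt wording is assumed. It only shows why reading the goal with `metadata.get("goal")` stays backward-compatible for cases that carry no metadata.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchCase:
    """Hypothetical stand-in for an evaluation case (not the strands_evals type)."""

    input: str
    output: str
    metadata: Optional[Dict[str, Any]] = None


def format_prompt_sketch(case: SketchCase) -> str:
    """Build a prompt, adding a goal line only when the case provides one."""
    # Treat missing metadata as an empty dict so older cases keep working.
    goal = (case.metadata or {}).get("goal")
    parts = [f"User input: {case.input}", f"Agent output: {case.output}"]
    if goal:
        # Case-specific goal goes first so the judge sees it before the exchange.
        parts.insert(0, f"Goal: {goal}")
    return "\n".join(parts)
```

A case with `metadata={"goal": "Get accurate weather information"}` yields a prompt that begins with the goal line, while a case without metadata produces the same prompt as before.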
Questions/Suggestions:
- Missing `__init__.py` update? Does `src/strands_evals/evaluators/prompt_templates/goal_success_rate/__init__.py` need to be updated to include `goal_success_rate_v1`? Without this, the `get_template("v1")` call won't find the new template.
- Tests: The checklist indicates no tests were added. Consider adding a test case like:
    def test_goal_success_rate_evaluator_with_custom_goal():
        evaluator = GoalSuccessRateEvaluator()
        case = EvaluationData(
            input="What's the weather?",
            output="It's sunny!",
            metadata={"goal": "Get accurate weather information"},
        )
        # Test that custom goal is used in evaluation
- Documentation: Would be helpful to add a brief example in the feature request issue (#74) showing how to use the new metadata field.
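To illustrate the `__init__.py` concern raised above, here is a rough sketch of a version-keyed template lookup. It assumes the templates are resolved from a registry keyed by version string; the real `get_template` may work differently, and `TEMPLATES_SKETCH`, `register_template_sketch`, and `get_template_sketch` are hypothetical names with placeholder template text.

```python
from typing import Dict

# Hypothetical registry; only "v0" is registered at import time.
TEMPLATES_SKETCH: Dict[str, str] = {
    "v0": "Did the agent achieve the user's goal?",
}


def register_template_sketch(version: str, template: str) -> None:
    """Make a template available under the given version key."""
    TEMPLATES_SKETCH[version] = template


def get_template_sketch(version: str) -> str:
    """Return the template for a version, failing loudly if it was never registered."""
    try:
        return TEMPLATES_SKETCH[version]
    except KeyError:
        raise ValueError(f"Unknown template version: {version!r}") from None
```

Under this assumption, switching the default version to "v1" without registering a "v1" entry would make every default lookup fail, which is why the package-level registration is worth double-checking.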
Non-Blocking:
- Consider running `hatch run prepare` to ensure all checks pass (checklist item)
Overall this is a clean, useful feature. Happy to approve once the above items are addressed! 🦆
🤖 This is an experimental AI agent response from the Strands team, powered by Strands Agents. We're exploring how AI agents can help with community support and development. Your feedback helps us improve! If you'd prefer human assistance, please let us know.
Description
Related Issues
#74
Documentation PR
Type of Change
New feature
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.