Define --full as the composition of --all, --max-steps 100, and --auto-respond. Also fix a few reward parsing bugs. #252
Note
CLI and config updates
- … `full` with `all` in config (`hud/cli/eval.py`) and add `--all` flag; `--full` now sets `--all`, `--auto-respond`, and `--max-steps 100`.
- `auto_respond` is now an explicit boolean flag (no implicit default via `--full`); `max_steps` defaults to `10`.
- … `all` for multi-task runs; display updated to show `all` and explicit `max_steps`.

Reward handling changes
- … `EvalContext.__aexit__` from evaluate tools; runners stop manually setting `ctx.reward` (`hud/datasets/runner.py`).
- … `self.reward` directly (`hud/eval/context.py`).

Tool result structure
- … `MCPToolResult` with `structuredContent` for both local and remote tool calls (`hud/environment/environment.py`).
- … `result.structuredContent` (`hud/agents/base.py`).

Written by Cursor Bugbot for commit 2b60eb4. This will update automatically on new commits.
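
For illustration, the `--full` composition described in the summary could be sketched as follows. This is a minimal stand-in built on `argparse`, not the code in `hud/cli/eval.py`: the parser name, `resolve_args`, the `all_tasks` attribute, and the two constants are hypothetical. It also assumes that an explicit `--max-steps` takes precedence over the value implied by `--full`, which the summary does not state.

```python
import argparse

# Values taken from the summary: max_steps defaults to 10, --full implies 100.
DEFAULT_MAX_STEPS = 10
FULL_MAX_STEPS = 100


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser exposing the four flags named in the summary."""
    parser = argparse.ArgumentParser(prog="eval-sketch")
    parser.add_argument("--all", dest="all_tasks", action="store_true",
                        help="run every task instead of a single one")
    parser.add_argument("--auto-respond", action="store_true",
                        help="explicit boolean flag, off unless requested")
    # None means "not given", so --full can tell it apart from an explicit value.
    parser.add_argument("--max-steps", type=int, default=None)
    parser.add_argument("--full", action="store_true",
                        help="shorthand for --all --auto-respond --max-steps 100")
    return parser


def resolve_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv and expand --full into the flags it is composed of."""
    args = build_parser().parse_args(argv)
    if args.full:
        args.all_tasks = True
        args.auto_respond = True
        if args.max_steps is None:  # assumption: an explicit --max-steps wins
            args.max_steps = FULL_MAX_STEPS
    if args.max_steps is None:
        args.max_steps = DEFAULT_MAX_STEPS
    return args
```

Under these assumptions, `resolve_args(["--full"])` enables both boolean flags and sets `max_steps` to 100, while `resolve_args([])` leaves both flags off and falls back to 10 steps.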