feat: add Ralph Loop outer verification for task completion #48
Pull request overview
This PR adds an outer verification loop (Ralph Loop) that validates whether the agent's final answer truly completes the task. If the answer is incomplete, feedback is injected and the inner ReAct loop re-enters with up to 3 attempts (configurable). The PR also fixes an authentication issue by passing API keys from config to the LLM adapter, removes the unstable TimerTool, and renames react_agent.py to agent.py to reflect the new dual-loop architecture.
Changes:
- Implemented Ralph Loop outer verification with `Verifier` protocol and `LLMVerifier` implementation
- Added `Config.get_api_key()` helper and fixed `LiteLLMAdapter` authentication
- Removed `TimerTool` and all associated tests
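The dual-loop flow described in the overview can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `complete` field on `VerificationResult`, the `verify()` signature, the `InnerLoop` callable, and the two stub classes are assumptions made for the example, and plain strings stand in for the `LLMMessage` objects the real implementation uses. Skipping verification on the last attempt is inferred from the "Verification skipped (max iterations reached)" message in the reviewed diff.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Protocol


@dataclass
class VerificationResult:
    complete: bool  # assumed field name; the PR only shows `reason`
    reason: str


class Verifier(Protocol):
    # Assumed signature, for illustration only.
    async def verify(self, task: str, answer: str) -> VerificationResult: ...


# Hypothetical stand-in for the inner ReAct loop: takes the message
# history and returns the agent's answer.
InnerLoop = Callable[[List[str]], Awaitable[str]]


async def ralph_loop(
    task: str,
    inner_loop: InnerLoop,
    verifier: Verifier,
    max_iterations: int = 3,
) -> str:
    messages = [task]
    result = ""  # defined up front so the fallback return is always valid
    for iteration in range(1, max_iterations + 1):
        result = await inner_loop(messages)
        if iteration == max_iterations:
            break  # last attempt: skip verification, return what we have
        verification = await verifier.verify(task, result)
        if verification.complete:
            return result
        # Inject the verifier's feedback and re-enter the inner loop.
        messages.append(
            f"Verification feedback: {verification.reason} "
            "Please address the feedback and provide a complete answer."
        )
    return result


class KeywordVerifier:
    """Toy verifier: the answer is complete once it contains '2'."""

    async def verify(self, task: str, answer: str) -> VerificationResult:
        ok = "2" in answer
        return VerificationResult(ok, "result present" if ok else "result missing")


async def flaky_inner_loop(messages: List[str]) -> str:
    # First call gives an incomplete answer; after feedback it succeeds.
    return "I am not sure." if len(messages) == 1 else "1+1 = 2"


final = asyncio.run(ralph_loop("Calculate 1+1", flaky_inner_loop, KeywordVerifier()))
print(final)
```

Keeping the verifier behind a protocol means tests can swap in a deterministic stub like `KeywordVerifier` instead of making a second LLM call.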
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `agent/verification.py` | New verification interface with `Verifier` protocol, `VerificationResult` dataclass, and `LLMVerifier` implementation |
| `agent/base.py` | Added `_ralph_loop()` method that wraps `_react_loop` with verification logic |
| `agent/agent.py` | Updated `ReActAgent.run()` to use `_ralph_loop` instead of `_react_loop` directly |
| `config.py` | Added `RALPH_LOOP_MAX_ITERATIONS` config and `get_api_key()` helper method |
| `main.py` | Removed `TimerTool` import and added API key parameter to `LiteLLMAdapter` |
| `utils/terminal_ui.py` | Added `print_unfinished_answer()` function for displaying intermediate results |
| `tools/notify.py` | Simplified description by removing config requirement details |
| `tools/timer.py` | Deleted entire file (95 lines) |
| `test/test_ralph_loop.py` | New test file with 8 unit tests for Ralph Loop functionality |
| `test/test_timer_tool.py` | Deleted entire test file (100 lines) |
| `test/test_smart_edit_integration.py` | Updated import from `react_agent` to `agent` |
| `test/test_basic.py` | Updated import from `react_agent` to `agent` |
| `rfc/006-ralph-loop.md` | New RFC document describing Ralph Loop design |
| `rfc/003-asyncio-migration.md` | Updated reference from `react_agent.py` to `agent.py` |
| `docs/*.md` | Updated imports from `react_agent` to `agent` across documentation |
            Final answer as a string.
        """
        if verifier is None:
            verifier = LLMVerifier(self.llm, terminal_ui)
Copilot
AI
Jan 30, 2026
The variable terminal_ui is used without being imported or defined in this scope. It should likely be self._tui or imported from the utils module.
    terminal_ui.console.print(
        f"\n[bold dark_orange]⚠ Verification skipped "
        f"(max iterations {max_iterations} reached), returning last result[/bold dark_orange]"
    )
Copilot
AI
Jan 30, 2026
The variable terminal_ui is used without being imported or defined. This should be replaced with a proper reference to the terminal UI instance, likely self._tui if available, or imported from utils.
    terminal_ui.console.print(
        f"\n[bold green]✓ Verification passed "
        f"(attempt {iteration}/{max_iterations}): {verification.reason}[/bold green]"
    )
Copilot
AI
Jan 30, 2026
The variable terminal_ui is referenced without being defined in scope. Replace with appropriate terminal UI reference.
        f"Please address the feedback and provide a complete answer."
    )
    # Print the incomplete result so the user can see what the agent produced
    terminal_ui.print_unfinished_answer(result)
Copilot
AI
Jan 30, 2026
The variable terminal_ui is used without being defined in this method's scope. This needs to be properly imported or referenced.
    terminal_ui.console.print(
        f"\n[bold yellow]⟳ Verification feedback (attempt {iteration}/{max_iterations}): "
        f"{verification.reason}[/bold yellow]"
    )
Copilot
AI
Jan 30, 2026
The variable terminal_ui is referenced but not defined in the method scope. Ensure proper import or instance reference.
        messages.append(LLMMessage(role="user", content=feedback))

    # Should not reach here, but return last result as safety fallback
    return result  # type: ignore[possibly-undefined]
Copilot
AI
Jan 30, 2026
The fallback return statement uses result which may be undefined if max_iterations is 0 or negative. Consider adding validation for max_iterations > 0 at the start of the method, or initializing result to an empty string before the loop.
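The failure mode in this comment, and the suggested guard, can be reproduced in isolation. The sketch below is illustrative only: `run_attempts_unsafe` and `run_attempts_checked` are hypothetical stand-ins for the loop in `_ralph_loop()`, with a formatted string replacing the real inner-loop call.

```python
def run_attempts_unsafe(max_iterations: int) -> str:
    # Mirrors the reviewed pattern: `result` is only bound inside the loop.
    for iteration in range(1, max_iterations + 1):
        result = f"attempt {iteration}"
    # With max_iterations <= 0 the loop body never runs and this line
    # raises UnboundLocalError.
    return result


def run_attempts_checked(max_iterations: int) -> str:
    # Guard suggested by the review: reject non-positive limits up front.
    if max_iterations < 1:
        raise ValueError(f"max_iterations must be >= 1, got {max_iterations}")
    result = ""  # also initialised, so the fallback return is always bound
    for iteration in range(1, max_iterations + 1):
        result = f"attempt {iteration}"
    return result


try:
    run_attempts_unsafe(0)
except UnboundLocalError as exc:
    print(f"unsafe version failed: {exc}")

print(run_attempts_checked(3))
```

Validating at the start gives a clear error for a misconfigured `RALPH_LOOP_MAX_ITERATIONS`, and initialising `result` also makes the `# type: ignore[possibly-undefined]` comment unnecessary.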
Add an opt-in outer verification loop that checks whether the agent's final answer truly satisfies the original task. When enabled, a verifier (separate LLM call) judges completion after each inner ReAct loop run. If incomplete, feedback is injected and the inner loop re-enters. Disabled by default (RALPH_LOOP_ENABLED=false). Safety-capped at 3 outer iterations by default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The verifier now distinguishes between one-time tasks and recurring tasks (e.g. "every N minutes do X"). For recurring tasks, an agent that stopped after a single execution is judged INCOMPLETE with feedback to continue the timer cycle. This fixes interval/cron timer tasks being incorrectly marked as complete after one iteration. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The module now contains both the ReAct loop and the Ralph verification loop, so the broader name is more accurate. Updated all imports across source, tests, examples, and docs. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Each inner loop result is now printed via print_final_answer() before verification, so the user can see the agent's output even when the verifier judges it incomplete and re-enters the loop. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use a distinct yellow "Unfinished Answer" panel (rounded border) instead of the green "Final Answer" panel, so the user can clearly distinguish intermediate incomplete results from the final output. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
TimerTool (delay/interval/cron) was unstable. Remove the tool, its tests, and the recurring-task judgment rule from the verification prompt. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The outer verification loop is now the default behavior with no toggle. Only RALPH_LOOP_MAX_ITERATIONS remains as a configurable option. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
- Add `Config.get_api_key()` helper and pass API key from `.aloop/config` to `LiteLLMAdapter`, fixing `AuthenticationError` when key was only in config file.
- Rename `react_agent.py` → `agent.py` to reflect it now contains both ReAct and Ralph loops.
- Remove `TimerTool` (unstable delay/interval/cron scheduling).
- Add `print_unfinished_answer()` to display intermediate results with a distinct yellow "Unfinished Answer" panel.

New files
- `agent/verification.py`: `Verifier` protocol, `LLMVerifier`, `VerificationResult`
- `test/test_ralph_loop.py`: 8 async unit tests
- `rfc/006-ralph-loop.md`: design document

Configuration
- `RALPH_LOOP_MAX_ITERATIONS` (default `3`): max outer verification attempts

Test plan
- `python main.py --task "Calculate 1+1"`

🤖 Generated with Claude Code