feat: add Ralph Loop outer verification for task completion #48

luohaha · 2026-01-30T15:32:40Z

Summary

- Add an outer verification loop (Ralph Loop) that checks whether the agent's final answer truly completes the task. If incomplete, feedback is injected and the inner ReAct loop re-enters. Safety-capped at configurable max iterations (default 3).
- Add `Config.get_api_key()` helper and pass the API key from `.aloop/config` to `LiteLLMAdapter`, fixing an `AuthenticationError` when the key was only in the config file.
- Rename `react_agent.py` → `agent.py` to reflect that it now contains both the ReAct and Ralph loops.
- Remove `TimerTool` (unstable delay/interval/cron scheduling).
- Add `print_unfinished_answer()` to display intermediate results with a distinct yellow "Unfinished Answer" panel.

New files

- `agent/verification.py`: `Verifier` protocol, `LLMVerifier`, `VerificationResult`
- `test/test_ralph_loop.py`: 8 async unit tests
- `rfc/006-ralph-loop.md`: design document
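As a rough picture of what a verifier interface of this kind might look like, here is a minimal sketch. Only the names `Verifier`, `VerificationResult`, and the `reason` attribute appear in this PR; the `complete` field, the `verify()` method name and signature, and the `KeywordVerifier` class are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class VerificationResult:
    complete: bool  # assumed field name
    reason: str     # `reason` is referenced in the PR's diff


class Verifier(Protocol):
    # Assumed method name and signature; the PR does not show them.
    async def verify(self, task: str, result: str) -> VerificationResult: ...


class KeywordVerifier:
    """Hypothetical verifier (not in the PR) that needs no LLM call."""

    def __init__(self, keyword: str) -> None:
        self.keyword = keyword

    async def verify(self, task: str, result: str) -> VerificationResult:
        if self.keyword in result:
            return VerificationResult(True, "keyword found")
        return VerificationResult(False, f"answer does not mention {self.keyword!r}")


# Structural typing: KeywordVerifier satisfies Verifier without inheriting from it.
verifier: Verifier = KeywordVerifier("2")
print(asyncio.run(verifier.verify("Calculate 1+1", "The answer is 2")))
```

Keeping the interface as a protocol means tests can pass in a deterministic verifier like this one, which fits the note in the test plan that the unit tests are mocked and need no API keys.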

Configuration

RALPH_LOOP_MAX_ITERATIONS (default 3) — max outer verification attempts
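One way such a setting could be parsed defensively is sketched below. The format of `.aloop/config` is not shown in this PR, so the sketch assumes the settings have already been loaded into a plain string mapping; `max_iterations_from` is a hypothetical helper, not part of `config.py`.

```python
from typing import Mapping

DEFAULT_MAX_ITERATIONS = 3  # default stated in this PR


def max_iterations_from(settings: Mapping[str, str]) -> int:
    """Return the configured cap, falling back to the default on bad input."""
    raw = settings.get("RALPH_LOOP_MAX_ITERATIONS", "").strip()
    try:
        value = int(raw)
    except ValueError:
        # Missing or non-numeric value: use the default.
        return DEFAULT_MAX_ITERATIONS
    # A cap below 1 would mean the inner loop never runs, so reject it.
    return value if value >= 1 else DEFAULT_MAX_ITERATIONS


print(max_iterations_from({"RALPH_LOOP_MAX_ITERATIONS": "5"}))
```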

Test plan

- All 8 Ralph Loop unit tests pass (mocked, no API keys needed)
- Full test suite passes (292 passed, 2 skipped)
- Strict typecheck passes
- Smoke test with configured provider: `python main.py --task "Calculate 1+1"`

🤖 Generated with Claude Code

Copilot

Pull request overview

This PR adds an outer verification loop (Ralph Loop) that validates whether the agent's final answer truly completes the task. If the answer is incomplete, feedback is injected and the inner ReAct loop re-enters, for up to 3 outer attempts by default (configurable). The PR also fixes an authentication issue by passing API keys from config to the LLM adapter, removes the unstable TimerTool, and renames react_agent.py to agent.py to reflect the new dual-loop architecture.

Changes:

- Implemented Ralph Loop outer verification with `Verifier` protocol and `LLMVerifier` implementation
- Added `Config.get_api_key()` helper and fixed `LiteLLMAdapter` authentication
- Removed `TimerTool` and all associated tests

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `agent/verification.py` | New verification interface with `Verifier` protocol, `VerificationResult` dataclass, and `LLMVerifier` implementation |
| `agent/base.py` | Added `_ralph_loop()` method that wraps `_react_loop` with verification logic |
| `agent/agent.py` | Updated `ReActAgent.run()` to use `_ralph_loop` instead of `_react_loop` directly |
| `config.py` | Added `RALPH_LOOP_MAX_ITERATIONS` config and `get_api_key()` helper method |
| `main.py` | Removed `TimerTool` import and added API key parameter to `LiteLLMAdapter` |
| `utils/terminal_ui.py` | Added `print_unfinished_answer()` function for displaying intermediate results |
| `tools/notify.py` | Simplified description by removing config requirement details |
| `tools/timer.py` | Deleted entire file (95 lines) |
| `test/test_ralph_loop.py` | New test file with 8 unit tests for Ralph Loop functionality |
| `test/test_timer_tool.py` | Deleted entire test file (100 lines) |
| `test/test_smart_edit_integration.py` | Updated import from `react_agent` to `agent` |
| `test/test_basic.py` | Updated import from `react_agent` to `agent` |
| `rfc/006-ralph-loop.md` | New RFC document describing Ralph Loop design |
| `rfc/003-asyncio-migration.md` | Updated reference from `react_agent.py` to `agent.py` |
| `docs/*.md` | Updated imports from `react_agent` to `agent` across documentation |

Copilot · 2026-01-30T17:25:45Z

agent/base.py

+            Final answer as a string.
+        """
+        if verifier is None:
+            verifier = LLMVerifier(self.llm, terminal_ui)


The variable terminal_ui is used without being imported or defined in this scope. It should likely be self._tui or imported from the utils module.

Copilot · 2026-01-30T17:25:46Z

agent/base.py

+                terminal_ui.console.print(
+                    f"\n[bold dark_orange]⚠ Verification skipped "
+                    f"(max iterations {max_iterations} reached), returning last result[/bold dark_orange]"
+                )


The variable terminal_ui is used without being imported or defined. This should be replaced with a proper reference to the terminal UI instance, likely self._tui if available, or imported from utils.

Copilot · 2026-01-30T17:25:46Z

agent/base.py

+                terminal_ui.console.print(
+                    f"\n[bold green]✓ Verification passed "
+                    f"(attempt {iteration}/{max_iterations}): {verification.reason}[/bold green]"
+                )


The variable terminal_ui is referenced without being defined in scope. Replace with appropriate terminal UI reference.

Copilot · 2026-01-30T17:25:46Z

agent/base.py

+                f"Please address the feedback and provide a complete answer."
+            )
+            # Print the incomplete result so the user can see what the agent produced
+            terminal_ui.print_unfinished_answer(result)


The variable terminal_ui is used without being defined in this method's scope. This needs to be properly imported or referenced.

Copilot · 2026-01-30T17:25:46Z

agent/base.py

+            terminal_ui.console.print(
+                f"\n[bold yellow]⟳ Verification feedback (attempt {iteration}/{max_iterations}): "
+                f"{verification.reason}[/bold yellow]"
+            )


The variable terminal_ui is referenced but not defined in the method scope. Ensure proper import or instance reference.

Copilot · 2026-01-30T17:25:46Z

agent/base.py

+                messages.append(LLMMessage(role="user", content=feedback))
+
+        # Should not reach here, but return last result as safety fallback
+        return result  # type: ignore[possibly-undefined]


The fallback return statement uses result which may be undefined if max_iterations is 0 or negative. Consider adding validation for max_iterations > 0 at the start of the method, or initializing result to an empty string before the loop.
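Both remedies suggested in this comment can be combined, as in the minimal sketch below. It is not the code from `agent/base.py`: `run_with_cap` and its `attempt` callback are hypothetical, and the verification step is omitted to keep the focus on the guard.

```python
from typing import Callable


def run_with_cap(attempt: Callable[[int], str], max_iterations: int) -> str:
    """Call `attempt` up to `max_iterations` times and return the last result."""
    if max_iterations < 1:
        # Fail fast instead of falling through to an unbound `result`.
        raise ValueError(f"max_iterations must be >= 1, got {max_iterations}")
    result = ""  # bound before the loop, so the fallback return is always defined
    for iteration in range(1, max_iterations + 1):
        result = attempt(iteration)
    return result


print(run_with_cap(lambda i: f"attempt {i}", 3))
```

With the validation in place, the `# type: ignore[possibly-undefined]` on the fallback return would no longer be needed in a function shaped like this one.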

Add an opt-in outer verification loop that checks whether the agent's final answer truly satisfies the original task. When enabled, a verifier (separate LLM call) judges completion after each inner ReAct loop run. If incomplete, feedback is injected and the inner loop re-enters. Disabled by default (RALPH_LOOP_ENABLED=false). Safety-capped at 3 outer iterations by default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The verifier now distinguishes between one-time tasks and recurring tasks (e.g. "every N minutes do X"). For recurring tasks, an agent that stopped after a single execution is judged INCOMPLETE with feedback to continue the timer cycle. This fixes interval/cron timer tasks being incorrectly marked as complete after one iteration. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Each inner loop result is now printed via print_final_answer() before verification, so the user can see the agent's output even when the verifier judges it incomplete and re-enters the loop. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Use a distinct yellow "Unfinished Answer" panel (rounded border) instead of the green "Final Answer" panel, so the user can clearly distinguish intermediate incomplete results from the final output. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Li0k requested a review from Copilot January 30, 2026 17:24

Copilot AI reviewed Jan 30, 2026

luohaha and others added 9 commits January 31, 2026 11:19

refactor: rename react_agent.py to agent.py

ac95cd8

The module now contains both the ReAct loop and the Ralph verification loop, so the broader name is more accurate. Updated all imports across source, tests, examples, and docs. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: remove config requirement from NotifyTool description

ad69b39

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

refactor: remove TimerTool and related verification logic

51edb10

TimerTool (delay/interval/cron) was unstable. Remove the tool, its tests, and the recurring-task judgment rule from the verification prompt. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

refactor: remove RALPH_LOOP_ENABLED, ralph loop is always on

9f8ff8c

The outer verification loop is now the default behavior with no toggle. Only RALPH_LOOP_MAX_ITERATIONS remains as a configurable option. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: sort imports in verification.py for isort

bd1e2d4

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

luohaha force-pushed the ralph-loop branch from 0edaf10 to bd1e2d4 Compare January 31, 2026 03:22

luohaha merged commit dc66379 into main Jan 31, 2026
4 checks passed
