pi-supervisor

A pi extension that supervises the coding agent and steers it toward a defined outcome. It observes every conversation turn, injects guiding messages when the agent drifts, and signals when the goal is reached — like a tech lead watching over a dev's shoulder.

A supervisor as the intelligent overseer keeping the agent on track.

Status: Early release.

pi-supervise.mp4

How It Works

/supervise Implement a secure JWT auth system with refresh tokens and full test coverage

Then start the conversation normally — the supervisor watches from outside without modifying the agent's context.

After each run — a separate supervisor LLM analyzes the conversation against the goal (all sensitivities)
Mid-run, between tool calls — also checks for drift on medium and high sensitivity and can steer the agent without waiting for it to finish
On completion — supervisor signals done and stops automatically

The supervisor is a pure outside observer. It runs in a separate in-memory pi session sharing only the API credentials and never touches the main agent's context window or system prompt.

Install

pi install npm:pi-supervisor

Or load directly for development:

pi -e ~/projects/pi-supervisor/src/index.ts

Commands

Command	Description
`/supervise <outcome>`	Start supervising toward a desired outcome
`/supervise stop`	Stop active supervision
`/supervise status`	Show current outcome, model, sensitivity, and intervention history
`/supervise widget`	Toggle the status widget on/off
`/supervise model`	Open the interactive model picker
`/supervise model <provider/modelId>`	Set supervisor model directly
`/supervise sensitivity <low\|medium\|high>`	Adjust steering aggressiveness

Examples

/supervise Refactor the auth module to use dependency injection and add 90% test coverage

/supervise model
# Opens pi's model selector — pick any model with a configured API key

/supervise sensitivity low
# Only steer when seriously off track

/supervise stop

The agent can also initiate supervision itself by calling the start_supervision tool — useful when it recognises a task needs goal tracking. Once active, supervision is locked: only the user can change or stop it.

UI

Footer (always visible while supervising):

🎯

Widget (one line, updated live):

◉ Supervising · Goal: "Refactor auth module…" · claude-haiku · ↗ 2 · ⟳ turn 4
  The agent has added the DI container but hasn't updated the existing call sites yet…

The second line shows the supervisor's reasoning as it streams. Toggle the widget with /supervise widget.

Sensitivity Levels

Level	When it checks	Confidence threshold	Steering style
`low`	End of each run only	—	Only if seriously off track
`medium` (default)	End of run + every 3rd tool cycle mid-run	≥ 0.90	On clear drift
`high`	End of run + every tool cycle mid-run	≥ 0.85	Proactively

End-of-run (agent_end): fires once per user prompt after the agent finishes and goes idle. The supervisor must decide done, steer, or continue.

Mid-run (turn_end): fires after each LLM tool-call cycle while the agent is still working. Steering is injected immediately (interrupting the current run) only when confidence exceeds the threshold. The agent has at least 2 sub-turns to settle before mid-run checks begin.

Supervisor Model

The supervisor runs on a separate model — it can be a cheaper/faster model than the one doing the actual work.

Resolution order:

Previous session state (persists within a session)
.pi/supervisor-config.json in the project root (saved by /supervise model)
Active chat model (ctx.model) — so it works out of the box with no configuration
Built-in default: anthropic/claude-haiku-4-5-20251001

Change at any time with /supervise model (interactive picker) or /supervise model <provider/id> (direct). The selection is saved to .pi/supervisor-config.json if the .pi/ directory exists.

Focus and Goal Discipline

The supervisor is a pure outside observer — it does not modify the agent's system prompt. Goal discipline is enforced entirely through steering messages when the agent drifts. If the agent asks an out-of-scope clarifying question, the supervisor redirects it back to the goal rather than answering.

Stagnation Detection

If the supervisor sends 5 consecutive steering messages without declaring the goal done, it switches to a lenient evaluation mode: if the goal is ≥80% achieved, it declares done rather than looping forever on minor improvements. The threshold is configurable via MAX_IDLE_STEERS in src/index.ts.

Customizing the Supervisor: SUPERVISOR.md

The supervisor's reasoning is controlled by its system prompt — not the goal. The goal is always set at runtime via /supervise <outcome>. SUPERVISOR.md defines how the supervisor thinks: its rules, persona, and project-specific constraints.

Discovery order (mirrors pi's SYSTEM.md convention):

Priority	Location	Use for
1	`.pi/SUPERVISOR.md`	Project-specific rules
2	`~/.pi/agent/SUPERVISOR.md`	Global personal rules
3	Built-in template	Fallback

The active source is shown when you run /supervise <outcome> or /supervise status.

Built-in system prompt

The default prompt the supervisor uses when no SUPERVISOR.md is found:

You are a supervisor monitoring a coding AI assistant conversation.
Your job: ensure the assistant fully achieves a specific outcome without needing the human to intervene.

═══ WHEN THE AGENT IS IDLE (finished its turn, waiting for user input) ═══
This is your most important moment. The agent has stopped and is waiting.
You MUST choose "done" or "steer". Never return "continue" when the agent is idle.

- "done"  → only when the outcome is completely and verifiably achieved.
- "steer" → everything else: incomplete work, partial progress, open questions, waiting for confirmation.

If the agent asked a clarifying question or needs a decision:
  FIRST check: is this question necessary to achieve the goal?
  - YES (directly blocks goal progress): answer with a sensible default and tell agent to proceed.
  - NO (out of scope, nice-to-have, unrelated feature): do NOT answer it. Redirect:
    "That's outside the scope of the goal. Focus on: [restate the specific missing piece]."
  DO NOT answer: passwords, credentials, secrets, anything requiring real user knowledge.

Your steer message speaks AS the user. Make it clear, direct, and actionable (1–3 sentences).
Do not ask the agent to verify its own work — tell it what to do next.

═══ WHEN THE AGENT IS ACTIVELY WORKING (mid-turn) ═══
Only intervene if it is clearly heading in the wrong direction.
Trust the agent to complete what it has started. Avoid interrupting productive work.

═══ STEERING RULES ═══
- Be specific: reference the outcome, missing pieces, or the question being answered.
- Never repeat a steering message that had no effect — escalate or change approach.
- A good steer answers the agent's question OR redirects to the missing piece of the outcome.
- If the agent is taking shortcuts to satisfy the goal without properly achieving it, always steer and remind it not to take shortcuts.

"done" CRITERIA: The core outcome is complete and functional. Minor polish, style tweaks, or
optional improvements do NOT block "done". Prefer stopping when the goal is substantially
achieved rather than looping forever chasing perfection.

Respond ONLY with valid JSON — no prose, no markdown fences.
Response schema (strict JSON):
{
  "action": "continue" | "steer" | "done",
  "message": "...",     // Required when action === "steer"
  "reasoning": "...",   // Brief internal reasoning
  "confidence": 0.85    // Float 0-1
}

Writing a custom SUPERVISOR.md

You must preserve the JSON response schema. Everything else is up to you.

You are a supervisor for a TypeScript project. Your priorities: type safety and test coverage.

Rules:
- Steer if the agent uses `any` types or skips tests for new code
- When steering, be direct: one sentence max, reference the specific file/function if possible
- "done" only when the new code has types and tests — not before
- Do not steer about code style, naming, or documentation

Response schema (strict JSON, required):
{
  "action": "continue" | "steer" | "done",
  "message": "...",
  "reasoning": "...",
  "confidence": 0.85
}

Session Persistence

Supervision state (outcome, model, sensitivity, intervention history) is stored in the pi session file and restored automatically on restart, session switch, fork, and tree navigation.

Project Structure

src/
  index.ts              # Extension entry point, event wiring, /supervise command, start_supervision tool
  types.ts              # SupervisorState, SteeringDecision, ConversationMessage
  state.ts              # SupervisorStateManager — in-memory state + session persistence
  engine.ts             # Snapshot building, SUPERVISOR.md loading, prompt construction, analyze()
  model-client.ts       # One-shot supervisor LLM calls via pi's AgentSession API
  workspace-config.ts   # .pi/supervisor-config.json read/write for model persistence
  ui/
    status-widget.ts    # 🎯 footer badge + one-line widget with live thinking stream
    model-picker.ts     # Interactive model picker using pi's ModelSelectorComponent

License

MIT — tintinweb

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
media		media
src		src
.gitignore		.gitignore
.npmignore		.npmignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pi-supervisor

How It Works

Install

Commands

Examples

UI

Sensitivity Levels

Supervisor Model

Focus and Goal Discipline

Stagnation Detection

Customizing the Supervisor: SUPERVISOR.md

Built-in system prompt

Writing a custom SUPERVISOR.md

Session Persistence

Project Structure

License

About

Uh oh!

Releases 2

Languages

License

tintinweb/pi-supervisor

Folders and files

Latest commit

History

Repository files navigation

pi-supervisor

How It Works

Install

Commands

Examples

UI

Sensitivity Levels

Supervisor Model

Focus and Goal Discipline

Stagnation Detection

Customizing the Supervisor: SUPERVISOR.md

Built-in system prompt

Writing a custom SUPERVISOR.md

Session Persistence

Project Structure

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Languages