feat: Implement Convoy Runtime for Multi-Agent Orchestration #97

@eltmon

Description

Summary

Panopticon has extensive infrastructure for multi-agent orchestration (convoys) that is designed but not implemented. The review-agent, in particular, is supposed to spawn specialized sub-agents for security, performance, and correctness reviews, but it cannot do so because:

  1. The subagent types referenced (code-review-performance, code-review-security, etc.) are not valid Claude Code Task tool subagent_types
  2. The convoy runtime that would spawn and coordinate these agents does not exist
  3. Commands referenced in skills (pan convoy start/status/synthesize) don't exist

This issue tracks building the convoy runtime to make multi-agent orchestration a reality.


Current State Analysis

What EXISTS (But Isn't Wired Up)

| Asset | Location | Status |
|---|---|---|
| Agent Templates | /agents/*.md | ✅ Defined, ❌ Not used |
| code-review-correctness.md | /agents/ | Haiku model, correctness focus |
| code-review-security.md | /agents/ | Sonnet model, OWASP Top 10 |
| code-review-performance.md | /agents/ | Haiku model, N+1, execSync |
| code-review-synthesis.md | /agents/ | Sonnet model, combines findings |
| planning-agent.md | /agents/ | Codebase exploration |
| triage-agent.md | /agents/ | Issue categorization |
| health-monitor.md | /agents/ | Agent health checks |
| codebase-explorer.md | /agents/ | Code navigation |
| Convoy Templates | src/lib/convoy-templates.ts | ✅ Defined, ❌ Never imported |
| CODE_REVIEW_TEMPLATE | convoy-templates.ts | 3 parallel reviewers + synthesis |
| PLANNING_TEMPLATE | convoy-templates.ts | Single planning agent |
| TRIAGE_TEMPLATE | convoy-templates.ts | Parallel issue triage |
| HEALTH_MONITOR_TEMPLATE | convoy-templates.ts | Health monitoring |
| Skills referencing convoys | ~/.panopticon/skills/ | ❌ Reference non-existent commands |
| pan-convoy-synthesis/ | skills | References pan convoy status/synthesize |
| pan-code-review/ | skills | Describes parallel review workflow |

What DOESN'T Exist

| Component | Expected | Actual |
|---|---|---|
| pan convoy CLI command | pan convoy start/status/stop/synthesize | Does not exist |
| Convoy runtime | Spawn agents, track state, coordinate | No implementation |
| Custom subagent_types | code-review-performance, etc. | Not valid - Task tool only accepts: Bash, general-purpose, Explore, Plan |

The Broken Flow

review-agent.md prompt says:

Task(subagent_type='code-review-performance', prompt='Review for performance issues: [files]')
Task(subagent_type='code-review-security', prompt='Review for security issues: [files]')
Task(subagent_type='beads-completion-check', prompt='Check if all beads are closed')

What actually happens: Nothing. These are not valid subagent_types.


Requirements

1. Convoy Runtime (src/lib/convoy.ts)

Build the core convoy orchestration engine:

interface ConvoyState {
  id: string;                    // Unique convoy ID (e.g., "convoy-review-1706123456")
  template: string;              // Template name (e.g., "code-review")
  status: 'running' | 'completed' | 'failed' | 'partial';
  agents: ConvoyAgentState[];    // State of each agent in convoy
  startedAt: string;             // ISO timestamp
  completedAt?: string;
  outputDir: string;             // Where agents write output
  context: Record<string, any>;  // Shared context (files to review, PR URL, etc.)
}

interface ConvoyAgentState {
  role: string;                  // From template (e.g., "security")
  subagent: string;              // Template name (e.g., "code-review-security")
  tmuxSession: string;           // Tmux session name
  status: 'pending' | 'running' | 'completed' | 'failed';
  startedAt?: string;
  completedAt?: string;
  outputFile?: string;           // Where this agent wrote its output
  exitCode?: number;
}
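
The issue does not say how the convoy-level status follows from the per-agent statuses. A minimal sketch of one possible rule is shown here. The helper name `deriveConvoyStatus` is hypothetical, and the reading of 'partial' as "some agents succeeded and some failed" is an assumption, not something the templates define.

```typescript
type AgentStatus = 'pending' | 'running' | 'completed' | 'failed';
type ConvoyStatus = 'running' | 'completed' | 'failed' | 'partial';

// Hypothetical helper: collapse per-agent statuses into one convoy status.
function deriveConvoyStatus(agents: { status: AgentStatus }[]): ConvoyStatus {
  // Any agent that has not finished keeps the whole convoy in 'running'.
  const active = agents.some(a => a.status === 'pending' || a.status === 'running');
  if (active) return 'running';

  const failed = agents.filter(a => a.status === 'failed').length;
  if (failed === 0) return 'completed';
  // Assumption: 'partial' means a mix of completed and failed agents.
  return failed === agents.length ? 'failed' : 'partial';
}
```

Under this rule a convoy only leaves 'running' once every agent has finished, which matches how waitForConvoy() is described below.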

Functions to implement:

// Start a new convoy
async function startConvoy(
  templateName: string,
  context: { projectPath: string; files?: string[]; prUrl?: string; issueId?: string }
): Promise<ConvoyState>

// Get convoy status
function getConvoyStatus(convoyId: string): ConvoyState | undefined

// List active convoys
function listConvoys(filter?: { status?: string }): ConvoyState[]

// Stop a convoy (kills all agent sessions)
async function stopConvoy(convoyId: string): Promise<void>

// Wait for convoy completion
async function waitForConvoy(convoyId: string, timeoutMs?: number): Promise<ConvoyState>
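
One way waitForConvoy() could honour its timeout is a simple polling loop. The sketch below is an illustration only: `waitForConvoySketch`, `ConvoySnapshot`, the injected `getStatus` callback (standing in for getConvoyStatus()), and the `pollMs` parameter are all hypothetical names, and the 20-minute default simply mirrors the value used in the review pipeline example.

```typescript
type ConvoyStatus = 'running' | 'completed' | 'failed' | 'partial';
interface ConvoySnapshot { id: string; status: ConvoyStatus }

// Hypothetical polling loop; `getStatus` stands in for getConvoyStatus().
async function waitForConvoySketch(
  convoyId: string,
  getStatus: (id: string) => ConvoySnapshot | undefined,
  timeoutMs = 20 * 60 * 1000,
  pollMs = 1000
): Promise<ConvoySnapshot> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const state = getStatus(convoyId);
    if (!state) throw new Error(`Unknown convoy: ${convoyId}`);
    // Any status other than 'running' is treated as terminal.
    if (state.status !== 'running') return state;
    if (Date.now() >= deadline) {
      throw new Error(`Convoy ${convoyId} timed out after ${timeoutMs} ms`);
    }
    // Sleep before checking the state again.
    await new Promise<void>(resolve => setTimeout(resolve, pollMs));
  }
}
```

Rejecting on timeout, rather than resolving with a 'running' state, is a design choice of this sketch; the issue leaves that behaviour open.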

2. CLI Commands (src/cli/commands/convoy/)

# Start a convoy from template
pan convoy start code-review --files "src/**/*.ts" --pr-url "https://..."
pan convoy start planning --issue MIN-123

# Check convoy status
pan convoy status <convoy-id>
pan convoy status --all

# List convoys
pan convoy list
pan convoy list --status running

# Stop a convoy
pan convoy stop <convoy-id>

# Run synthesis after convoy completes
pan convoy synthesize <convoy-id>
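
As a rough illustration of how the arguments after `pan convoy start` could be turned into a startConvoy() call, here is a small hand-rolled parser. It is a sketch under assumptions: the function and interface names are hypothetical, it only knows the three flags shown above, and the real CLI may well use an argument-parsing library instead.

```typescript
interface ConvoyStartArgs {
  template: string;
  files?: string;
  prUrl?: string;
  issue?: string;
}

// Hypothetical parser for the argv that follows `pan convoy start`.
function parseConvoyStartArgs(argv: string[]): ConvoyStartArgs {
  const [template, ...rest] = argv;
  if (!template || template.startsWith('--')) {
    throw new Error('Usage: pan convoy start <template> [--files <glob>] [--pr-url <url>] [--issue <id>]');
  }
  const parsed: ConvoyStartArgs = { template };
  // Flags are expected as `--name value` pairs.
  for (let i = 0; i < rest.length; i += 2) {
    const flag = rest[i];
    const value = rest[i + 1];
    if (value === undefined) throw new Error(`Missing value for ${flag}`);
    switch (flag) {
      case '--files': parsed.files = value; break;
      case '--pr-url': parsed.prUrl = value; break;
      case '--issue': parsed.issue = value; break;
      default: throw new Error(`Unknown option: ${flag}`);
    }
  }
  return parsed;
}
```

The glob in `--files` is kept as a plain string here; expanding it into a file list would be a separate step before building the convoy context.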

3. Agent Spawning

Each agent in a convoy needs to be spawned as a Claude Code process in tmux:

import { readFileSync } from 'node:fs';
import path from 'node:path';

async function spawnConvoyAgent(
  convoy: ConvoyState,
  agent: ConvoyAgent,
  context: Record<string, any>
): Promise<void> {
  // convoy.id already starts with "convoy-", so the prefix is not repeated here
  const sessionName = `${convoy.id}-${agent.role}`;
  // AGENTS_DIR is the directory that holds the agent templates (/agents/)
  const templatePath = path.join(AGENTS_DIR, `${agent.subagent}.md`);
  const template = readFileSync(templatePath, 'utf-8');
  
  // Parse frontmatter for model, tools
  const { model, tools, content } = parseAgentTemplate(template);
  
  // Build prompt with context
  const prompt = buildAgentPrompt(content, context);
  
  // Spawn Claude in tmux
  await spawnInTmux(sessionName, {
    command: 'claude',
    args: ['--model', model, '-p', prompt],
    env: {
      PANOPTICON_CONVOY_ID: convoy.id,
      PANOPTICON_AGENT_ROLE: agent.role,
    }
  });
}
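
The parseAgentTemplate() helper used above is not specified in this issue. A minimal sketch of what it might look like follows, assuming the agent templates start with a `---` delimited block of flat `key: value` lines and list tools as a comma-separated string. The name `parseAgentTemplateSketch`, the `ParsedAgentTemplate` interface, and the 'sonnet' fallback model are assumptions for illustration; this is not a full YAML parser.

```typescript
interface ParsedAgentTemplate {
  model: string;
  tools: string[];
  content: string;
}

// Hypothetical parser: handles only flat `key: value` frontmatter, not full YAML.
function parseAgentTemplateSketch(template: string, defaultModel = 'sonnet'): ParsedAgentTemplate {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(template);
  if (!match) {
    // No frontmatter: treat the whole file as the prompt body.
    return { model: defaultModel, tools: [], content: template };
  }
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? '').split(/\r?\n/)) {
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }
  const tools = (fields.get('tools') ?? '')
    .split(',')
    .map(t => t.trim())
    .filter(t => t.length > 0);
  return {
    model: fields.get('model') ?? defaultModel,
    tools,
    // Everything after the closing `---` line is the prompt body.
    content: template.slice(match[0].length),
  };
}
```

If the real templates use nested or list-style YAML for tools, a proper YAML library would be the safer choice.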

4. Parallel Execution with Dependencies

The convoy runtime must respect the dependsOn field in templates:

// CODE_REVIEW_TEMPLATE phases:
// Phase 1 (parallel): correctness, security, performance
// Phase 2 (after phase 1): synthesis

async function executeConvoy(convoy: ConvoyState): Promise<void> {
  const phases = getExecutionOrder(convoy.template); // Already implemented in convoy-templates.ts
  
  for (const phase of phases) {
    // Spawn all agents in this phase in parallel
    await Promise.all(
      phase.map(agent => spawnConvoyAgent(convoy, agent, convoy.context))
    );
    
    // Wait for all agents in phase to complete
    await waitForPhaseCompletion(convoy, phase);
  }
}
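
For readers unfamiliar with how phases fall out of dependsOn, the following sketch groups agents into layers where each layer depends only on earlier ones. It is not the existing getExecutionOrder() from convoy-templates.ts, whose signature is not shown in this issue; `computePhases` and `TemplateAgent` are hypothetical names, and it assumes dependsOn lists agent roles.

```typescript
interface TemplateAgent {
  role: string;
  dependsOn?: string[];
}

// Hypothetical phase computation; the real getExecutionOrder() may differ.
function computePhases(agents: TemplateAgent[]): TemplateAgent[][] {
  const done = new Set<string>();
  let remaining = [...agents];
  const phases: TemplateAgent[][] = [];

  while (remaining.length > 0) {
    // An agent is ready once every role it depends on is in an earlier phase.
    const ready = remaining.filter(a => (a.dependsOn ?? []).every(d => done.has(d)));
    if (ready.length === 0) {
      throw new Error('Cyclic or unresolved dependsOn in convoy template');
    }
    phases.push(ready);
    for (const a of ready) done.add(a.role);
    remaining = remaining.filter(a => !ready.includes(a));
  }
  return phases;
}
```

For a code-review style template, the three reviewers would land in the first phase and the synthesis agent in the second, which is the order executeConvoy() iterates over.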

5. Integration with Review Pipeline

Replace the current spawnReviewAgent() with convoy-based review:

Before (current broken implementation):

// review-agent.ts
const args = ['--model', 'sonnet', '--print', '-p', prompt];
// Single agent, can't spawn subagents

After:

// When review is triggered:
const convoy = await startConvoy('code-review', {
  projectPath: workspace.path,
  prUrl: workspace.prUrl,
  issueId: workspace.issueId,
  files: await getChangedFiles(workspace)
});

// Wait for convoy completion
const result = await waitForConvoy(convoy.id, 20 * 60 * 1000);

// Synthesis output is at .claude/reviews/<timestamp>-synthesis.md

6. Dashboard Integration

Add convoy visibility to the dashboard:

  • Convoy panel showing active convoys and their agents
  • Real-time status updates via WebSocket
  • Agent output streaming from each convoy agent
  • Convoy history for completed reviews

API endpoints:

GET  /api/convoys              - List all convoys
GET  /api/convoys/:id          - Get convoy details
POST /api/convoys/start        - Start a new convoy
POST /api/convoys/:id/stop     - Stop a convoy
GET  /api/convoys/:id/output   - Get combined output
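
To make the endpoint table concrete, here is a framework-independent sketch that maps a method and path onto one of the five actions. The names `ConvoyRoute` and `matchConvoyRoute` are hypothetical, and the dashboard server may register these routes through its own router instead.

```typescript
type ConvoyRoute =
  | { action: 'list' }
  | { action: 'get'; id: string }
  | { action: 'start' }
  | { action: 'stop'; id: string }
  | { action: 'output'; id: string };

// Hypothetical matcher for the endpoint table; returns undefined for unknown routes.
function matchConvoyRoute(method: string, path: string): ConvoyRoute | undefined {
  const parts = path.split('/').filter(p => p.length > 0);
  if (parts[0] !== 'api' || parts[1] !== 'convoys') return undefined;
  const first = parts[2];
  const second = parts[3];

  if (method === 'GET' && parts.length === 2) return { action: 'list' };
  if (method === 'POST' && parts.length === 3 && first === 'start') return { action: 'start' };
  if (method === 'GET' && parts.length === 3 && first) return { action: 'get', id: first };
  if (method === 'POST' && parts.length === 4 && first && second === 'stop') return { action: 'stop', id: first };
  if (method === 'GET' && parts.length === 4 && first && second === 'output') return { action: 'output', id: first };
  return undefined;
}
```

Note that `POST /api/convoys/start` and `GET /api/convoys/:id` share a path shape and are told apart only by HTTP method, so a convoy ID of "start" would be worth disallowing.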

Implementation Plan

Phase 1: Core Runtime

  1. Create src/lib/convoy.ts with state management
  2. Implement agent template parsing (frontmatter: model, tools)
  3. Implement spawnConvoyAgent() function
  4. Implement startConvoy() with parallel phase execution
  5. Implement waitForConvoy() with timeout handling
  6. Add convoy state persistence (.panopticon/convoys/)

Phase 2: CLI Commands

  1. Create src/cli/commands/convoy/index.ts with subcommands
  2. Implement pan convoy start <template>
  3. Implement pan convoy status [convoy-id]
  4. Implement pan convoy list
  5. Implement pan convoy stop <convoy-id>
  6. Implement pan convoy synthesize <convoy-id>

Phase 3: Review Pipeline Integration

  1. Refactor spawnReviewAgent() to use convoy system
  2. Update review prompt to not reference invalid subagent_types
  3. Connect synthesis output to review status updates
  4. Update dashboard review flow to use convoys

Phase 4: Dashboard Integration

  1. Add convoy API endpoints to server
  2. Create ConvoyPanel component
  3. Add real-time convoy status via WebSocket
  4. Show convoy agents in specialist view

Cleanup Required

Dead Code to Remove/Update

| Item | Action |
|---|---|
| convoy-templates.ts | KEEP - Will be used by convoy runtime |
| review-agent.md prompt | UPDATE - Remove references to invalid subagent_types |
| pan-convoy-synthesis/SKILL.md | UPDATE - Will work once CLI exists |
| pan-code-review/SKILL.md | UPDATE - Reference convoy system |

Skills to Update

  1. pan-convoy-synthesis/ - Currently references non-existent pan convoy commands. Keep but update once commands exist.

  2. pan-code-review/ - Describes using Task(subagent_type='code-review-*') which doesn't work. Update to use convoy system instead.

Prompts to Update

  1. src/lib/cloister/prompts/review-agent.md - Remove sections telling agent to spawn subagents via Task tool (lines 29-82). Instead, the orchestration happens in code.

Testing Strategy

Unit Tests

  • Convoy state management
  • Template parsing
  • Phase execution order calculation (already exists)
  • Agent spawn/completion detection

Integration Tests

  • Full convoy lifecycle (start → run → complete)
  • Parallel agent execution
  • Dependency waiting
  • Timeout handling
  • Failure recovery

E2E Tests

  • pan convoy start code-review with real files
  • Review pipeline with actual PR
  • Dashboard convoy visualization

Success Criteria

  1. pan convoy start code-review --files "src/**/*.ts" spawns 3 parallel reviewers, then the synthesis agent once they finish
  2. Review pipeline uses convoy system instead of single-agent print mode
  3. Dashboard shows active convoys with real-time agent status
  4. All convoy-related skills work with actual commands
  5. Agent templates in /agents/ are actively used
  6. No dead code or references to non-existent features

Related Issues

Files to Modify

New files:

  • src/lib/convoy.ts - Core convoy runtime
  • src/cli/commands/convoy/index.ts - CLI command group
  • src/cli/commands/convoy/start.ts
  • src/cli/commands/convoy/status.ts
  • src/cli/commands/convoy/list.ts
  • src/cli/commands/convoy/stop.ts
  • src/cli/commands/convoy/synthesize.ts
  • src/dashboard/frontend/src/components/ConvoyPanel.tsx

Files to modify:

  • src/lib/cloister/review-agent.ts - Use convoy system
  • src/lib/cloister/prompts/review-agent.md - Remove invalid subagent references
  • src/dashboard/server/index.ts - Add convoy API endpoints
  • ~/.panopticon/skills/pan-code-review/SKILL.md - Update for convoy
  • ~/.panopticon/skills/pan-convoy-synthesis/SKILL.md - Will work once commands exist

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), in-progress (Work is actively being done)
