A Go implementation of Recursive Language Models (RLM) - an inference-time scaling strategy that enables LLMs to handle arbitrarily long contexts by treating prompts as external objects that can be programmatically examined and recursively processed.
RLM-go provides a Go REPL environment where LLM-generated code can:
- Access context stored as a variable
- Make recursive sub-LLM calls via Query() and QueryBatched()
- Use standard Go operations for text processing
- Signal completion with FINAL() or FINAL_VAR()
Unlike the Python RLM, which uses socket IPC, rlm-go uses direct function injection via Yaegi, a Go interpreter. This eliminates:
- Socket server overhead
- Serialization/deserialization
- Process boundaries
The result is ~100x less latency per sub-LLM call compared to socket IPC.
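To make "direct function injection" concrete, here is a minimal sketch of the idea using Yaegi's public API (interp.New, Use with interp.Exports, Eval). The rlmenv import path, the stub Query closure, and the wiring are illustrative assumptions; rlm-go's repl package does this internally and its details may differ.

package main

import (
    "fmt"
    "reflect"

    "github.com/traefik/yaegi/interp"
    "github.com/traefik/yaegi/stdlib"
)

func main() {
    i := interp.New(interp.Options{})
    _ = i.Use(stdlib.Symbols) // gives interpreted code fmt, strings, regexp, ...

    // Inject a host Go closure; in rlm-go this would wrap a real sub-LLM call.
    query := func(prompt string) string { return "stub answer for: " + prompt }
    _ = i.Use(interp.Exports{
        "rlmenv/rlmenv": { // hypothetical import path for the injected symbols
            "Query": reflect.ValueOf(query),
        },
    })

    // LLM-generated code calls the injected function directly; no sockets, no IPC.
    if _, err := i.Eval(`import "rlmenv"`); err != nil {
        panic(err)
    }
    out, err := i.Eval(`rlmenv.Query("summarize section 1")`)
    if err != nil {
        panic(err)
    }
    fmt.Println(out) // reflect.Value wrapping the returned string
}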
- Go 1.23 or later (for building from source)
- An LLM API key:
  - ANTHROPIC_API_KEY for Claude models (default)
  - GEMINI_API_KEY for Gemini models
  - OPENAI_API_KEY for OpenAI models
- (Optional) Podman or Docker for isolated sandbox execution
| Provider | Models | Env Variable |
|---|---|---|
| Anthropic | claude-sonnet-4-20250514, claude-opus-4-20250514, etc. | ANTHROPIC_API_KEY |
| Google | gemini-3-flash-preview, gemini-3-pro-preview | GEMINI_API_KEY |
| OpenAI | gpt-5, gpt-5-mini | OPENAI_API_KEY |
The provider is auto-detected based on model name. Anthropic is the default.
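Detection is by model-name prefix; the sketch below shows the general idea (the function name, provider identifiers, and rules here are assumptions for illustration, not rlm-go's actual code).

package provider

import "strings"

// detectProvider is an illustrative sketch of prefix-based provider detection.
func detectProvider(model string) string {
    switch {
    case strings.HasPrefix(model, "gemini"):
        return "gemini"
    case strings.HasPrefix(model, "gpt"):
        return "openai"
    default:
        return "anthropic" // claude-* models and the documented default
    }
}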
# Download and install the latest release
curl -fsSL https://raw.githubusercontent.com/XiaoConstantine/rlm-go/main/install.sh | bash

This installs the rlm binary to ~/.local/bin/rlm.

# Install with go install
go install github.com/XiaoConstantine/rlm-go/cmd/rlm@latest

# Or build from source
git clone https://github.com/XiaoConstantine/rlm-go.git
cd rlm-go
go build -o rlm ./cmd/rlm

# Or add as a library dependency
go get github.com/XiaoConstantine/rlm-go

RLM includes a skill for Claude Code that provides documentation and usage guidance for large context processing.

rlm install-claude-code

This creates a skill at ~/.claude/skills/rlm/SKILL.md that teaches Claude Code:
- When to use RLM (contexts >50KB, token efficiency needed)
- Command usage and options
- The Query() and FINAL() patterns
- Token efficiency benefits (40% savings on large contexts)
After installation, restart Claude Code to activate the skill.
# Basic usage with Anthropic (default)
rlm -context file.txt -query "Summarize the key points"
# Use Gemini
rlm -model gemini-3-flash-preview -context file.txt -query "Analyze this data"
# Use OpenAI
rlm -model gpt-5-mini -context file.txt -query "Summarize this"
# Verbose output with iteration details
rlm -context logs.json -query "Find all errors" -verbose
# JSON output for programmatic use
rlm -context data.csv -query "Extract anomalies" -json
# Pipe context from stdin
cat largefile.txt | rlm -query "What patterns do you see?"

| Flag | Description | Default |
|---|---|---|
| -context | Path to context file | - |
| -context-string | Context string directly | - |
| -query | Query to run against context | Required |
| -model | LLM model to use | claude-sonnet-4-20250514 |
| -max-iterations | Maximum iterations | 30 |
| -verbose | Enable verbose output | false |
| -json | Output result as JSON | false |
| -log-dir | Directory for JSONL logs | - |
package main
import (
"context"
"fmt"
"log"
"os"
"github.com/XiaoConstantine/rlm-go/pkg/rlm"
)
func main() {
// Create your LLM client (implements rlm.LLMClient and repl.LLMClient)
client := NewAnthropicClient(os.Getenv("ANTHROPIC_API_KEY"), "claude-sonnet-4-20250514")
// Create RLM instance
r := rlm.New(client, client,
rlm.WithMaxIterations(10),
rlm.WithVerbose(true),
)
// Run completion with long context
result, err := r.Complete(context.Background(), longDocument, "What are the key findings?")
if err != nil {
log.Fatal(err)
}
fmt.Printf("Answer: %s\n", result.Response)
fmt.Printf("Iterations: %d\n", result.Iterations)
fmt.Printf("Total Tokens: %d\n", result.TotalUsage.TotalTokens)
}

┌──────────────────────────────────────────────┐
│ Single Go Process │
│ │
│ ┌──────────────┐ ┌──────────────────┐ │
│ │ Yaegi │───►│ LLM Client │ │
│ │ Interpreter │ │ (your impl) │ │
│ │ │ └──────────────────┘ │
│ │ - context │ │
│ │ - Query() │◄── direct function call │
│ │ - fmt.* │ │
│ └──────────────┘ │
│ ▲ │
│ │ Eval(code) │
│ ┌──────┴──────┐ │
│ │ RLM Loop │ │
│ └─────────────┘ │
└──────────────────────────────────────────────┘
No sockets. No IPC. No subprocess.
You need to implement two interfaces:
// For root LLM orchestration
type LLMClient interface {
Complete(ctx context.Context, messages []core.Message) (core.LLMResponse, error)
}
// For sub-LLM calls from REPL
type REPLClient interface {
Query(ctx context.Context, prompt string) (repl.QueryResponse, error)
QueryBatched(ctx context.Context, prompts []string) ([]repl.QueryResponse, error)
}

See examples/basic for a complete Anthropic client implementation.
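As a starting point, here is a hedged skeleton of a client satisfying both interfaces. The MyClient type, its fields, the import paths (inferred from the project layout), and the zero-value returns (which assume core.LLMResponse and repl.QueryResponse are struct types) are assumptions for illustration; only the method signatures come from the interfaces above.

package client

import (
    "context"

    "github.com/XiaoConstantine/rlm-go/pkg/core"
    "github.com/XiaoConstantine/rlm-go/pkg/repl"
)

// MyClient is a hypothetical client satisfying both LLMClient and REPLClient.
type MyClient struct {
    apiKey string
    model  string
}

// Complete serves root-LLM orchestration calls.
func (c *MyClient) Complete(ctx context.Context, messages []core.Message) (core.LLMResponse, error) {
    // Call your provider's chat API here and map the reply into core.LLMResponse.
    return core.LLMResponse{}, nil
}

// Query serves a single sub-LLM call issued from the REPL.
func (c *MyClient) Query(ctx context.Context, prompt string) (repl.QueryResponse, error) {
    // Typically a thin wrapper around the same HTTP client used by Complete.
    return repl.QueryResponse{}, nil
}

// QueryBatched serves multiple sub-LLM calls; a real implementation would fan out concurrently.
func (c *MyClient) QueryBatched(ctx context.Context, prompts []string) ([]repl.QueryResponse, error) {
    out := make([]repl.QueryResponse, len(prompts))
    for i, p := range prompts {
        resp, err := c.Query(ctx, p)
        if err != nil {
            return nil, err
        }
        out[i] = resp
    }
    return out, nil
}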
- Context Loading: Your context is injected into a Yaegi interpreter as a context variable
- Iteration Loop: The root LLM generates Go code in ```go blocks (see the sketch below)
- Code Execution: Yaegi executes the code with access to Query(), QueryBatched(), fmt, strings, regexp
- Sub-LLM Calls: Query() calls your LLM client directly (no IPC)
- Completion: The LLM signals completion with FINAL("answer") or FINAL_VAR(varName)
// Pre-imported packages
fmt, strings, regexp
// RLM functions
Query(prompt string) string // Single sub-LLM call
QueryBatched(prompts []string) []string // Concurrent sub-LLM calls
// Multi-depth recursion (when enabled)
QueryWithRLM(prompt string, depth int) string // Spawn nested RLM
CurrentDepth() int // Get current recursion depth
MaxDepth() int // Get max allowed depth
CanRecurse() bool // Check if more recursion allowed
// Your context
context // string variable with your data

Enable sub-LLMs to spawn their own sub-LLMs for complex decomposition tasks:
r := rlm.New(client, replClient,
rlm.WithMaxRecursionDepth(3), // Allow 3 levels of nesting
rlm.WithRecursionCallback(func(depth int, prompt string) {
log.Printf("Recursive call at depth %d", depth)
}),
)

In the REPL, use QueryWithRLM() to spawn a nested RLM that can itself use Query():
// Depth 0 (root)
result := QueryWithRLM("Analyze each section in detail", 1)
// The sub-RLM (depth 1) can use Query() or QueryWithRLM() up to MaxDepth

Additional configuration options:

rlm.New(client, replClient,
rlm.WithMaxIterations(30), // Default: 30
rlm.WithSystemPrompt(custom), // Override system prompt
rlm.WithVerbose(true), // Enable console logging
rlm.WithLogger(logger), // Attach JSONL logger for session recording
)

By default, rlm-go executes LLM-generated code in-process using Yaegi for maximum performance. For production environments or when running untrusted code, you can enable isolated sandbox execution using Podman (recommended) or Docker.
| Mode | Isolation | Latency | Security |
|---|---|---|---|
| Local (default) | None | ~0ms | Trusted code only |
| Podman/Docker | Full container | 50-200ms | Untrusted code safe |
Podman (Recommended - Open Source, Daemonless)
# macOS
brew install podman
podman machine init
podman machine start
# Linux (Fedora/RHEL)
sudo dnf install podman
# Linux (Ubuntu/Debian)
sudo apt install podman

Docker (Alternative)
# macOS
brew install --cask docker
# Start Docker Desktop
# Linux
sudo apt install docker.io
sudo systemctl start docker

import (
"github.com/XiaoConstantine/rlm-go/pkg/rlm"
"github.com/XiaoConstantine/rlm-go/pkg/sandbox"
)
// Auto-detect best available backend (podman > docker > local)
r := rlm.New(client, replClient, rlm.WithSandbox())
// Use specific backend
r := rlm.New(client, replClient,
rlm.WithSandboxBackend(sandbox.BackendPodman))
// Custom configuration
cfg := sandbox.Config{
Backend: sandbox.BackendPodman,
Image: "golang:1.23-alpine", // Container image
Memory: "512m", // Memory limit
CPUs: 1.0, // CPU limit
Timeout: 60 * time.Second, // Execution timeout
NetworkMode: sandbox.NetworkNone, // Disable network (secure)
}
r := rlm.New(client, replClient, rlm.WithSandboxConfig(cfg))

When sandbox mode is enabled:
- Code runs in an isolated container with resource limits
- Network is disabled by default (--network=none)
- Containers are auto-removed after execution (--rm)
- Query() and QueryBatched() work via a JSON IPC protocol
- First execution pulls the Go image (~250MB, cached afterward)
# Check if Podman/Docker is available
podman --version # or: docker --version
# Test container execution
podman run --rm golang:1.23-alpine go version
# Run sandbox tests
go test ./pkg/sandbox/... -v -run TestContainer

RLM-go provides complete token usage accounting across all LLM calls:
result, _ := r.Complete(ctx, context, query)
// Aggregated token usage
fmt.Printf("Prompt tokens: %d\n", result.TotalUsage.PromptTokens)
fmt.Printf("Completion tokens: %d\n", result.TotalUsage.CompletionTokens)
fmt.Printf("Total tokens: %d\n", result.TotalUsage.TotalTokens)Token counts include both root LLM calls and all sub-LLM calls made via Query() and QueryBatched().
Record sessions for analysis and visualization:
import "github.com/XiaoConstantine/rlm-go/pkg/logger"
// Create logger
log, _ := logger.New("./logs", "session-001")
defer log.Close()
// Attach to RLM
r := rlm.New(client, replClient, rlm.WithLogger(log))

Log format includes:
- Session metadata (model, max iterations, context info)
- Per-iteration details (prompts, responses, executed code)
- Sub-LLM call records with token counts
- Compatible with the Python RLM visualizer
Compare RLM accuracy against baseline direct LLM calls:
go run ./cmd/benchmark/main.go \
-tasks tasks.json \
-model claude-sonnet-4-20250514 \
-num-tasks 10 \
-log-dir ./logs \
-output results.json \
-verbose

Features:
- Load tasks from JSON or generate samples
- Track accuracy, execution time, token usage
- Flexible answer matching (exact, word-boundary, numeric)
Interactive CLI viewer for JSONL session logs:
go run ./cmd/rlm-viewer/main.go ./logs/session.jsonl
# Watch mode for real-time viewing
go run ./cmd/rlm-viewer/main.go -watch ./logs/session.jsonl
# Filter by iteration
go run ./cmd/rlm-viewer/main.go -iter 3 ./logs/session.jsonl
# Interactive navigation mode
go run ./cmd/rlm-viewer/main.go -interactive ./logs/session.jsonl

Features:
- Color-coded output (system, user, assistant messages)
- Code block display with execution results
- Token usage tracking per LLM call
- Interactive navigation
rlm-go/
├── pkg/
│ ├── core/ # Core types (Message, CompletionResult, UsageStats)
│ ├── rlm/ # Main RLM orchestration engine
│ ├── repl/ # Yaegi-based Go interpreter
│ ├── sandbox/ # Isolated execution (Podman/Docker)
│ ├── parsing/ # LLM response parsing utilities
│ └── logger/ # JSONL session logging
├── cmd/
│ ├── benchmark/ # RLM vs baseline comparison tool
│ └── rlm-viewer/ # JSONL log viewer
└── examples/
└── basic/ # Complete Anthropic client example
# Run all tests
go test -v ./...
# Run with race detection and coverage
go test -race -v ./... -coverprofile coverage.txt

- Recursive Language Models Paper (Zhang, Kraska, Khattab - MIT CSAIL)
- Python RLM Implementation
- Yaegi Go Interpreter
MIT License - see LICENSE