reins

The open-source toolkit for Harness Engineering — OpenAI's methodology for building software where humans steer and agents execute.

OpenAI published the methodology. We built the tooling.

Why Reins exists

Reins came from real project pressure: as the harness improved, coding agents became more autonomous, consistent, and reliable. The problem was portability. Those gains were trapped in one repo.

Reins packages the same approach so you can apply it to any project: scaffold the harness, score readiness, diagnose gaps, and iteratively evolve toward stronger agent autonomy.

The model (for humans and agents)

| Layer | Role | Source of truth |
| --- | --- | --- |
| Skill | Control plane. Teaches the agent when to run Reins and how to interpret output. | `skill/Reins/SKILL.md` |
| CLI | Execution plane. Produces deterministic JSON for `init`, `audit`, `doctor`, `evolve`. | `cli/reins/src/lib/commands/` (+ routed via `cli/reins/src/index.ts`) |
| Human | Steering plane. Sets goals, accepts tradeoffs, and decides product/taste direction. | Prompts + repo decisions |

If this split is unclear, agents drift: they either skip Reins or use it incorrectly. Reins is designed so agents can repeatedly improve repo quality with explicit, machine-readable feedback loops.

Quick start

1. Install the skill so your agent knows how to use Reins:

npx skills add WellDunDun/reins

The skill teaches your agent when and how to run every Reins command: command priority (local source vs. npx), JSON output parsing, and when to pair `audit` with `doctor` for remediation detail. Once it is installed, you steer in plain language:

You:   "Audit this codebase and show me the weakest dimensions"
Agent: runs reins audit, parses JSON, summarizes gaps

You:   "Scaffold harness engineering in this repo"
Agent: runs reins init, then walks you through customization

You:   "Evolve to the next maturity level"
Agent: runs audit, identifies current level, executes the evolution path

2. Or run the CLI directly for a quick score without the skill:

# "." means "current directory"
npx reins-cli@latest audit .

{
  "total_score": 6,
  "max_score": 18,
  "maturity_level": "L1: Assisted",
  "recommendations": [
    "Create ARCHITECTURE.md with domain map and layer rules",
    "Add linter configuration to enforce architectural constraints",
    "Create docs/golden-principles.md with mechanical taste rules"
  ]
}
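
A script or agent consuming this output might handle it roughly as follows. This is a minimal TypeScript sketch that assumes only the four fields visible in the sample above; the `AuditSummary` type and the `summarizeAudit` function are hypothetical names, and the real CLI output may carry more fields.

```typescript
// Hypothetical shape: only the fields visible in the sample output above.
interface AuditSummary {
  total_score: number;
  max_score: number;
  maturity_level: string;
  recommendations?: string[]; // treated as optional in this sketch
}

// Parse raw CLI output and build a short, human-readable summary.
function summarizeAudit(rawJson: string): string {
  const audit = JSON.parse(rawJson) as AuditSummary;
  const percent = Math.round((audit.total_score / audit.max_score) * 100);
  const header = `${audit.maturity_level} (${audit.total_score}/${audit.max_score}, ${percent}%)`;
  const gaps = (audit.recommendations ?? []).map((text, i) => `${i + 1}. ${text}`);
  return [header, ...gaps].join("\n");
}

// Usage with the sample values from above.
const sampleOutput = JSON.stringify({
  total_score: 6,
  max_score: 18,
  maturity_level: "L1: Assisted",
  recommendations: [
    "Create ARCHITECTURE.md with domain map and layer rules",
    "Add linter configuration to enforce architectural constraints",
  ],
});
console.log(summarizeAudit(sampleOutput));
```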

Keep Reins fresh

# Check whether your installed skills are outdated
npx skills check

# Update installed skills (including Reins) when updates are available
npx skills update

If you run Reins directly (without the skill), prefer `npx reins-cli@latest ...` so agents always use the latest published CLI.

The steering loop

Install/refresh skill -> Audit -> Doctor/Evolve -> Apply changes -> Re-audit

That loop is the product: repeatedly steering agents toward a better repository state.
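
The audit, apply, re-audit cycle can be pictured as a small driver function. The sketch below is illustrative only: `steer`, its callbacks, and the stop conditions are assumptions made for this example, not part of the Reins CLI, and the callbacks stand in for whatever actually runs the audit and applies changes.

```typescript
// Minimal result shape; field names follow the sample audit output.
interface AuditResult {
  total_score: number;
  max_score: number;
}

// Hypothetical driver: audit, apply changes, re-audit, and stop once the
// target score is reached or the round budget runs out.
function steer(
  runAudit: () => AuditResult,
  applyChanges: (current: AuditResult) => void,
  targetScore: number,
  maxRounds: number,
): AuditResult {
  let result = runAudit();
  for (let round = 0; round < maxRounds && result.total_score < targetScore; round++) {
    applyChanges(result);
    result = runAudit(); // re-audit after every change set
  }
  return result;
}

// Usage with stand-in callbacks: each round of changes adds 3 points.
let simulatedScore = 6;
const finalResult = steer(
  () => ({ total_score: simulatedScore, max_score: 18 }),
  () => { simulatedScore = Math.min(18, simulatedScore + 3); },
  14, // target: the lower bound of L3
  5,  // at most five rounds
);
console.log(finalResult.total_score); // 15
```

Bounding the number of rounds keeps the loop from running indefinitely when the changes stop moving the score.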

Why teams adopt Reins

Most agent rollouts fail for one boring reason: agents can edit code, but the repository doesn't teach them how to reason safely.

Reins gives you a repeatable operating system for agent work:

A map (AGENTS.md, architecture docs, indexed decisions)
A score (0-18 maturity audit with concrete gaps)
A plan (next-step evolution path by maturity level)
A guardrail model (risk-policy.json + CI enforcement signals)

Where Reins fits

Agent-first development has multiple layers. Reins operates at the repository structure layer — complementary to session orchestration tools, not competing with them.

block-beta
    columns 1
    block:L3:1
        columns 2
        A["SESSION EXECUTION"] B["GSD, Flow-Next, etc."]
    end
    block:L2:1
        columns 2
        C["REPO READINESS"] D["Reins"]
    end
    block:L1:1
        columns 2
        E["THE CODEBASE"] F["Your project"]
    end

    style L3 fill:#1e293b,stroke:#3b82f6,color:#e2e8f0
    style L2 fill:#1e3a5f,stroke:#60a5fa,color:#e2e8f0
    style L1 fill:#0f172a,stroke:#475569,color:#94a3b8
    style A fill:transparent,stroke:none,color:#93c5fd
    style B fill:transparent,stroke:none,color:#64748b
    style C fill:transparent,stroke:none,color:#60a5fa
    style D fill:transparent,stroke:none,color:#f472b6
    style E fill:transparent,stroke:none,color:#93c5fd
    style F fill:transparent,stroke:none,color:#64748b

| Concern | Reins | Session orchestrators |
| --- | --- | --- |
| When you use it | Once per repo, then evolve periodically | Every coding session |
| What it produces | Docs, audit scores, maturity roadmaps | Working code |
| What it prevents | Organizational rot, undocumented architecture | Context rot, wasted tokens |

Use them together. Reins scaffolds your repo so AGENTS.md tells the agent where everything is, ARCHITECTURE.md defines the rules, and golden principles are enforced in CI. Then a session orchestrator runs the actual coding work on top of that well-structured repo.

The four commands

graph LR
    Init["reins init\nScaffold"] --> Audit["reins audit\nScore 0-18"]
    Audit --> Evolve["reins evolve\nLevel up"]
    Evolve --> Doctor["reins doctor\nHealth check"]
    Doctor --> Audit

    style Init fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style Audit fill:#1e3a5f,stroke:#60a5fa,color:#e2e8f0
    style Evolve fill:#1e3a5f,stroke:#818cf8,color:#e2e8f0
    style Doctor fill:#1e3a5f,stroke:#a78bfa,color:#e2e8f0

reins init .                       # Scaffold the full structure
reins init . --pack auto           # Adaptive pack selection from project signals
reins init . --pack agent-factory  # Optional advanced automation pack
reins audit .                      # Score against harness principles (0-18)
reins evolve .                     # Roadmap to next maturity level
reins doctor .                     # Health check with prescriptive fixes

The maturity model

Every repo sits on a maturity spectrum. The audit tells you where you are. The evolve command tells you what to do next.

graph LR
    L0["L0: Manual\n0-4"] --> L1["L1: Assisted\n5-8"]
    L1 --> L2["L2: Steered\n9-13"]
    L2 --> L3["L3: Autonomous\n14-16"]
    L3 --> L4["L4: Self-Correcting\n17-18"]

    style L0 fill:#1e293b,stroke:#475569,color:#94a3b8
    style L1 fill:#1e293b,stroke:#3b82f6,color:#93c5fd
    style L2 fill:#1e3a5f,stroke:#60a5fa,color:#e2e8f0
    style L3 fill:#1e3a5f,stroke:#818cf8,color:#e2e8f0
    style L4 fill:#312e81,stroke:#a78bfa,color:#e2e8f0

| Score | Level | What it means |
| --- | --- | --- |
| 0-4 | L0: Manual | Traditional engineering, no agent infra |
| 5-8 | L1: Assisted | Agents help, humans still write code |
| 9-13 | L2: Steered | Humans steer, agents execute most code |
| 14-16 | L3: Autonomous | Agents handle full lifecycle |
| 17-18 | L4: Self-Correcting | System maintains and improves itself |
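
The score bands above translate directly into a lookup. The following TypeScript sketch copies the bands from the table; the names `LEVEL_BANDS` and `maturityLevel` are made up for illustration, and this is not necessarily how the CLI computes the level internally.

```typescript
// Score bands copied from the maturity table; names are illustrative.
const LEVEL_BANDS = [
  { min: 0, max: 4, name: "L0: Manual" },
  { min: 5, max: 8, name: "L1: Assisted" },
  { min: 9, max: 13, name: "L2: Steered" },
  { min: 14, max: 16, name: "L3: Autonomous" },
  { min: 17, max: 18, name: "L4: Self-Correcting" },
];

// Map a total audit score (0-18) to its maturity level name.
function maturityLevel(totalScore: number): string {
  if (!Number.isInteger(totalScore) || totalScore < 0 || totalScore > 18) {
    throw new RangeError(`score must be an integer from 0 to 18, got ${totalScore}`);
  }
  const band = LEVEL_BANDS.find((b) => totalScore >= b.min && totalScore <= b.max);
  if (!band) {
    throw new RangeError(`no band covers score ${totalScore}`);
  }
  return band.name;
}

console.log(maturityLevel(6));  // L1: Assisted
console.log(maturityLevel(18)); // L4: Self-Correcting
```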

What `reins init` scaffolds

AGENTS.md                        # Concise map (~100 lines) for agents
ARCHITECTURE.md                  # Domain map, layer rules, dependency direction
risk-policy.json                 # Risk tiers + docs drift rules (policy-as-code)
docs/
  golden-principles.md           # Mechanical taste rules enforced in CI
  design-docs/
    index.md                     # Design doc registry with verification status
    core-beliefs.md              # Agent-first operating principles
  product-specs/
    index.md                     # Product spec registry
  exec-plans/
    active/                      # Currently executing plans
    completed/                   # Historical plans with outcomes
    tech-debt-tracker.md         # Known debt with priority and ownership
  references/                    # External LLM-friendly reference docs
  generated/                     # Auto-generated docs (schema, API specs)

Optional pack:

reins init . --pack auto
reins init . --pack agent-factory

`--pack auto` keeps the base scaffold for unknown stacks and selects `agent-factory` when the repo looks Node/JS compatible.
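
As a rough mental model of that decision, consider the sketch below. It is hedged heavily: the signals Reins actually inspects are not described here, so the `package.json` check, the `selectPack` name, and the `"base"` label are all assumptions made for this example.

```typescript
// "base" stands for the default scaffold with no extra pack (label assumed).
type Pack = "base" | "agent-factory";

// ASSUMPTION: a root-level package.json is used as the "Node/JS compatible"
// signal. The real detection logic may look at different or additional signals.
function selectPack(rootFiles: readonly string[]): Pack {
  const looksLikeNode = rootFiles.includes("package.json");
  return looksLikeNode ? "agent-factory" : "base";
}

console.log(selectPack(["package.json", "src"])); // agent-factory
console.log(selectPack(["Cargo.toml", "src"]));   // base
```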

--pack agent-factory adds an advanced automation layer:

scripts/lint-structure.mjs (hard structural gate)
scripts/doc-gardener.mjs + scripts/check-changed-doc-freshness.mjs (docs freshness loop)
scripts/pr-review.mjs (soft golden-principles reviewer)
.github/workflows/risk-policy-gate.yml (risk-tier + docs drift checks)
.github/workflows/pr-review-bot.yml (PR feedback loop)
.github/workflows/structural-lint.yml (CI enforcement gate)

`reins evolve` now includes pack recommendations, and `reins evolve . --apply` can scaffold compatible pack automation into an existing repo.

The six audit dimensions

Each scored 0-3, totaling 0-18:

graph TD
    Score["Total Score\n0-18"]
    RK["Repository Knowledge\n0-3"]
    AE["Architecture Enforcement\n0-3"]
    AL["Agent Legibility\n0-3"]
    GP["Golden Principles\n0-3"]
    AW["Agent Workflow\n0-3"]
    GC["Garbage Collection\n0-3"]

    RK --> Score
    AE --> Score
    AL --> Score
    GP --> Score
    AW --> Score
    GC --> Score

    style Score fill:#312e81,stroke:#a78bfa,color:#e2e8f0
    style RK fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style AE fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style AL fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style GP fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style AW fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style GC fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0

| Dimension | What it checks |
| --- | --- |
| Repository Knowledge | AGENTS.md, docs/, versioned execution plans |
| Architecture Enforcement | ARCHITECTURE.md, dependency rules, linters, policy signals |
| Agent Legibility | Bootable app, observability (or CLI diagnosability), lean dependencies |
| Golden Principles | Documented taste rules, CI gate depth, cleanup process |
| Agent Workflow | Agent config, risk policy, PR templates, CI enforcement |
| Garbage Collection | Debt tracking, doc-gardening, quality grades, docs drift rules |
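
The arithmetic behind the total is simple: six dimensions, each scored 0-3, summed to a value from 0 to 18. The TypeScript sketch below shows that sum and how the weakest dimensions could be picked out. The type and function names are hypothetical, and the example scores are invented for illustration rather than taken from a real audit.

```typescript
type Dimension =
  | "Repository Knowledge"
  | "Architecture Enforcement"
  | "Agent Legibility"
  | "Golden Principles"
  | "Agent Workflow"
  | "Garbage Collection";

type DimensionScores = Record<Dimension, number>;

// Sum the six per-dimension scores, rejecting anything outside 0-3.
function totalScore(scores: DimensionScores): number {
  let total = 0;
  for (const [name, value] of Object.entries(scores)) {
    if (!Number.isInteger(value) || value < 0 || value > 3) {
      throw new RangeError(`${name} must be an integer from 0 to 3, got ${value}`);
    }
    total += value;
  }
  return total;
}

// Return every dimension that shares the lowest score.
function weakestDimensions(scores: DimensionScores): Dimension[] {
  const names = Object.keys(scores) as Dimension[];
  const lowest = Math.min(...names.map((n) => scores[n]));
  return names.filter((n) => scores[n] === lowest);
}

// Invented example values, not real audit output.
const exampleScores: DimensionScores = {
  "Repository Knowledge": 2,
  "Architecture Enforcement": 1,
  "Agent Legibility": 1,
  "Golden Principles": 0,
  "Agent Workflow": 1,
  "Garbage Collection": 1,
};
console.log(totalScore(exampleScores));        // 6
console.log(weakestDimensions(exampleScores)); // [ 'Golden Principles' ]
```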

Self-apply: 18/18

Reins audits itself in CI. Current score:

{
  "total_score": 18,
  "max_score": 18,
  "maturity_level": "L4: Self-Correcting"
}

CI gates: lint, test, typecheck, self-audit.

Merging to master runs publish: if `cli/reins/package.json` has a version not yet on npm, it publishes `reins-cli` and creates a GitHub Release. PRs from this repository that modify `cli/reins/**` and do not manually bump `cli/reins/package.json` are auto-patched by `.github/workflows/auto-bump-cli-version.yml` (fork PRs are skipped).

For CLI repositories, Reins treats strong diagnosability signals (for example doctor surfaces, CLI diagnostics in CI, and help/error coverage) as the equivalent of service observability infrastructure.

Project structure

reins/
  cli/reins/            # The CLI tool (Bun + TypeScript, zero deps)
    src/index.ts        # Thin CLI router
    src/lib/commands/   # Command handlers (init/audit/doctor/evolve)
    src/lib/audit/      # Audit runtime context + scoring internals
    package.json
  skill/                # Agent skill (Claude Code)
    Reins/
      SKILL.md          # Skill definition and routing
      HarnessMethodology.md  # Full methodology reference
      Workflows/
        Scaffold.md     # Scaffold workflow
        Audit.md        # Audit workflow
        Evolve.md       # Evolve workflow

Requirements

Bun v1.0+ or Node.js 18+
No other dependencies

Contributing

See CONTRIBUTING.md for guidelines.

Methodology

Based on OpenAI's Harness Engineering (February 2026, Ryan Lopopolo). The five pillars:

Repository as system of record — all knowledge versioned in-repo
Layered domain architecture — strict layer ordering with forward-only dependencies
Agent legibility — optimize for agent understanding, not just human readability
Golden principles — encode human taste mechanically, enforce in CI
Garbage collection — background agents clean drift continuously

License

MIT
