NFP (Neural Formal Pathways) is a Lean 4 project for mathematically rigorous reasoning about transformer-style computations, with a focus on mechanistic interpretability (e.g. induction heads) and provable norm/error bounds.
This repo contains:

- A Lean library (under `Nfp/`) for finite probability and a lightweight “transformer semantics” layer.
- A CLI executable (`lake exe nfp …`) that loads transformer weights stored in a compact binary format (`.nfpt`) and produces rigorous bounds and diagnostics.
Goal: no “hand-wavy” numerics in the bound path. Heuristic estimates (e.g. power iteration) may exist for diagnostics, but the bounds reported as “rigorous” are computed via conservative inequalities.
A heavy rewrite is in progress; please refer to the `tabula-rasa` branch for current status.
This is research tooling. Interfaces may change; please treat results as experimental unless they are backed by a certificate/check you trust.
The Lean library defines the core math objects (finite probability, mixers, linearizations, and operator-norm-style bounds) and proves a number of lemmas about them. The CLI's sound path produces certificates using exact `Rat` arithmetic and a trusted checker that verifies internal arithmetic relationships between certificate fields.
At present, the checker does not include a bridge theorem that connects certificate validity to the Lean-defined Jacobian bounds (for example, a theorem of the form ||layerJacobian - I|| <= C). Treat sound certificates as internally consistent bound reports, not as a fully formal end-to-end verification of transformer Jacobians.
For known gaps and ongoing upgrades, see SOUNDNESS_LIMITATIONS.md.
NFP’s long-term direction is verified circuit discovery:
- Use fast, exploratory tooling to propose candidate circuits (e.g. induction-style head interactions),
- then produce checkable evidence (bounds / certificates) that a skeptical reader can re-run and validate.
Concretely, the intended split is:

- Discovery / exploration (untrusted, fast): Heuristic search, ranking, and diagnostics are allowed here (and should be clearly labelled as such). This includes things like candidate search (`induction`) and comparison estimates printed under diagnostics/verbose flags.
- Certification / checking (trusted, boring): Anything described as “rigorous” should be justified by conservative inequalities or by a certificate that a checker can validate. The long-term aim is that Lean does as little “real inference” as possible: instead of running large forward passes, it should mostly check small, structured proof obligations (e.g. inequality chains, norm bounds, interval/rational arithmetic).
Current state: `certify` is already an example of this direction (sound-mode reporting using exact `Rat` arithmetic rather than trusted floats), but the certificate story is still evolving and interfaces may change.
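To make the “boring checking” idea concrete, here is a minimal Python sketch (illustrative only, not NFP code; the helper name and the numbers are made up) of the kind of exact-rational inequality chain a checker can validate without any floating point:

```python
from fractions import Fraction

def check_bound_chain(factors, claimed_bound):
    """Verify that the product of per-step amplification bounds stays
    at or below a claimed overall bound, using exact rationals."""
    total = Fraction(1)
    for f in factors:
        total *= f
    return total <= claimed_bound

# Hypothetical per-layer bound factors, expressed exactly.
factors = [Fraction(11, 10), Fraction(102, 100), Fraction(99, 100)]
print(check_bound_chain(factors, Fraction(112, 100)))  # True (1.11078 <= 1.12)
```

Because every quantity is a `Fraction`, the comparison is decided exactly; there is no rounding step that has to be trusted.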
Model trajectory: GPT-2 support is currently a proving ground for the end-to-end workflow (export → analyze/search → bound/certify). The goal is to gradually cover more modern decoder blocks (e.g. RoPE-style position handling) while keeping the certification/checking layer lightweight.
Minimal local demo (no network needed):

```
lake build -q --wfail
lake build nfp -q --wfail
lake exe nfp certify tests/fixtures/tiny_sound_binary.nfpt \
  --output reports/tiny_sound_demo.txt
```

Expected artifacts:

- `reports/tiny_sound_demo.txt`
Optional (rebuild the tiny binary from text fixtures and run a fixed induction cert):

```
./scripts/demo_tiny_local_binary.sh
./scripts/demo_tiny_induction_cert.sh
```

Expected artifacts (optional path):

- `reports/tiny_sound_local_binary.txt`
- `reports/tiny_induction_cert.txt`
End-to-end GPT-2 demo (requires network/model download):

```
./scripts/demo_gpt2_sound.sh
./scripts/demo_gpt2_induction_sound.sh
```

Expected artifacts:

- `reports/gpt2_sound_demo.txt`
- `reports/gpt2_induction_sound_scan.txt`
Notes:

- If a legacy `.nfpt` header is missing `gelu_kind`, `demo_gpt2_sound.sh` writes `models/gpt2_with_gelu_kind.nfpt` and uses that for certification.
- `demo_gpt2_induction_sound.sh` can take a while on CPU; use `--top 1`, `--fast`, or `--jobs 2` to shorten the scan, or run it on a larger machine.
- You can also set `NFP_BIN=./.lake/build/bin/nfp` to avoid repeated `lake exe` startup overhead.
- Lean 4 (pinned by `lean-toolchain`) and Lake.
  - Easiest install: `elan` (the Lean toolchain manager).
- A standard build toolchain for Lean (C/C++ compiler, `make`, etc.).
- (Optional) Python for the export scripts in `scripts/`.

The Lean version is pinned in `lean-toolchain` (currently `leanprover/lean4:v4.26`).
Clone and build:

```
lake update
lake build
```

Run the CLI (see subcommands below):

```
lake exe nfp --help
```

The CLI expects a model file in `.nfpt` format (`NFP_BINARY_V1`).

- Create a local `models/` directory and place your `.nfpt` files there (the repo does not version model files; the author’s setup may have used local symlinks).
- You can export GPT-2 weights from Hugging Face using the scripts in `scripts/`.
`.nfpt` files use a small text header followed by a binary payload:

```
NFP_BINARY_V1
num_layers=...
num_heads=...
model_dim=...
head_dim=...
hidden_dim=...
vocab_size=...
seq_len=...
BINARY_START
```
The payload is raw little-endian bytes in a fixed order (tokens, embeddings, then weights).
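For illustration, splitting such a file into header fields and payload might look like the following Python sketch (this is not the repo's actual loader, and `parse_nfpt_header` is a hypothetical helper; the real parsing happens in the Lean CLI):

```python
def parse_nfpt_header(data: bytes):
    """Split an .nfpt byte string into header fields and binary payload.
    Illustrative only; field names follow the header listing above."""
    text, sep, payload = data.partition(b"BINARY_START\n")
    lines = text.decode("ascii").splitlines()
    if not lines or lines[0] != "NFP_BINARY_V1" or not sep:
        raise ValueError("not an NFP_BINARY_V1 file")
    fields = {}
    for line in lines[1:]:
        key, _, value = line.partition("=")
        fields[key] = int(value)
    return fields, payload

sample = b"NFP_BINARY_V1\nnum_layers=2\nnum_heads=4\nBINARY_START\n\x00\x01"
fields, payload = parse_nfpt_header(sample)
print(fields["num_layers"], len(payload))  # 2 2
```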
Note: global sound certification supports NFP_BINARY_V1. Local sound certification
supports NFP_BINARY_V1 (fixed-point union-box) and legacy NFP_TEXT_V1/V2.
The export scripts use `torch` + `transformers`.

Example (writes `models/gpt2_rigorous.nfpt`):

```
python scripts/export_gpt2.py models/gpt2_rigorous.nfpt
```

If you prefer a locked Python environment, use `uv` or a venv and install dependencies from `pyproject.toml`:

```
uv run python scripts/export_gpt2.py models/gpt2_rigorous.nfpt
```

This demo downloads GPT-2 weights on demand, exports a binary `.nfpt`, and runs the global sound certificate:

```
./scripts/demo_gpt2_sound.sh
```

Artifacts:

- `models/gpt2.nfpt` (binary export)
- `reports/gpt2_sound_demo.txt` (sound certificate report)
This demo builds the rigorous induction dataset (if needed), finds candidate induction head pairs, and ranks them by sound logit-diff lower bounds.
```
./scripts/demo_gpt2_induction_sound.sh
```

Artifacts:

- `models/gpt2_rigorous.nfpt` (binary export)
- `reports/gpt2_induction_sound_scan.txt` (sound scan report)
This demo converts the tiny text fixtures into a binary `.nfpt` and runs a local sound certificate (with `--delta`).

```
./scripts/demo_tiny_local_binary.sh
```

Artifacts:

- `tests/fixtures/tiny_sound_binary.nfpt` (binary fixture)
- `reports/tiny_sound_local_binary.txt` (local sound certificate report)
This demo computes a minimal induction head certificate on the tiny fixture.
```
./scripts/demo_tiny_induction_cert.sh
```

Artifacts:

- `reports/tiny_induction_cert.txt` (induction cert report)
The main entrypoint is:
```
lake exe nfp <command> [args] [flags]
```

By default, `nfp` mirrors everything printed to stdout into `logs/` as a timestamped `.log` file.
Runs the default end-to-end analysis for the supplied model and prints a human-readable report.
```
lake exe nfp analyze models/gpt2_rigorous.nfpt \
  --threshold 0.1 --verify --verbose --output report.txt
```

- `--threshold` (`-t`) sets the minimum effect threshold used for verification (default: `0.1`).
- `--verify` optionally runs causal verification using model-provided inputs.
- `--verbose` prints model metadata and per-stage status messages.
- `--output` (`-o`) writes the report to a file instead of stdout.
Searches for candidate induction circuits and ranks head pairs by a mechanical score.
```
lake exe nfp induction models/gpt2_rigorous.nfpt \
  --threshold 0.0 --diagnostics --diagTop 5 --adaptive --verbose
```

- `--threshold` (`-t`) sets the minimum normalized effect (default: `0.0`).
- `--correct` / `--incorrect` manually pick logit IDs for the induction target (otherwise the target is inferred from tokens).
- `--verify` runs causal verification via head ablation on the top-10 candidates.
- `--diagnostics` enables bound breakdowns; `--diagTop` controls how many candidates receive diagnostics (default: `5`).
- `--adaptive` turns on the adaptive bound scheduler. Tuning flags include `--targetSlack` (default: `8.0`), `--maxUpgrades` (default: `120`), `--minRelImprove` (default: `0.01`), `--krylovSteps` (default: `2`), and `--adaptiveScope` (`layernorm` | `all`, default: `layernorm`).
- `--verbose` prints detailed scoring metrics for each candidate.
Computes a conservative certificate report in sound mode using exact Rat arithmetic (no trusted floats).
Note: global sound certification supports NFP_BINARY_V1. Local sound certification
supports NFP_BINARY_V1 (fixed-point union-box) and legacy NFP_TEXT_V1/V2.
`certify` supports both:
- global certification (weights only), and
- local certification (weights + a small input region around a concrete prompt/input).
Global (weight-only) certification:

```
lake exe nfp certify models/gpt2_rigorous.nfpt \
  --output cert.txt
```

For local (input-dependent) LayerNorm certification, pass an ℓ∞ radius δ:

```
lake exe nfp certify models/gpt2_rigorous.nfpt \
  --delta 0.01
```

If you want to override the embedded input, pass a separate input `.nfpt` via `--input`.

- LayerNorm ε is read from the model header (`layer_norm_eps`).
- `gelu_kind` in the model header selects the GeLU derivative target (`tanh` or `exact`).
- `--delta` sets the local ℓ∞ radius δ (default: `0`). Providing `--delta` enables local certification.
- `--partitionDepth` requests input partitioning depth (default: `0`; scaffold only, must remain `0` for now).
- `--input` optionally provides an input `.nfpt` file used for local certification.
- `--output` (`-o`) writes the report to a file (otherwise it prints to stdout).
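For intuition about what a local ℓ∞ radius buys you, here is a generic interval-propagation sketch (my assumptions, not NFP's actual algorithm): if the input stays within an ℓ∞ box of radius δ around x, then after a linear map W the output stays within radius δ times the maximum absolute row sum of W.

```python
def linf_operator_norm(W):
    """Maximum absolute row sum: the ℓ∞ -> ℓ∞ operator norm of W."""
    return max(sum(abs(v) for v in row) for row in W)

def propagate_box(W, x, delta):
    """Center and (uniform) radius of an output box for y = W x,
    given |input - x|_inf <= delta. Conservative, not per-coordinate tight."""
    center = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return center, delta * linf_operator_norm(W)

W = [[1.0, -2.0], [0.5, 0.25]]
center, radius = propagate_box(W, [1.0, 1.0], 0.01)
print(center)  # [-1.0, 0.75]
```

Real local certification also has to handle the nonlinear pieces (LayerNorm, attention, GeLU), which is where the header's ε and `gelu_kind` come in.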
Computes sound per-head contribution bounds (global weight-only, or local with --delta).
```
lake exe nfp head_bounds models/gpt2_rigorous.nfpt
```

For local bounds (uses input embeddings in the model file when present):

```
lake exe nfp head_bounds models/gpt2_rigorous.nfpt --delta 0.01
```

- `--delta` enables local head bounds; `--input` can override the embedded input.
- LayerNorm ε is read from the model header (`layer_norm_eps`).
- `--scalePow10` controls fixed-point scaling for global bounds (default: `9`).
- `--output` (`-o`) writes the report to a file (otherwise it prints to stdout).
Computes a sound local attention pattern bound for a single head (binary only),
propagating per-position intervals up to the target layer (bounded by maxSeqLen).
The pattern compares logits for keys whose token matches the query’s offset token
(e.g., --offset -1 matches the previous token).
```
lake exe nfp head_pattern models/gpt2_rigorous.nfpt --layer 0 --head 0 --delta 0.01 --offset -1
```

- `--offset` selects the target key position relative to the query (default: `-1` for previous token).
- `--maxSeqLen` caps the sequence length analyzed for pattern bounds (default: `256`).
- `--delta` sets the local input radius; LayerNorm ε is read from the model header (`layer_norm_eps`).
- `--tightPattern` enables a slower but tighter pattern bound near the target layer.
- `--tightPatternLayers` sets how many layers use tight bounds (default: `1`; implies `--tightPattern`).
- `--perRowPatternLayers` sets how many layers use per-row MLP propagation (default: `0`).
- `--bestMatch` switches to a single-query best-match bound (default query: last position).
- `--sweep` prints best-match bounds for all valid query positions (requires `--bestMatch`).
- `--queryPos` chooses the query position for best-match bounds (default: last position).
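For intuition about how a logit margin turns into an attention pattern bound, here is a standard softmax fact (a generic sketch, not NFP's exact bound): if the matching key's logit exceeds every other key's logit by at least `margin`, its softmax weight is at least 1 / (1 + (n - 1) · exp(-margin)).

```python
import math

def min_attention_weight(margin, num_keys):
    """Lower bound on the softmax weight of a key whose logit beats all
    other keys' logits by at least `margin` (num_keys keys in total)."""
    return 1.0 / (1.0 + (num_keys - 1) * math.exp(-margin))

# With a margin of 4 over 15 competing keys, the matching key still
# receives at least ~78% of the attention mass.
print(min_attention_weight(4.0, 16) > 0.78)  # True
```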
Computes a minimal sound induction-head certificate by combining two pattern certificates and a value-coordinate lower bound (binary only).
```
lake exe nfp induction_cert models/gpt2_rigorous.nfpt \
  --layer1 0 --head1 0 --layer2 1 --head2 0 --coord 0 --delta 0.01 \
  --target 42 --negative 17
```

- `--layer1` / `--head1` selects the previous-token head; `--layer2` / `--head2` selects the token-match head.
- `--coord` chooses the output coordinate used for the value lower bound.
- `--offset1` / `--offset2` adjust the token-match offsets (default: `-1`).
- `--target` / `--negative` optionally add a logit-diff lower bound using unembedding columns.
- `--tightPattern` enables a slower but tighter pattern bound near the target layer.
- `--tightPatternLayers` sets how many layers use tight bounds (default: `1`; implies `--tightPattern`).
- `--perRowPatternLayers` sets how many layers use per-row MLP propagation (default: `0`).
- `--bestMatch` switches to single-query best-match bounds (default query: last position).
- `--queryPos` chooses the query position for best-match bounds (default: last position).
Generates RoPE-related linearization bounds used by the certificate/checking pipeline.
```
lake exe nfp rope --seqLen 4 --pairs 8
```

- `--seqLen` instantiates the bound at the given sequence length (default: `4`).
- `--pairs` sets the number of RoPE pairs; the dimension is `2 * pairs` (default: `8`).
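As a refresher on what RoPE does (standard rotary embedding with interleaved pairs; pairing conventions vary between implementations, and this is not NFP's code): each coordinate pair is rotated by a position-dependent angle, which preserves the norm of every pair.

```python
import math

def rope_rotate(x, pos, base=10000.0):
    """Apply standard RoPE to a vector of 2*pairs coordinates at position pos.
    Pair i is rotated by angle pos * base**(-2*i/dim)."""
    dim = len(x)
    out = []
    for i in range(dim // 2):
        theta = pos * base ** (-2.0 * i / dim)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out += [a * c - b * s, a * s + b * c]
    return out

v = [1.0, 0.0, 1.0, 0.0]
r = rope_rotate(v, pos=3)
# rotation preserves the norm of each coordinate pair
print(all(abs(math.hypot(r[2 * i], r[2 * i + 1]) - 1.0) < 1e-12 for i in range(2)))
```

Norm preservation is what keeps RoPE-aware linearization bounds tractable: the rotation itself never amplifies a vector.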
At a high level, the “rigorous” path avoids heuristic operator-norm estimation and instead uses upper bounds derived from standard inequalities (examples you may see in logs):
- Frobenius-norm based bounds.
- Gram-matrix based bounds.
- Schur / Brauer-style eigenvalue bounds for symmetric matrices.
- Row-wise softmax operator bounds using quantities like `rowMaxP`, `rowTrace`, Gershgorin-style estimates, and a “moment” bound.
The CLI may still compute power-iteration estimates for comparison, but those are explicitly labelled as diagnostics and are not used to produce the rigorous ub=… values.
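As a concrete instance of the first item (my sketch, not NFP's implementation): the Frobenius norm is always an upper bound on the spectral norm, so it is a valid conservative `ub`, whereas power iteration only approaches the spectral norm from below and belongs in diagnostics.

```python
import math

def frobenius_norm(A):
    """sqrt of the sum of squares: a rigorous upper bound on ||A||_2."""
    return math.sqrt(sum(v * v for row in A for v in row))

def power_iteration_estimate(A, steps=50):
    """Heuristic spectral-norm estimate; approaches ||A||_2 from below.
    Fine for diagnostics, never for rigorous bounds."""
    n, m = len(A), len(A[0])
    x = [1.0] * m
    for _ in range(steps):
        y = [sum(a * xi for a, xi in zip(row, x)) for row in A]        # y = A x
        z = [sum(A[i][j] * y[i] for i in range(n)) for j in range(m)]  # z = A^T y
        norm = math.sqrt(sum(v * v for v in z))
        x = [v / norm for v in z]                                      # unit vector
    y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return math.sqrt(sum(v * v for v in y))

A = [[3.0, 1.0], [0.0, 2.0]]
print(power_iteration_estimate(A) <= frobenius_norm(A))  # True
```

Here the estimate converges to roughly 3.26 while the Frobenius bound is sqrt(14) ≈ 3.74: the rigorous bound is looser but never wrong.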
A typical workflow:

```
# 1) Build
lake update
lake build

# 2) Export a model (optional)
python scripts/export_gpt2.py models/gpt2_rigorous.nfpt

# 3) Run induction search with diagnostics
lake exe nfp induction models/gpt2_rigorous.nfpt -v -d | sed -n '1,220p'
```

Repository layout:

- `Main.lean` — CLI wiring and command definitions.
- `Nfp/` — library code (probability, transformer semantics, soundness/cert machinery, discovery routines).
- `scripts/` — Python helpers to export models and generate induction datasets.
- `models/` — local model files (not versioned here if large; the author’s setup may have used local symlinks).
This project is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later). See the LICENSE file.