Makes judgment structure in code visible so humans can maintain it over time.
Don’t use this if any are true:
- Your org doesn’t review and act on technical debt.
- Plan signers won’t read the plan.
- You can’t stop if it becomes ceremonial.
- You need approval/certification.
If any are true, stop. This tool can make systems less safe by creating false confidence.
A visibility tool that:
- Records where judgment happens (trace events)
- Compares observed structure to a declared Plan
- Creates obligations for human review
- Accumulates visible review debt
Core claim: Preserves judgment structure long enough for humans to exercise it.
Key limit: Cannot create judgment, enforce it, or guarantee that it happens.
Operational cost: Requires human time to write Plans, maintain transcripts/tests, and review obligations. If you won’t staff that work, don’t deploy.
This framework does NOT:
- Verify authority claims (only records them)
- Prove coverage completeness (only tests known paths)
- Supply judgment where none exists
- Enforce human review or scale it
- Approve, certify, or validate anything
If you need any of the above, look elsewhere.
This framework is about judgment integrity, not vulnerability prevention.
- It likely reduces LLM‑introduced authorization bypass and silent mutation bugs
- It provides no meaningful protection against injection/crypto/dependency vulnerabilities
- Using this alone can create a false sense of security about correctness
Recommendation: Use this for judgment integrity (who decided, what evidence, what’s reversible). Pair it with traditional security tooling (SAST, dependency scanning, security‑focused code review).
from fidelity_framework_v1 import Plan, AnchorSpec, DecisionAuthority

plan = Plan(
    version="v1.0",
    name="File Processor",
    owner="jane.doe@company.com",
    signature="jane.doe@company.com attested 2024-01-15",  # Claim only
    anchor_specs=[
        AnchorSpec(
            anchor_id="commit",
            location_id="write_output",
            event_types=["authority_check", "execute"],
            event_order=["authority_check", "execute"],
            allowed_decisions=["allow", "deny"],
            allowed_actor_claims=["on_call_engineer"],
            finality="file_overwrite",
        )
    ],
    decision_authorities=[
        DecisionAuthority(
            decision_point="force_override",
            authority_holder="on_call_engineer",
            escalation_path="engineering_manager",
            conditions="business_hours_only",
        )
    ],
    schema_versions=["1.0"],
    stakeholders=["analytics_team", "reporting_consumers"],
)

Plan note: assess_plan() checks coverage, order, event types, decisions, actor_claims, schema_version, and trace integrity.
import os

from fidelity_framework_v1 import emit_observed_event, set_correlation_id

set_correlation_id("req-1234")

def write_output(path: str, data: str, force: bool = False):
    # Overwriting an existing file needs --force; writing a new file does not.
    allowed = force or not os.path.exists(path)
    emit_observed_event(
        anchor_id='commit',
        location_id='write_output',
        event_type='authority_check',
        decision='allow' if allowed else 'deny',
        actor_claim='on_call_engineer',  # Recorded, not verified
        authority_mechanism='--force' if force else None,
    )
    if not allowed:
        raise PermissionError("Use --force to overwrite")
    emit_observed_event(
        anchor_id='commit',
        location_id='write_output',
        event_type='execute',
        decision='allow',
        actor_claim='on_call_engineer',
        path=path,
    )
    # The irreversible step the 'commit' anchor describes.
    with open(path, "w") as fh:
        fh.write(data)

Statuses: UNASSESSABLE, OBLIGATIONS_PRESENT, NO_NEW_OBLIGATIONS_CREATED. None of them implies approval.
If trace integrity is broken (e.g., events lack correlation IDs), the assessment status is UNASSESSABLE.
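The precedence between the three statuses can be sketched as follows. This is an illustration only, not the framework's code: the Event dataclass and derive_status() are hypothetical names, and the assumption is that missing trace context always wins over any obligation count.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event record (not the framework's own type)."""
    anchor_id: str
    event_type: str
    correlation_id: Optional[str]


def derive_status(events: Sequence[Event], open_obligations: Sequence[str]) -> str:
    # Missing trace context takes precedence: without correlation IDs the
    # observed structure cannot be compared to the Plan at all.
    if any(e.correlation_id is None for e in events):
        return "UNASSESSABLE"
    if open_obligations:
        return "OBLIGATIONS_PRESENT"
    # Deliberately not "PASS": this only says nothing new was flagged.
    return "NO_NEW_OBLIGATIONS_CREATED"
```

Note that no branch returns anything resembling approval; the most favorable outcome only states that no new obligations were created.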
- Authority laundering: authority claims are just strings; not verified.
- Status treated as approval: pressure turns any status into “PASS.”
- Known‑path bias: transcripts only cover remembered paths.
- Checker as linter: assessor flags patterns; humans decide meaning.
- Review doesn’t scale: obligations outpace capacity.
- Non‑blocking CI is ignored: tool has no authority to block.
- Missing correlation context:
  - Runtime emits TRACE_CONTEXT_MISSING (no crash)
  - Subsequent events have correlation_id=None
  - Assessment sets unassessable=True
  - Obligation created: “Restore trace integrity; rerun assessment”
  - No sentinel IDs like correlation_id="MISSING"
- Compliance theater: plans rubber‑stamped, obligations ignored → false confidence
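The missing-correlation-context behavior listed above can be sketched in a few lines. TraceRecorder and assess() are hypothetical stand-ins written for this illustration; the real runtime's internals may differ, and emitting the marker only once per recorder is an assumption.

```python
from typing import Any, Dict, List, Optional


class TraceRecorder:
    """Hypothetical stand-in for the runtime's event recorder."""

    def __init__(self) -> None:
        self.correlation_id: Optional[str] = None
        self.events: List[Dict[str, Any]] = []
        self._missing_reported = False

    def emit(self, anchor_id: str, event_type: str, **data: Any) -> None:
        if self.correlation_id is None and not self._missing_reported:
            # Record the gap once instead of raising, so the host program keeps running.
            self.events.append(self._event(None, "TRACE_CONTEXT_MISSING", {}))
            self._missing_reported = True
        self.events.append(self._event(anchor_id, event_type, data))

    def _event(self, anchor_id: Optional[str], event_type: str,
               data: Dict[str, Any]) -> Dict[str, Any]:
        # correlation_id stays None when unset; no sentinel such as "MISSING" is written.
        return {
            "anchor_id": anchor_id,
            "event_type": event_type,
            "correlation_id": self.correlation_id,
            "data": data,
        }


def assess(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Any event without a correlation ID makes the whole run unassessable.
    unassessable = any(e["correlation_id"] is None for e in events)
    obligations = ["Restore trace integrity; rerun assessment"] if unassessable else []
    return {"unassessable": unassessable, "obligations": obligations}
```

Keeping correlation_id as None rather than a placeholder string means a later assessment cannot mistake a broken trace for a real, shared request ID.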
Deploy if ALL true:
- Real review capacity (people + time)
- Management acts when review debt > 50
- Plans are read before signing (spot check this)
- You can stop if ceremonial signs appear
- You need visibility into judgment erosion over time
Do NOT deploy if ANY true:
- Technical debt metrics are ignored
- Plans will be rubber‑stamped to meet deadlines
- “Authority” is symbolic rather than real
- You need approval/certification
- You can’t stop if it becomes ceremonial
Upstream is visibility‑only. For enforcement, see docs/FORKING_PATH.md.
- ActorResolver (IAM binding)
- ProvenanceProvider (callsite fingerprints)
- SinkPolicy (regex → AST)
- GatePolicy (optional CI gating)
- LedgerBackend (ticketing/workflows)
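To make the first of these extension points concrete, here is a rough sketch of what an ActorResolver could look like in a fork. The interface is an assumption: the resolve() method, its parameters, and StaticActorResolver are invented for illustration, and a real fork would query an IAM system rather than a static mapping.

```python
from typing import Mapping, Protocol, Set


class ActorResolver(Protocol):
    """Hypothetical interface: decides whether a principal really holds a claimed role."""

    def resolve(self, actor_claim: str, principal: str) -> bool:
        ...


class StaticActorResolver:
    """Toy resolver backed by a fixed role-to-principals mapping (stand-in for IAM)."""

    def __init__(self, bindings: Mapping[str, Set[str]]) -> None:
        self._bindings = bindings

    def resolve(self, actor_claim: str, principal: str) -> bool:
        # Unknown roles resolve to False instead of raising, so callers
        # can record the failed check as an event.
        return principal in self._bindings.get(actor_claim, set())
```

With something like this in place, an actor_claim stops being a bare string: the fork can check that the authenticated principal is actually bound to the claimed role before recording the event.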
from fidelity_framework_v1 import GoldenTranscript, TranscriptEventMatcher, match_transcript

transcript = GoldenTranscript(
    name="Force overwrite path",
    expected_subsequence=[
        TranscriptEventMatcher('input', 'validate', {'validated': True}),
        TranscriptEventMatcher('commit', 'authority_check', {'authority_mechanism': '--force'}),
        TranscriptEventMatcher('commit', 'execute', {}),
    ],
    allow_extra_events=True,
)

# get_events() is not defined in this snippet; it stands for however
# your test collects the events emitted during the run.
ok, msg = match_transcript(get_events(), transcript)
assert ok, msg

Notes:
- Matching checks event.data and top‑level fields (e.g., decision, actor_claim).
- If allow_extra_events=False, transcript matching requires an exact sequence with no extras.
- Coverage obligation is emitted when an anchor has zero observed events.
- Partial coverage is not inferred; only zero‑observation is detected.
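The matching rules in these notes can be illustrated with a small self-contained sketch. It is not the framework's match_transcript(): match_subsequence(), the tuple-shaped matchers, and the dict-shaped events are assumptions made for this example.

```python
from typing import Any, Dict, Sequence, Tuple

Event = Dict[str, Any]
# (anchor_id, event_type, required fields): a hypothetical matcher shape.
Matcher = Tuple[str, str, Dict[str, Any]]


def _matches(event: Event, matcher: Matcher) -> bool:
    anchor_id, event_type, required = matcher
    if event.get("anchor_id") != anchor_id or event.get("event_type") != event_type:
        return False
    data = event.get("data", {})
    # A required field may live at the top level or inside event["data"].
    return all(event.get(key, data.get(key)) == value for key, value in required.items())


def match_subsequence(events: Sequence[Event], expected: Sequence[Matcher],
                      allow_extra_events: bool = True) -> Tuple[bool, str]:
    if not allow_extra_events:
        # Exact mode: same length, and every position must match.
        ok = len(events) == len(expected) and all(
            _matches(e, m) for e, m in zip(events, expected))
        return ok, "" if ok else "exact sequence mismatch"
    remaining = iter(events)
    for matcher in expected:
        # any() consumes the iterator up to the first match, so later
        # matchers can only match later events (in-order subsequence).
        if not any(_matches(e, matcher) for e in remaining):
            return False, f"no in-order match for {matcher[0]}/{matcher[1]}"
    return True, ""
```

In this sketch nothing ties a match to a particular request or component; any event with the right anchor and type qualifies, which is why adding location_id or correlation_id to the required fields tightens a transcript.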
from fidelity_framework_v1 import assert_no_forbidden_sinks

assert_no_forbidden_sinks("src/processor.py")

Limitation: Default regex patterns don’t catch everything (e.g., subprocess, eval, requests.post).
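One way to close part of that gap is to scan the syntax tree instead of the raw text. The sketch below uses only the standard-library ast module; FORBIDDEN_CALLS and find_forbidden_calls() are hypothetical names, not part of the framework, and the list of sinks is an example rather than a vetted policy.

```python
import ast
from typing import List, Tuple

# Example sink list for illustration only; tune it to your codebase.
FORBIDDEN_CALLS = {
    "eval", "exec", "os.system",
    "subprocess.run", "subprocess.Popen", "requests.post",
}


def _dotted_name(node: ast.AST) -> str:
    """Return 'a.b.c' for Name/Attribute chains, or '' for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def find_forbidden_calls(source: str) -> List[Tuple[int, str]]:
    """List (line number, call name) for every call to a forbidden sink."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name in FORBIDDEN_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

This still misses aliased imports (import subprocess as sp, from subprocess import run) and dynamic dispatch such as getattr, so it narrows the blind spot without removing it.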
Core provides: in‑memory ObligationLedger, open_obligations().
Core does NOT provide: persistence, debt metrics, auto‑aging.
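If you need persistence and a basic debt metric, a thin layer outside the core might look like the sketch below. Everything here is an assumption: the JSON Lines file, the obligation fields (id, status, created_at), and the function names are invented for illustration and are not the ObligationLedger API.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional


def append_obligations(path: Path, obligations: Iterable[Dict[str, Any]]) -> None:
    """Append obligations to a JSON Lines file, one record per line."""
    with path.open("a", encoding="utf-8") as fh:
        for obligation in obligations:
            fh.write(json.dumps(obligation) + "\n")


def load_obligations(path: Path) -> List[Dict[str, Any]]:
    """Read all persisted obligations; a missing file means an empty ledger."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def review_debt(obligations: Iterable[Dict[str, Any]],
                now: Optional[datetime] = None) -> Dict[str, int]:
    """Count open obligations and report the age in days of the oldest one."""
    now = now or datetime.now(timezone.utc)
    open_items = [o for o in obligations if o["status"] == "open"]
    # created_at is assumed to be a timezone-aware ISO 8601 timestamp.
    ages = [(now - datetime.fromisoformat(o["created_at"])).days for o in open_items]
    return {"open_count": len(open_items), "oldest_open_days": max(ages, default=0)}
```

An append-only file keeps the history visible; closing an obligation would mean appending a new record rather than editing an old one, which this sketch leaves out.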
In upstream, the CI step always exits 0; the framework does not claim authority to block merges.
- Is this just testing? No. It makes structure visible; it does not claim correctness.
- Compliance/audit? Only if auditors accept “records claims, doesn’t verify.”
- Disagree with findings? Override with human judgment; the framework makes the override visible.
- Does it scale? No. Human review capacity is the bottleneck.
- Plan was wrong? Update it; the framework doesn’t verify Plans.
Possible improvements (not enabled by default due to complexity):
- Instrumentation honesty checks (error‑path transcripts, negative tests, CI sink scans)
- Coverage rigor (fail tests if any AnchorSpec is never observed)
- Correlation‑scoped assessments (avoid interleaving noise)
- Order‑matching flexibility (repeated anchors / subsequence ordering)
- Transcript tightening (require location_id/correlation_id in systems)
Tradeoff: less false confidence, more test burden and maintenance overhead.
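As an illustration of the coverage-rigor idea, a test helper could fail whenever a declared anchor is never observed. This is a sketch under assumptions: events are plain dicts with an anchor_id key, and both function names are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def unobserved_anchors(declared_anchor_ids: Iterable[str],
                       events: Iterable[Dict[str, Any]]) -> List[str]:
    """Return declared anchors that have zero observed events."""
    observed = {event["anchor_id"] for event in events}
    return sorted(set(declared_anchor_ids) - observed)


def assert_all_anchors_observed(declared_anchor_ids: Iterable[str],
                                events: Iterable[Dict[str, Any]]) -> None:
    """Fail the calling test if any declared anchor was never observed."""
    missing = unobserved_anchors(declared_anchor_ids, events)
    if missing:
        raise AssertionError(f"anchors never observed: {', '.join(missing)}")
```

Like the core check, this only detects zero observation; an anchor that fires on one path but is skipped on another still passes.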
- Instrumentation honesty is assumed (events can be late, missing, or placeholder).
- Plan omissions are invisible to assessment.
- Order checks can misfire under interleaving or repeated loops.
- Transcripts can match across flows/components unless you include location_id or correlation_id.
- Assessment does not group by correlation_id (ordering/coverage run over full event list).
- Schema mismatch makes assessment unassessable.
- Retrofit overhead can be high. Brownfield adoption is labor‑intensive; see docs/FORKING_PATH.md for fork ideas that reduce retrofit friction.
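The ordering limitation is easier to see with a sketch of the alternative: group events by correlation_id first, then check order inside each group. The functions below are hypothetical and assume dict-shaped events; they are not how upstream assessment works today.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Sequence

Event = Dict[str, Any]


def group_by_correlation(events: Sequence[Event]) -> Dict[Optional[str], List[Event]]:
    """Split a flat event list into per-request lists, preserving order."""
    groups: Dict[Optional[str], List[Event]] = defaultdict(list)
    for event in events:
        groups[event["correlation_id"]].append(event)
    return dict(groups)


def order_violations(events: Sequence[Event], anchor_id: str,
                     event_order: Sequence[str]) -> List[Optional[str]]:
    """Return correlation IDs whose events for one anchor arrive out of order."""
    violations = []
    for correlation_id, group in group_by_correlation(events).items():
        positions = [
            event_order.index(e["event_type"])
            for e in group
            if e["anchor_id"] == anchor_id and e["event_type"] in event_order
        ]
        # In-order means the declared positions never decrease within one request.
        if positions != sorted(positions):
            violations.append(correlation_id)
    return violations
```

Run over a flat list, two back-to-back requests look like authority_check, execute, authority_check, execute, which a naive order check can flag as a regression; grouped per request, each one is in order and only genuinely reversed flows are reported.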
- Technical: fidelity_framework_v1.py
- Conceptual: docs/judgment_preserving_doctrine_demo_v1.md
MIT License — use freely, modify as needed.
Final warning: Deploy with eyes open or not at all. This tool makes judgment structure visible—it doesn’t create, enforce, or guarantee judgment happens. If you can’t act on what it surfaces, don’t surface it.