Makes judgment structure in code visible so humans can maintain it over time.

etalbert102/fidelity_framework

Judgment Fidelity Framework v1.0

Makes judgment structure in code visible so humans can maintain it over time.


Quick Fit Check

Don’t use this if any are true:

  1. Your org doesn’t review and act on technical debt.
  2. Plan signers won’t read the plan.
  3. You can’t stop if it becomes ceremonial.
  4. You need approval/certification.

If any are true, stop. This tool can make systems less safe by creating false confidence.


What This Is

A visibility tool that:

  • Records where judgment happens (trace events)
  • Compares observed structure to a declared Plan
  • Creates obligations for human review
  • Accumulates visible review debt

Core claim: It preserves judgment structure long enough for humans to exercise that judgment.
Key limit: Cannot create, enforce, or guarantee judgment happens.

Operational cost: Requires human time to write Plans, maintain transcripts/tests, and review obligations. If you won’t staff that work, don’t deploy.


What It Cannot Do (Non‑Claims)

This framework does NOT:

  • Verify authority claims (only records them)
  • Prove coverage completeness (only tests known paths)
  • Supply judgment where none exists
  • Enforce human review or scale it
  • Approve, certify, or validate anything

If you need any of the above, look elsewhere.


Security Warning

This framework is about judgment integrity, not vulnerability prevention.

  • It likely reduces LLM‑introduced authorization bypass and silent mutation bugs
  • It provides no meaningful protection against injection/crypto/dependency vulnerabilities
  • Using this alone can create a false sense of security about correctness

Recommendation: Use this for judgment integrity (who decided, what evidence, what’s reversible). Pair it with traditional security tooling (SAST, dependency scanning, security‑focused code review).


How It Works (Minimal)

1) Human writes and signs a Plan

from fidelity_framework_v1 import Plan, AnchorSpec, DecisionAuthority

plan = Plan(
    version="v1.0",
    name="File Processor",
    owner="jane.doe@company.com",
    signature="jane.doe@company.com attested 2024-01-15",  # Claim only
    anchor_specs=[
        AnchorSpec(
            anchor_id="commit",
            location_id="write_output",
            event_types=["authority_check", "execute"],
            event_order=["authority_check", "execute"],
            allowed_decisions=["allow", "deny"],
            allowed_actor_claims=["on_call_engineer"],
            finality="file_overwrite",
        )
    ],
    decision_authorities=[
        DecisionAuthority(
            decision_point="force_override",
            authority_holder="on_call_engineer",
            escalation_path="engineering_manager",
            conditions="business_hours_only",
        )
    ],
    schema_versions=["1.0"],
    stakeholders=["analytics_team", "reporting_consumers"],
)

Plan note: assess_plan() checks coverage, order, event types, decisions, actor_claims, schema_version, and trace integrity.

2) Runtime emits observed events only

import os

from fidelity_framework_v1 import emit_observed_event, set_correlation_id

set_correlation_id("req-1234")

def write_output(path: str, data: str, force: bool = False):
    # Overwriting an existing file requires --force; a new file needs no override.
    allowed = force or not os.path.exists(path)
    emit_observed_event(
        anchor_id='commit',
        location_id='write_output',
        event_type='authority_check',
        decision='allow' if allowed else 'deny',  # Matches what the code does next
        actor_claim='on_call_engineer',  # Recorded, not verified
        authority_mechanism='--force' if force else None,
    )
    if not allowed:
        raise PermissionError("Use --force to overwrite")
    emit_observed_event(
        anchor_id='commit',
        location_id='write_output',
        event_type='execute',
        decision='allow',
        actor_claim='on_call_engineer',
        path=path,
    )
    # Perform the write that the events above describe.
    with open(path, "w") as f:
        f.write(data)

3) Assessment creates obligations (never approvals)

Statuses: UNASSESSABLE, OBLIGATIONS_PRESENT, NO_NEW_OBLIGATIONS_CREATED. None imply approval.

If trace integrity is missing (e.g., correlation IDs), assessment becomes UNASSESSABLE.
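
The status logic above can be sketched in a few lines. This is a standalone illustration, not the framework's implementation: `Event`, `Spec`, and `assess` are hypothetical names, and only three checks are modeled (trace integrity, zero-observation coverage, and event order).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Event:  # hypothetical stand-in for an observed trace event
    anchor_id: str
    event_type: str
    correlation_id: Optional[str]

@dataclass
class Spec:  # hypothetical stand-in for an AnchorSpec
    anchor_id: str
    event_order: List[str]

def assess(specs: List[Spec], events: List[Event]) -> Tuple[str, List[str]]:
    """Return (status, obligations). No status means approval."""
    # Missing trace context makes the whole run unassessable.
    if any(e.correlation_id is None for e in events):
        return "UNASSESSABLE", ["Restore trace integrity; rerun assessment"]
    obligations: List[str] = []
    for spec in specs:
        seen = [e.event_type for e in events if e.anchor_id == spec.anchor_id]
        if not seen:
            obligations.append(f"{spec.anchor_id}: no events observed; review coverage")
            continue
        # Order check: declared event types must appear as a subsequence.
        remaining = iter(seen)
        if not all(t in remaining for t in spec.event_order):
            obligations.append(f"{spec.anchor_id}: declared event order not observed")
    status = "OBLIGATIONS_PRESENT" if obligations else "NO_NEW_OBLIGATIONS_CREATED"
    return status, obligations
```

Note that the "clean" outcome is named for what the tool did (it created no new obligations), not for what the code is (correct or approved).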


Known Failure Modes (Concise)

  • Authority laundering: authority claims are just strings; not verified.
  • Status treated as approval: pressure turns any status into “PASS.”
  • Known‑path bias: transcripts only cover remembered paths.
  • Checker as linter: assessor flags patterns; humans decide meaning.
  • Review doesn’t scale: obligations outpace capacity.
  • Non‑blocking CI is ignored: tool has no authority to block.
  • Missing correlation context:
    • Runtime emits TRACE_CONTEXT_MISSING (no crash)
    • Subsequent events have correlation_id=None
    • Assessment sets unassessable=True
    • Obligation created: “Restore trace integrity; rerun assessment”
    • No sentinel IDs like correlation_id="MISSING"
  • Compliance theater: plans rubber‑stamped, obligations ignored → false confidence
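
The "missing correlation context" behavior in the list above can be illustrated with a small standalone sketch. All names here (`emit`, `set_correlation`, `EVENTS`, `is_unassessable`) are hypothetical, and flagging the gap only once is an assumption; the framework's internals may differ.

```python
import contextvars
from typing import Any, Dict, List

# Hypothetical module-level state for illustration only.
_correlation_id = contextvars.ContextVar("correlation_id", default=None)
EVENTS: List[Dict[str, Any]] = []

def set_correlation(value: str) -> None:
    _correlation_id.set(value)

def emit(event_type: str, **data: Any) -> None:
    """Record an event; never raise when trace context is missing."""
    cid = _correlation_id.get()
    already_flagged = any(e["event_type"] == "TRACE_CONTEXT_MISSING" for e in EVENTS)
    if cid is None and not already_flagged:
        # Flag the gap instead of crashing the caller (assumption: flag once).
        EVENTS.append({"event_type": "TRACE_CONTEXT_MISSING",
                       "correlation_id": None, "data": {}})
    # Keep None rather than a sentinel such as "MISSING", so the gap
    # cannot be mistaken for a real ID downstream.
    EVENTS.append({"event_type": event_type, "correlation_id": cid, "data": data})

def is_unassessable(events: List[Dict[str, Any]]) -> bool:
    """True when any event lacks a correlation ID."""
    return any(e["correlation_id"] is None for e in events)
```

Storing `None` keeps the failure visible: any consumer that groups or filters by correlation ID has to handle the gap explicitly.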

When to Deploy

Deploy if ALL true:

  • Real review capacity (people + time)
  • Management acts when review debt > 50
  • Plans are read before signing (spot check this)
  • You can stop if ceremonial signs appear
  • You need visibility into judgment erosion over time

Do NOT deploy if ANY true:

  • Technical debt metrics are ignored
  • Plans will be rubber‑stamped to meet deadlines
  • “Authority” is symbolic rather than real
  • You need approval/certification
  • You can’t stop if it becomes ceremonial

Forking Path (Control Systems)

Upstream is visibility‑only. For enforcement, see docs/FORKING_PATH.md.

Extension Interfaces (Visibility‑Only Upstream)

  • ActorResolver (IAM binding)
  • ProvenanceProvider (callsite fingerprints)
  • SinkPolicy (regex → AST)
  • GatePolicy (optional CI gating)
  • LedgerBackend (ticketing/workflows)

Golden Transcripts (Continuity Tests)

from fidelity_framework_v1 import GoldenTranscript, TranscriptEventMatcher, match_transcript

transcript = GoldenTranscript(
    name="Force overwrite path",
    expected_subsequence=[
        TranscriptEventMatcher('input', 'validate', {'validated': True}),
        TranscriptEventMatcher('commit', 'authority_check', {'authority_mechanism': '--force'}),
        TranscriptEventMatcher('commit', 'execute', {}),
    ],
    allow_extra_events=True,
)

# get_events() is a placeholder for however your test collects the recorded events.
ok, msg = match_transcript(get_events(), transcript)
assert ok, msg

Notes:

  • Matching checks event.data and top‑level fields (e.g., decision, actor_claim).
  • If allow_extra_events=False, transcript matching requires an exact sequence with no extras.
  • Coverage obligation is emitted when an anchor has zero observed events.
  • Partial coverage is not inferred; only zero‑observation is detected.
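
The difference between subsequence matching and exact matching in the notes above can be sketched as follows. This is a standalone illustration that uses plain dicts and tuples rather than the framework's classes; `match_events` and `_matches` are hypothetical names, not the library's `match_transcript`.

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]                     # {"anchor_id", "event_type", "data", ...}
Matcher = Tuple[str, str, Dict[str, Any]]  # (anchor_id, event_type, required fields)

def _matches(event: Event, matcher: Matcher) -> bool:
    anchor_id, event_type, required = matcher
    if event.get("anchor_id") != anchor_id or event.get("event_type") != event_type:
        return False
    data = event.get("data", {})
    # A required field may live in event["data"] or at the top level.
    return all(data.get(k, event.get(k)) == v for k, v in required.items())

def match_events(events: List[Event], expected: List[Matcher],
                 allow_extra_events: bool = True) -> Tuple[bool, str]:
    if not allow_extra_events:
        # Exact mode: same length, same order, no extras.
        if len(events) != len(expected):
            return False, f"expected {len(expected)} events, observed {len(events)}"
        for i, (event, matcher) in enumerate(zip(events, expected)):
            if not _matches(event, matcher):
                return False, f"event {i} does not match {matcher[:2]}"
        return True, "exact match"
    # Subsequence mode: expected events must appear in order; extras are skipped.
    pos = 0
    for matcher in expected:
        while pos < len(events) and not _matches(events[pos], matcher):
            pos += 1
        if pos == len(events):
            return False, f"no event matching {matcher[:2]} in order"
        pos += 1
    return True, "subsequence match"
```

Subsequence mode tolerates unrelated events between the expected ones, which is also why a transcript can match across flows unless the matcher pins `location_id` or `correlation_id`.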

Forbidden Sink Scanning (Bypass Detection)

from fidelity_framework_v1 import assert_no_forbidden_sinks
assert_no_forbidden_sinks("src/processor.py")

Limitation: The default regex patterns are incomplete; for example, they miss subprocess, eval, and requests.post.
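
One way to narrow that gap is to scan the syntax tree instead of the raw text, which is the direction the SinkPolicy extension point (regex → AST) suggests. The sketch below uses only the standard-library `ast` module; `FORBIDDEN_CALLS` and `find_forbidden_calls` are hypothetical names and are not part of the framework.

```python
import ast
from typing import List, Set, Tuple

# Hypothetical deny-list; adjust to the sinks that matter in your codebase.
FORBIDDEN_CALLS: Set[str] = {
    "eval", "exec", "subprocess.run", "subprocess.Popen", "requests.post",
}

def _dotted_name(node: ast.AST) -> str:
    """Rebuild 'a.b.c' from nested Attribute/Name nodes; '' if not a plain name."""
    parts: List[str] = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""

def find_forbidden_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, call name) for each forbidden call in the source."""
    hits: List[Tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name in FORBIDDEN_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

Unlike a text regex, this ignores the word `eval` inside a string or comment. It still misses aliased imports (`import subprocess as sp`, `from subprocess import run`) and dynamic lookups such as `getattr`, so it reduces the blind spot rather than closing it.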


Review Debt Tracking

Core provides: in‑memory ObligationLedger, open_obligations().

Core does NOT provide: persistence, debt metrics, auto‑aging.
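
Because persistence and metrics are left to the adopter, a minimal sketch of what that could look like may help. It assumes each obligation is a dict with `id`, `summary`, and an ISO 8601 `opened_at` timestamp; that shape and the function names are hypothetical and are not the ObligationLedger API.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Dict, List

def save_obligations(path: Path, obligations: List[Dict[str, str]]) -> None:
    """Write open obligations to a JSON file so they survive between runs."""
    path.write_text(json.dumps(obligations, indent=2), encoding="utf-8")

def load_obligations(path: Path) -> List[Dict[str, str]]:
    """Read obligations back; a missing file means nothing has been recorded."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))

def debt_summary(obligations: List[Dict[str, str]], now: datetime) -> Dict[str, float]:
    """Count open obligations and report the age of the oldest one in days."""
    ages = [
        (now - datetime.fromisoformat(o["opened_at"])).total_seconds() / 86400
        for o in obligations
    ]
    return {"open": len(ages), "oldest_days": max(ages, default=0.0)}
```

Use timezone-aware timestamps on both sides (for example `datetime.now(timezone.utc)`), since subtracting a naive datetime from an aware one raises `TypeError`.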


CI Integration (Visibility Only)

Upstream CI runs always exit 0; the framework does not claim authority to block merges.
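
A visibility-only CI step might look like the sketch below. The `report` function is hypothetical; it assumes you already have a status string and a list of obligation summaries from an assessment run.

```python
import sys
from typing import List, TextIO

def report(status: str, obligations: List[str], out: TextIO = sys.stdout) -> int:
    """Print findings for humans and return the process exit code (always 0)."""
    print(f"fidelity status: {status} (not an approval)", file=out)
    for item in obligations:
        print(f"  obligation: {item}", file=out)
    # Visibility only: the exit code never depends on the findings.
    return 0
```

A CI entry point would end with `sys.exit(report(status, obligations))`. Making the exit code depend on the findings is the kind of change the GatePolicy extension and docs/FORKING_PATH.md are meant for.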


FAQ (Short)

  • Is this just testing? No. It makes structure visible; it does not claim correctness.
  • Compliance/audit? Only if auditors accept “records claims, doesn’t verify.”
  • Disagree with findings? Override via human judgment; framework makes it visible.
  • Scale? No. Human review capacity is the bottleneck.
  • Plan was wrong? Update it; framework doesn’t verify plans.

Optional Mitigations (Forks/Process)

Possible improvements (not enabled by default due to complexity):

  • Instrumentation honesty checks (error‑path transcripts, negative tests, CI sink scans)
  • Coverage rigor (fail tests if any AnchorSpec is never observed)
  • Correlation‑scoped assessments (avoid interleaving noise)
  • Order‑matching flexibility (repeated anchors / subsequence ordering)
  • Transcript tightening (require location_id/correlation_id in systems)

Tradeoff: less false confidence, more test burden and maintenance overhead.
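
As an illustration of the correlation-scoped mitigation, the sketch below groups events by `correlation_id` before checking order. It is standalone and uses plain dicts; `group_by_correlation` and `order_holds` are hypothetical names, not framework functions.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]

def group_by_correlation(events: List[Event]) -> Dict[Optional[str], List[Event]]:
    """Split one interleaved event list into per-request lists, preserving order."""
    groups: Dict[Optional[str], List[Event]] = defaultdict(list)
    for event in events:
        groups[event.get("correlation_id")].append(event)
    return dict(groups)

def order_holds(events: List[Event], declared_order: List[str]) -> bool:
    """True if the declared event types appear in order as a subsequence."""
    remaining = iter(e["event_type"] for e in events)
    return all(t in remaining for t in declared_order)

# Request B executes without an authority check, but request A's check
# makes the combined list look fine.
events = [
    {"correlation_id": "A", "event_type": "authority_check"},
    {"correlation_id": "B", "event_type": "execute"},
    {"correlation_id": "A", "event_type": "execute"},
]
declared = ["authority_check", "execute"]
per_request = {cid: order_holds(evts, declared)
               for cid, evts in group_by_correlation(events).items()}
```

Over the full list the order check passes; per request, `A` passes and `B` fails. The cost is that every flow must carry a reliable correlation ID, which adds instrumentation and test burden.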


Remaining Known Limitations (Explicit)

  • Instrumentation honesty is assumed (events can be late, missing, or placeholder).
  • Plan omissions are invisible to assessment.
  • Order checks can misfire under interleaving or repeated loops.
  • Transcripts can match across flows/components unless you include location_id or correlation_id.
  • Assessment does not group by correlation_id (ordering/coverage run over full event list).
  • Schema mismatch makes assessment unassessable.
  • Retrofit overhead can be high. Brownfield adoption is labor‑intensive; see docs/FORKING_PATH.md for fork ideas that reduce retrofit friction.

Support


License

MIT License — use freely, modify as needed.

Final warning: Deploy with eyes open or not at all. This tool makes judgment structure visible—it doesn’t create, enforce, or guarantee judgment happens. If you can’t act on what it surfaces, don’t surface it.
