Determinism breaks under multi-threaded workloads #32

@siphonite

Problem

Crash point IDs are assigned using a global atomic counter (static CRASH_COUNTER: AtomicUsize).

Code reference:
https://github.com/siphonite/first/blob/main/src/rt.rs#L11

static CRASH_COUNTER: AtomicUsize = AtomicUsize::new(0);

https://github.com/siphonite/first/blob/main/src/rt.rs#L126

let previous = CRASH_COUNTER.fetch_add(1, Ordering::SeqCst);

If the workload spawns threads, the order in which crash_point() calls reach the counter depends on OS scheduling, so the ID a given call receives can differ from run to run.
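
A minimal sketch of the effect, not the code in rt.rs: the counter here is local to the function so the example can be re-run, and `assign_ids` and the worker indices are hypothetical names introduced for illustration. Each worker takes IDs from one shared counter, exactly as `fetch_add(1, Ordering::SeqCst)` does above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs `threads` workers that each hit `points_per_thread` crash points.
/// Returns (worker index, crash point ID) pairs sorted by ID.
/// `counter` is a hypothetical stand-in for the global CRASH_COUNTER.
fn assign_ids(threads: usize, points_per_thread: usize) -> Vec<(usize, usize)> {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|worker| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                let mut seen = Vec::with_capacity(points_per_thread);
                for _ in 0..points_per_thread {
                    // Same operation as the crash_point() counter bump.
                    let id = counter.fetch_add(1, Ordering::SeqCst);
                    seen.push((worker, id));
                    // Give the scheduler a chance to interleave workers.
                    thread::yield_now();
                }
                seen
            })
        })
        .collect();

    let mut all: Vec<(usize, usize)> = handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker panicked"))
        .collect();
    all.sort_by_key(|&(_, id)| id);
    all
}
```

Calling `assign_ids(4, 3)` always yields IDs 0 through 11, each exactly once, but which worker owns a given ID is decided by the scheduler and may change between invocations.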

Impact

  • Same crash point ID refers to different logical operations across runs: Run 1 might assign ID 5 to thread A's operation, while Run 2 assigns ID 5 to thread B's operation.
  • The “deterministic crash testing” claim becomes false: Deterministic reproducibility is the cornerstone of FIRST.
  • Debugging and reproduction are unreliable: Replaying a crash at ID X might not hit the same code path.

Acceptance Criteria

One of the following options must be explicitly chosen:

  • Enforce single-threaded workloads: Detect threads and panic, or clearly document this constraint.
  • Make crash points thread-local: Maintain separate counters per thread (complex to orchestrate).
  • Require explicit determinism: Have users add manual instrumentation that fixes the order in which threads reach crash points.
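
To make the first two options concrete, here is a rough sketch under stated assumptions. None of these names exist in FIRST today: `is_owner_thread`, `crash_point_local`, `run_workers`, and the worker labels are hypothetical. The sketch also assumes each thread can be given a stable label (here, its spawn index), which is the orchestration cost noted in the second option, since OS thread IDs are not stable across runs.

```rust
use std::cell::Cell;
use std::sync::OnceLock;
use std::thread::{self, ThreadId};

// Option 1 sketch: remember the first thread that asks, report any other.
// A real check would likely panic instead of returning false.
static OWNER: OnceLock<ThreadId> = OnceLock::new();

fn is_owner_thread() -> bool {
    let current = thread::current().id();
    *OWNER.get_or_init(|| current) == current
}

thread_local! {
    // Option 2 sketch: one counter per thread instead of a global one.
    static LOCAL_COUNTER: Cell<usize> = Cell::new(0);
}

/// Returns a crash point ID of the form (worker label, per-thread index).
/// The label must be assigned deterministically by the caller.
fn crash_point_local(worker: usize) -> (usize, usize) {
    LOCAL_COUNTER.with(|c| {
        let index = c.get();
        c.set(index + 1);
        (worker, index)
    })
}

/// Spawns `workers` threads that each hit `points` crash points and
/// returns the IDs each worker observed, indexed by spawn order.
fn run_workers(workers: usize, points: usize) -> Vec<Vec<(usize, usize)>> {
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            thread::spawn(move || {
                (0..points).map(|_| crash_point_local(w)).collect::<Vec<_>>()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

With per-thread counters, worker `w` sees `(w, 0)`, `(w, 1)`, and so on regardless of scheduling, so an ID names the same logical operation in every run. The trade-off is that the ID is no longer a single integer, and the order of crash points across threads is still unspecified.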

Labels: design, documentation