Skip to content

Conversation

@canadaduane
Copy link
Contributor

This PR fixes an issue when trying to fork a doc, using the same PeerID, then applyDiff back to the original doc.

In an initial test, this is what I observed on the JS side:

it("MINIMAL REPRODUCTION: same peer ID + setContainer = hang", () => {
  const doc1 = new LoroDoc();
  doc1.setPeerId("1");

  const doc2 = doc1.fork();
  doc2.setPeerId("1"); // Same peer ID

  // The key prerequisite: setContainer creates a nested container
  const map = doc2.getMap("root");
  map.setContainer("nested", new LoroMap());
  doc2.commit();

  const frontiersBefore = doc1.frontiers();
  const frontiersAfter = doc2.frontiers();
  const diff = doc2.diff(frontiersBefore, frontiersAfter, false);

  // THIS HANGS
  doc1.applyDiff(diff);
});
  1. The Root Cause: The infinite loop in _apply_diff occurs because container_remap contains a self-referential entry (ID -> ID). The resolution logic while let Some(rid) = container_remap.get(&id) assumes that remapping implies a change to a different ID. When rid == id, this assumption is violated, causing the loop to spin forever.

  2. Why old_id == new_id happens: In the "fork and merge" scenario with the same PeerID, both the source document (doc2) and the target document (doc1) start from the same state (same PeerID, same Counter). When doc2 creates a container, it assigns it ID X. When doc1 applies the diff and executes the same creation operation, it deterministically generates the exact same ID X. This is expected behavior for CRDTs and deterministic systems.

  3. Why skipping the insert is correct: The purpose of container_remap is to translate IDs from the "diff space" to the "local space".

    • If old_id != new_id, we need a translation so that subsequent operations in the diff referring to old_id are applied to new_id locally.
    • If old_id == new_id, the ID in the diff is already valid in the local document (because we just created a container with that exact ID). Therefore, no translation is needed. The identity mapping is implicit.
    • By skipping the insertion into container_remap, we ensure that container_remap.get(&id) returns None, the loop terminates immediately, and the code proceeds to use id (which is correct).
  4. Why this level: Fixing it in handler.rs (where the mapping is created) is better than fixing it in loro.rs (where the mapping is consumed).

    • It prevents the invalid state (ID -> ID) from ever entering the data structure.
    • It avoids adding overhead to the hot path in _apply_diff (checking for cycles or equality in every iteration).
    • It correctly models the semantic intent: "If the IDs match, no remapping is required."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant