feat(storage): implement deterministic entity/collection IDs#1787
feat(storage): implement deterministic entity/collection IDs#1787
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| fn compute_id(parent: Id, key: &[u8]) -> Id { | ||
| let mut hasher = Sha256::new(); | ||
| hasher.update(parent.as_bytes()); | ||
| hasher.update(DOMAIN_SEPARATOR_ENTRY); |
There was a problem hiding this comment.
Breaking change to compute_id breaks existing stored data
High Severity
Adding DOMAIN_SEPARATOR_ENTRY to the existing compute_id function changes how entry IDs are computed for all UnorderedMap and UnorderedSet operations. Any existing stored data will become inaccessible because lookups now compute different IDs than what was used to store the data. The PR claims backward compatibility is maintained and the new ID generation is "opt-in via new_with_field_name()", but this change to compute_id affects all map/set operations regardless of how the collection was created.
- Add new_with_field_name() method to Collection that generates deterministic IDs using SHA256 hash of parent_id + field_name - Add new_with_field_name() to all collection types: - Counter (handles positive/negative maps with deterministic IDs) - UnorderedMap - UnorderedSet - Vector - ReplicatedGrowableArray - Update #[app::state] macro to generate Default implementation that uses new_with_field_name() for collection fields - Add tests to verify determinism: - Same field names produce same IDs - Different field names produce different IDs - Parent ID affects child collection IDs Implements #1769 CIP Section: Protocol Invariants Invariant: I9 (Deterministic Entity IDs) Acceptance Criteria: ✅ Same code on two nodes produces identical collection IDs ✅ Nested collections derive IDs correctly (parent + field) ✅ Existing apps continue to work (backward compatibility) ✅ Unit tests verify determinism
The macro was generating a Default implementation that required all fields to implement Default, which breaks apps with non-Default types like enums. Users should manually implement Default or use new_with_field_name() in their init functions for deterministic IDs.
Counter's internal maps were using '{field_name}_positive' and
'{field_name}_negative' as field names, which could silently collide
with user-created collections. For example, a Counter with field name
'visits' would create internal maps with IDs derived from 'visits_positive',
which would collide with a user-created UnorderedMap with field name
'visits_positive'.
Fix by using a reserved internal prefix '__counter_internal_' for
Counter's internal maps. This ensures:
- Counter('visits') creates maps with IDs from '__counter_internal_visits_positive'
- User collections named 'visits_positive' use IDs from 'visits_positive'
- No collision possible
Added test to verify no collision occurs.
Fixes collision issue in crates/storage/src/collections/counter.rs:216-228
The compute_id and compute_collection_id functions both compute SHA256(parent_bytes + name_bytes) without domain separation, which can cause collisions. For example: - A nested collection with field name 'key' creates ID from SHA256(parent_id + 'key') - A map entry with key 'key' creates ID from SHA256(parent_id + 'key') - Both get identical IDs, causing data corruption Fix by adding domain separators: - compute_id uses '__calimero_entry__' separator - compute_collection_id uses '__calimero_collection__' separator This ensures map entries and nested collections never collide even with the same parent and name. Added test to verify no collision occurs. Fixes collision issue in: - crates/storage/src/collections.rs:66-74 (compute_collection_id) - crates/storage/src/collections.rs:57-63 (compute_id)
…ections Counter::new_with_field_name was defined in a generic impl block for any StorageAdaptor, while other collections (UnorderedMap, UnorderedSet, Vector, RGA) only expose new_with_field_name for MainStorage. This created an API inconsistency. Fix by moving new_with_field_name to the MainStorage-only impl block, matching the pattern used by all other collection types. Fixes API inconsistency in crates/storage/src/collections/counter.rs:188-201
f69ac8d to
d3bff4c
Compare
Merobox Proposals Workflows FailedThe following proposal workflow(s) failed:
Please check the workflow logs for more details. |
SDK JS Workflows FailedThe following SDK JS workflow(s) failed:
Please check the workflow logs for more details. |
Merobox Workflows FailedThe following workflow(s) failed after retries:
Please check the workflow logs for more details. |
Combined feature from PR #1786 and #1787: - Add field_name and crdt_type to Metadata struct - Add Element::new_with_field_name and new_with_field_name_and_crdt_type - Update Collection::new_with_field_name to store metadata - Update all collection types to pass their CrdtType: - UnorderedMap -> CrdtType::UnorderedMap - UnorderedSet -> CrdtType::UnorderedSet - Vector -> CrdtType::Vector - Counter -> CrdtType::Counter - RGA -> CrdtType::Rga - Add BorshSerialize/Deserialize/Ord/PartialOrd to CrdtType - Add UserStorage and FrozenStorage to CrdtType enum - Custom Borsh de/serialization for backward compatibility This enables: - Deterministic collection IDs (from #1787) - Schema inference from database metadata (from #1786) - CRDT type-aware merge dispatch
Combined feature from PR #1786 and #1787: - Add field_name and crdt_type to Metadata struct - Add Element::new_with_field_name and new_with_field_name_and_crdt_type - Update Collection::new_with_field_name to store metadata - Update all collection types to pass their CrdtType: - UnorderedMap -> CrdtType::UnorderedMap - UnorderedSet -> CrdtType::UnorderedSet - Vector -> CrdtType::Vector - Counter -> CrdtType::Counter - RGA -> CrdtType::Rga - Add BorshSerialize/Deserialize/Ord/PartialOrd to CrdtType - Add UserStorage and FrozenStorage to CrdtType enum - Custom Borsh de/serialization for backward compatibility This enables: - Deterministic collection IDs (from #1787) - Schema inference from database metadata (from #1786) - CRDT type-aware merge dispatch
Combined feature from PR #1786 and #1787: - Add field_name and crdt_type to Metadata struct - Add Element::new_with_field_name and new_with_field_name_and_crdt_type - Update Collection::new_with_field_name to store metadata - Update all collection types to pass their CrdtType: - UnorderedMap -> CrdtType::UnorderedMap - UnorderedSet -> CrdtType::UnorderedSet - Vector -> CrdtType::Vector - Counter -> CrdtType::Counter - RGA -> CrdtType::Rga - Add BorshSerialize/Deserialize/Ord/PartialOrd to CrdtType - Add UserStorage and FrozenStorage to CrdtType enum - Custom Borsh de/serialization for backward compatibility This enables: - Deterministic collection IDs (from #1787) - Schema inference from database metadata (from #1786) - CRDT type-aware merge dispatch


Summary
Implements issue #1769: Deterministic Entity/Collection IDs
This PR adds deterministic ID generation for collections based on parent ID and field name, ensuring the same application code produces identical collection IDs across all nodes.
Changes
Core Implementation
compute_collection_id()function that generates deterministic IDs using SHA256 hash ofparent_id + field_namenew_with_field_name()method toCollectionthat uses deterministic IDsnew_with_field_name()to all collection types:Counter(special handling for positive/negative maps)UnorderedMapUnorderedSetVectorReplicatedGrowableArrayTesting
CIP Reference
CIP Section: Protocol Invariants
Invariant: I9 (Deterministic Entity IDs)
Acceptance Criteria
Files Modified
crates/storage/src/collections.rs- Core deterministic ID generationcrates/storage/src/collections/counter.rs- Counter supportcrates/storage/src/collections/unordered_map.rs- UnorderedMap supportcrates/storage/src/collections/unordered_set.rs- UnorderedSet supportcrates/storage/src/collections/vector.rs- Vector supportcrates/storage/src/collections/rga.rs- RGA supportapps/state-schema-conformance/src/lib.rs- Updated to use deterministic IDsUsage
To use deterministic IDs, call
new_with_field_name(parent_id, field_name)instead ofnew():For nested collections, pass the parent collection's ID:
Testing
All deterministic tests pass:
test_deterministic_counter_idstest_deterministic_counter_with_parent_idtest_deterministic_map_idstest_deterministic_map_with_parent_idBackward Compatibility
Old
new()methods remain available and functional. The deterministic ID generation is opt-in vianew_with_field_name().Implements #1769
Note
High Risk
High risk because it changes the ID derivation used for map/set entries (
compute_id), which can affect existing persisted data and synchronization semantics even though deterministic collection creation is opt-in.Overview
Implements deterministic collection ID generation via
new_with_field_name(parent_id, field_name)across core collection types (Collection,UnorderedMap,UnorderedSet,Vector,Counter,ReplicatedGrowableArray), so the same app code yields stable IDs across nodes.Adds domain separation to the SHA256-based ID hashing to avoid collisions between nested collection IDs and map entry IDs, plus new unit tests covering determinism, parent-scoping, and collision prevention (including
Counter’s internal positive/negative maps). Updatesstate-schema-conformanceapp initialization to use the deterministic constructors for all collections.Written by Cursor Bugbot for commit 3b55655. This will update automatically on new commits. Configure here.