feat(storage): combine deterministic IDs with metadata storage by xilosada · Pull Request #1864 · calimero-network/core

xilosada · 2026-02-05T07:22:45Z

Summary

Combines the features from PR #1787 (deterministic entity IDs) and PR #1786 (metadata storage) into a single coherent implementation.

Changes

From #1787 - Deterministic IDs

compute_collection_id(parent_id, field_name) for deterministic ID generation
new_with_field_name(parent_id, field_name) on all collection types
Domain separation to prevent ID collisions

From #1786 - Metadata Storage

Added field_name: Option<String> to Metadata struct
Added crdt_type: Option<CrdtType> to Metadata struct
Element::new_with_field_name() and new_with_field_name_and_crdt_type()
All collections now store their CrdtType in metadata:
- UnorderedMap → CrdtType::UnorderedMap
- UnorderedSet → CrdtType::UnorderedSet
- Vector → CrdtType::Vector
- Counter → CrdtType::Counter
- RGA → CrdtType::Rga

Backward Compatibility

Custom Borsh de/serialization for Metadata handles old data gracefully
BorshSerialize, BorshDeserialize, Ord, PartialOrd added to CrdtType

Benefits

Deterministic IDs: Same code produces identical collection IDs across nodes
Schema Inference: field_name enables tools to infer schema from database
CRDT Merge Dispatch: crdt_type enables type-aware merge strategies

Follow-up

A separate PR will add merodb schema inference using this metadata.

Supersedes

Implements #1769

Note

High Risk
Changes core storage identity/serialization and adds ID reassignment logic that mutates persisted state and indexes; also breaks backward compatibility for existing stored data.

Overview
Ensures deterministic, schema-identifiable storage entities by hashing collection IDs from (parent_id, field_name) with domain separation, and persisting field_name + crdt_type on every stored Element.

The #[app::logic] init path is wrapped to call an auto-generated __assign_deterministic_ids() (via #[app::state]), which reassigns top-level collection IDs after user init() runs; storage adds ID reassignment/cleanup helpers (including index reference removal without tombstones) and exposes new_with_field_name* constructors across collections. CRDT types are now explicit (UserStorage, FrozenStorage, etc.), and app/tests are updated to use field-name constructors and validate deterministic IDs/collision avoidance.

Breaking change: Metadata Borsh layout now includes crdt_type/field_name, so old on-disk data will not deserialize.

^{Written by Cursor Bugbot for commit 46224a8. This will update automatically on new commits. Configure here.}

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 186.1s

🔴 Critical (1)

1. Breaking change to compute_id causes data loss for existing entries

File: crates/storage/src/collections.rs (line 66-73) | Consensus: 1/1 agents ✓

Adding DOMAIN_SEPARATOR_ENTRY to the existing compute_id function changes how IDs are computed for map entries. All existing map entries in databases created before this change will become inaccessible because lookups will compute different IDs. This is a silent data loss issue - existing data won't be found or modified, but won't be deleted either, leading to orphaned entries and duplicate data.

Suggested fix:

Either: (1) Provide a migration path that re-hashes existing entries, or (2) Keep the old `compute_id` for reading and only use domain separation for new entries, or (3) Version the ID computation scheme and handle both in lookups. Consider adding a database version marker to detect old vs new format.

🟡 Warning (1)

1. Custom Borsh deserialization masks corruption errors

File: crates/storage/src/entities.rs (line 495-513) | Consensus: 1/1 agents ✓

The BorshDeserialize implementation uses unwrap_or(None) to handle missing fields for backward compatibility. However, this approach also silently masks actual data corruption errors. If the crdt_type field is partially corrupted (e.g., invalid enum discriminant), it will be silently treated as None instead of returning an error. This could lead to subtle bugs where corrupted data is accepted.

Suggested fix:

Check for EOF explicitly before attempting to read optional fields. For example: try to peek a byte first, or use a length-prefixed format, or read into a buffer first to distinguish between 'no more data' vs 'corrupted data'.

💡 Suggestion (2)

1. Original Element::new() doesn't initialize new metadata fields

File: crates/storage/src/entities.rs (line 182-195) | Consensus: 1/1 agents ✓

The original Element::new() method doesn't set crdt_type and field_name fields. While this maintains backward compatibility, it means entities created via the old API won't have CRDT type information, potentially causing merge dispatch to fall back to LWW semantics even for built-in CRDT types.

Suggested fix:

Document this limitation clearly, or consider deprecating `Element::new()` in favor of the new constructors that accept field_name and crdt_type.

2. Counter internal maps have inconsistent CRDT types

File: crates/storage/src/collections/counter.rs (line 207-232) | Consensus: 1/1 agents ✓

The positive map is created with CrdtType::Counter while the negative map is created via new_with_field_name_internal which sets CrdtType::UnorderedMap. This asymmetry could cause confusion during CRDT merge dispatch - the negative map will be treated as a generic UnorderedMap rather than as part of a Counter.

Suggested fix:

Consider creating both internal maps with a consistent CRDT type (either both as Counter or introduce a new internal marker type like `CrdtType::CounterInternal`), or document why the asymmetry is intentional.

📝 Nitpick (1)

1. LwwRegister fields not updated to use new_with_field_name

File: apps/state-schema-conformance/src/lib.rs (line 125-127) | Consensus: 1/1 agents ✓

The LwwRegister fields (status, user_id, counter) still use the old LwwRegister::new() method instead of a hypothetical new_with_field_name. This is inconsistent with the collection types and means these fields won't have deterministic IDs or stored field names. While LwwRegister may not need this (as it's a simple register), the inconsistency could be confusing.

Suggested fix:

Either add `new_with_field_name` to LwwRegister for consistency, or document why LwwRegister doesn't need deterministic IDs (since it's stored inline, not as a separate collection).

_{🤖 Generated by AI Code Reviewer | Review ID: review-d4cf6b50}

crates/storage/src/collections.rs

crates/storage/src/entities.rs

crates/storage/src/collections/counter.rs

apps/state-schema-conformance/src/lib.rs

github-actions · 2026-02-05T08:14:40Z

Merobox Workflows Failed

The following workflow(s) failed after retries:

blobs/workflows/blobs-example.yml
blobs/workflows/blobs.yml
collaborative-editor/workflows/collaborative-editor.yml
kv-store-init/workflows/kv-store-init.yml
kv-store-with-handlers/workflows/kv-store-with-handlers.yml
kv-store-with-user-and-frozen-storage/workflows/test_frozen_storage.yml
kv-store-with-user-and-frozen-storage/workflows/test_user_storage.yml
kv-store/workflows/bundle-invitation-test.yml
nested-crdt-test/workflows/nested-crdt-test.yml
private_data/workflows/example.yml
private_data/workflows/private-data.yml
team-metrics-custom/workflows/team-metrics-custom.yml
team-metrics-macro/workflows/team-metrics-macro.yml
xcall-example/workflows/xcall-example.yml

Please check the workflow logs for more details.

- Add new_with_field_name() method to Collection that generates deterministic IDs using SHA256 hash of parent_id + field_name - Add new_with_field_name() to all collection types: - Counter (handles positive/negative maps with deterministic IDs) - UnorderedMap - UnorderedSet - Vector - ReplicatedGrowableArray - Update #[app::state] macro to generate Default implementation that uses new_with_field_name() for collection fields - Add tests to verify determinism: - Same field names produce same IDs - Different field names produce different IDs - Parent ID affects child collection IDs Implements #1769 CIP Section: Protocol Invariants Invariant: I9 (Deterministic Entity IDs) Acceptance Criteria: ✅ Same code on two nodes produces identical collection IDs ✅ Nested collections derive IDs correctly (parent + field) ✅ Existing apps continue to work (backward compatibility) ✅ Unit tests verify determinism

The macro was generating a Default implementation that required all fields to implement Default, which breaks apps with non-Default types like enums. Users should manually implement Default or use new_with_field_name() in their init functions for deterministic IDs.

Counter's internal maps were using '{field_name}_positive' and '{field_name}_negative' as field names, which could silently collide with user-created collections. For example, a Counter with field name 'visits' would create internal maps with IDs derived from 'visits_positive', which would collide with a user-created UnorderedMap with field name 'visits_positive'. Fix by using a reserved internal prefix '__counter_internal_' for Counter's internal maps. This ensures: - Counter('visits') creates maps with IDs from '__counter_internal_visits_positive' - User collections named 'visits_positive' use IDs from 'visits_positive' - No collision possible Added test to verify no collision occurs. Fixes collision issue in crates/storage/src/collections/counter.rs:216-228

The compute_id and compute_collection_id functions both compute SHA256(parent_bytes + name_bytes) without domain separation, which can cause collisions. For example: - A nested collection with field name 'key' creates ID from SHA256(parent_id + 'key') - A map entry with key 'key' creates ID from SHA256(parent_id + 'key') - Both get identical IDs, causing data corruption Fix by adding domain separators: - compute_id uses '__calimero_entry__' separator - compute_collection_id uses '__calimero_collection__' separator This ensures map entries and nested collections never collide even with the same parent and name. Added test to verify no collision occurs. Fixes collision issue in: - crates/storage/src/collections.rs:66-74 (compute_collection_id) - crates/storage/src/collections.rs:57-63 (compute_id)

…ections Counter::new_with_field_name was defined in a generic impl block for any StorageAdaptor, while other collections (UnorderedMap, UnorderedSet, Vector, RGA) only expose new_with_field_name for MainStorage. This created an API inconsistency. Fix by moving new_with_field_name to the MainStorage-only impl block, matching the pattern used by all other collection types. Fixes API inconsistency in crates/storage/src/collections/counter.rs:188-201

Combined feature from PR #1786 and #1787: - Add field_name and crdt_type to Metadata struct - Add Element::new_with_field_name and new_with_field_name_and_crdt_type - Update Collection::new_with_field_name to store metadata - Update all collection types to pass their CrdtType: - UnorderedMap -> CrdtType::UnorderedMap - UnorderedSet -> CrdtType::UnorderedSet - Vector -> CrdtType::Vector - Counter -> CrdtType::Counter - RGA -> CrdtType::Rga - Add BorshSerialize/Deserialize/Ord/PartialOrd to CrdtType - Add UserStorage and FrozenStorage to CrdtType enum - Custom Borsh de/serialization for backward compatibility This enables: - Deterministic collection IDs (from #1787) - Schema inference from database metadata (from #1786) - CRDT type-aware merge dispatch

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 127.6s

🔴 Critical issues require attention

✅ 4 Fixed | 🆕 5 New

✅ Fixed Issues

The following issues from previous reviews have been addressed:

~~Breaking change to compute_id causes data loss for existing entries~~ (crates/storage/src/collections.rs:66)
~~Custom Borsh deserialization masks corruption errors~~ (crates/storage/src/entities.rs:495)
~~Counter internal maps have inconsistent CRDT types~~ (crates/storage/src/collections/counter.rs:207)
~~LwwRegister fields not updated to use new_with_field_name~~ (apps/state-schema-conformance/src/lib.rs:125)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Breaking change: adding domain separator to compute_id changes all existing entry IDs

File: crates/storage/src/collections.rs (line 66-71) | Consensus: 1/1 agents ✓

The compute_id function now includes DOMAIN_SEPARATOR_ENTRY in the hash computation. This changes the ID calculation for ALL existing map entries, meaning existing data stored before this change will become inaccessible because lookups will compute different IDs than what was used when the data was stored. This is a silent data loss scenario.

Suggested fix:

Either: (1) Add a migration mechanism to rehash existing entries, (2) Version the storage format and handle old entries without the domain separator, or (3) Only apply domain separation to new collections created with `new_with_field_name` while keeping backward compatibility for `new()`.

🟡 Warning (1)

1. Backward compatibility deserialization swallows all errors, not just EOF

File: crates/storage/src/entities.rs (line 498-502) | Consensus: 1/1 agents ✓

The custom BorshDeserialize for Metadata uses unwrap_or(None) when reading crdt_type and field_name. While this handles missing fields in old data correctly, it also silently converts actual deserialization errors (like corrupted data or invalid discriminants) to None, potentially masking data corruption issues.

Suggested fix:

Differentiate between EOF (expected for old data) and other errors (potential corruption). Consider checking the error kind: `match Option::<CrdtType>::deserialize_reader(reader) { Ok(v) => v, Err(e) if matches!(e.kind(), std::io::ErrorKind::UnexpectedEof) => None, Err(e) => return Err(e) }`

💡 Suggestion (2)

1. LwwRegister lacks new_with_field_name API unlike other collection types

File: apps/state-schema-conformance/src/lib.rs (line 129-131) | Consensus: 1/1 agents ✓

All collection types (UnorderedMap, UnorderedSet, Vector, Counter, RGA) received new_with_field_name methods, but LwwRegister still only has new(). The state-schema-conformance app uses LwwRegister::new(...) for status, user_id, counter, name, and active fields. This creates an API inconsistency where some CRDT types support deterministic IDs and metadata storage while others don't.

Suggested fix:

Add `new_with_field_name` to `LwwRegister` for API consistency, or document why LwwRegister intentionally doesn't need this (e.g., if registers don't have their own storage IDs).

2. Counter's internal map prefix lacks runtime enforcement

File: crates/storage/src/collections/counter.rs (line 215-229) | Consensus: 1/1 agents ✓

The counter uses __counter_internal_ prefix to namespace its internal positive/negative maps. While well-documented in comments, there's no runtime check preventing a user from accidentally creating a collection with a field name like __counter_internal_visits_positive, which could collide with a Counter named visits.

Suggested fix:

Consider adding a runtime check in `compute_collection_id` or `new_with_field_name` that rejects field names starting with reserved prefixes like `__counter_internal_` and logs a warning, or document this as a reserved namespace in public API docs.

📝 Nitpick (1)

1. New CrdtType variants lack documentation

File: crates/storage/src/collections/crdt_meta.rs (line 27-34) | Consensus: 1/1 agents ✓

The UserStorage and FrozenStorage variants were added to CrdtType but have minimal doc comments compared to other variants. It would be helpful to explain when these types are used and their merge semantics.

Suggested fix:

Add documentation similar to other variants, e.g., `/// User-specific isolated storage (no cross-user merging)` for UserStorage.

_{🤖 Generated by AI Code Reviewer | Review ID: review-4f002b0e}

crates/storage/src/collections.rs

crates/storage/src/entities.rs

apps/state-schema-conformance/src/lib.rs

crates/storage/src/collections/counter.rs

crates/storage/src/collections/crdt_meta.rs

- new_with_field_name() now takes only field_name (no parent_id) - new() generates random IDs (for nested collections in maps) - Deterministic IDs derived from field_name only - Update apps and tests to use simplified API This follows the POC branch approach where: - Top-level state fields use new_with_field_name('field') - Nested map values use new() - merge happens by key, not ID

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 236.7s

🔴 Critical issues require attention

✅ 9 Fixed | 🆕 5 New

✅ Fixed Issues

The following issues from previous reviews have been addressed:

~~Breaking change to compute_id causes data loss for existing entries~~ (crates/storage/src/collections.rs:66)
~~Custom Borsh deserialization masks corruption errors~~ (crates/storage/src/entities.rs:495)
~~Counter internal maps have inconsistent CRDT types~~ (crates/storage/src/collections/counter.rs:218)
~~LwwRegister fields not updated to use new_with_field_name~~ (apps/state-schema-conformance/src/lib.rs:125)
~~Breaking change: adding domain separator to compute_id changes all existing entry IDs~~ (crates/storage/src/collections.rs:66)
~~Backward compatibility deserialization swallows all errors, not just EOF~~ (crates/storage/src/entities.rs:498)
~~LwwRegister lacks new_with_field_name API unlike other collection types~~ (apps/state-schema-conformance/src/lib.rs:129)
~~Counter's internal map prefix lacks runtime enforcement~~ (crates/storage/src/collections/counter.rs:226)
~~New CrdtType variants lack documentation~~ (crates/storage/src/collections/crdt_meta.rs:27)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Breaking change: compute_id hash computation modified

File: crates/storage/src/collections.rs (line 66-72) | Consensus: 1/1 agents ✓

Adding DOMAIN_SEPARATOR_ENTRY to the existing compute_id function changes the hash computation for ALL existing map entries. Any data stored with the old algorithm (without domain separator) will have different computed IDs, causing lookups to fail and effectively making existing data inaccessible. This is a breaking change for any production deployment.

Suggested fix:

Either: (1) Add a migration path that re-hashes existing entries, (2) Create a separate function for new entries while keeping backward compatibility for old lookups, or (3) Version the ID computation and try both algorithms during lookup.

🟡 Warning (3)

1. Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used

File: crates/storage/src/collections/crdt_meta.rs (line 30-33) | Consensus: 1/1 agents ✓

The UserStorage and FrozenStorage variants are added to CrdtType enum but are never assigned anywhere in the codebase. The FrozenStorage::new_with_field_name and UserStorage::new_with_field_name methods use Element::new_with_field_name which sets crdt_type: None, not the corresponding variant. This violates the no-dead-code rule and the stated goal of storing CRDT types in metadata.

Suggested fix:

Either use these variants in `FrozenStorage::new_with_field_name` and `UserStorage::new_with_field_name` by calling `Element::new_with_field_name_and_crdt_type(..., CrdtType::FrozenStorage)`, or remove these unused variants.

2. Backward compatibility deserialization swallows all errors

File: crates/storage/src/entities.rs (line 493-498) | Consensus: 1/1 agents ✓

The custom BorshDeserialize implementation uses unwrap_or(None) which catches ANY deserialization error, not just EOF/missing data. If the crdt_type or field_name bytes are corrupted (e.g., partial write, invalid enum variant), the error is silently ignored and None is returned. This could mask data corruption.

Suggested fix:

Check specifically for EOF/unexpected end of input errors and only return None for those cases. Propagate other errors (invalid data, corruption) to the caller. For example: `match Option::<CrdtType>::deserialize_reader(reader) { Ok(v) => v, Err(e) if e.kind() == ErrorKind::UnexpectedEof => None, Err(e) => return Err(e) }`

3. FrozenStorage doesn't set its CrdtType in metadata

File: crates/storage/src/collections/frozen.rs (line 51-56) | Consensus: 1/1 agents ✓

The new_with_field_name method creates an Element with crdt_type: None instead of CrdtType::FrozenStorage. This is inconsistent with other collections (UnorderedMap, Vector, etc.) that properly set their CRDT type. The inner UnorderedMap gets CrdtType::UnorderedMap, but the outer FrozenStorage wrapper has no type information.

Suggested fix:

Change `Element::new_with_field_name(...)` to `Element::new_with_field_name_and_crdt_type(..., CrdtType::FrozenStorage)` and similarly for UserStorage.

💡 Suggestion (1)

1. Internal prefix collision possible with user-defined collections

File: crates/storage/src/collections/counter.rs (line 230-245) | Consensus: 1/1 agents ✓

The reserved prefix __counter_internal_ is documented but not enforced. A user could create a collection with field name __counter_internal_visits_positive which would collide with a Counter's internal map. While unlikely, this could cause subtle bugs in a production system.

Suggested fix:

Consider validating field names in `new_with_field_name` methods to reject names starting with reserved prefixes like `__counter_internal_`, `__frozen_storage_`, `__user_storage_`, etc. Alternatively, document this as a reserved namespace in public API documentation.

_{🤖 Generated by AI Code Reviewer | Review ID: review-7e88dbc7}

crates/storage/src/collections.rs

crates/storage/src/collections/crdt_meta.rs

crates/storage/src/entities.rs

crates/storage/src/collections/frozen.rs

crates/storage/src/collections/counter.rs

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 632.0s

🔴 Critical issues require attention

✅ 23 Fixed | 🆕 5 New

✅ Fixed Issues

The following issues from previous reviews have been addressed:

~~Breaking change to compute_id causes data loss for existing entries~~ (crates/storage/src/collections.rs:66)
~~Custom Borsh deserialization masks corruption errors~~ (crates/storage/src/entities.rs:495)
~~Counter internal maps have inconsistent CRDT types~~ (crates/storage/src/collections/counter.rs:218)
~~LwwRegister fields not updated to use new_with_field_name~~ (apps/state-schema-conformance/src/lib.rs:125)
~~Breaking change: adding domain separator to compute_id changes all existing entry IDs~~ (crates/storage/src/collections.rs:66)
~~Backward compatibility deserialization swallows all errors, not just EOF~~ (crates/storage/src/entities.rs:498)
~~LwwRegister lacks new_with_field_name API unlike other collection types~~ (apps/state-schema-conformance/src/lib.rs:129)
~~Counter's internal map prefix lacks runtime enforcement~~ (crates/storage/src/collections/counter.rs:226)
~~New CrdtType variants lack documentation~~ (crates/storage/src/collections/crdt_meta.rs:27)
~~Breaking change: compute_id hash computation modified~~ (crates/storage/src/collections.rs:66)
~~Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used~~ (crates/storage/src/collections/crdt_meta.rs:30)
~~Backward compatibility deserialization swallows all errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage doesn't set its CrdtType in metadata~~ (crates/storage/src/collections/frozen.rs:51)
~~Internal prefix collision possible with user-defined collections~~ (crates/storage/src/collections/counter.rs:230)
~~Breaking change: compute_id adds domain separator to existing function~~ (crates/storage/src/collections.rs:65)
~~Silent error suppression in Metadata deserialization masks corruption~~ (crates/storage/src/entities.rs:507)
~~FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing~~ (crates/storage/src/collections/frozen.rs:50)
~~Reserved prefix _counter_internal lacks enforcement~~ (crates/storage/src/collections/counter.rs:218)
~~Breaking change: compute_id now includes domain separator~~ (crates/storage/src/collections.rs:69)
~~FrozenStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/frozen.rs:53)
~~UserStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/user.rs:50)
~~Silent error swallowing in Metadata deserialization~~ (crates/storage/src/entities.rs:503)
~~Reserved prefix '_counter_internal' could collide with user collections~~ (crates/storage/src/collections/counter.rs:228)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Breaking change: compute_id now includes domain separator

File: crates/storage/src/collections.rs (line 57-63) | Consensus: 1/1 agents ✓

Adding DOMAIN_SEPARATOR_ENTRY to compute_id() changes the hash computation for ALL existing map entries. Existing data stored with the old compute_id(parent, key) will become inaccessible because lookups will now use compute_id(parent, DOMAIN_SEPARATOR_ENTRY + key), producing different IDs. This is a silent data loss scenario for any existing deployments.

Suggested fix:

Either: (1) Add a migration path that rehashes existing entries, (2) Keep backward compatibility by trying both old and new ID computation on lookup, or (3) Document this as a breaking change requiring data migration and bump the major version.

🟡 Warning (2)

1. Unsafe backward compatibility: unwrap_or(None) silently swallows errors

File: crates/storage/src/entities.rs (line 493-517) | Consensus: 1/1 agents ✓

The custom BorshDeserialize for Metadata uses unwrap_or(None) when reading crdt_type and field_name. This conflates three distinct cases: (1) EOF/no more data (expected for old data), (2) valid None value, and (3) corrupted/malformed data. Corrupted data will be silently treated as None instead of surfacing the error, potentially masking data integrity issues.

Suggested fix:

Check for EOF explicitly before deserializing optional fields. If data exists but deserialization fails, propagate the error rather than defaulting to None. Consider using a version byte prefix for the Metadata format to handle schema evolution safely.

2. FrozenStorage and UserStorage don't set crdt_type in metadata

File: crates/storage/src/collections/frozen.rs (line 47-56) | Consensus: 1/1 agents ✓

While other collections (UnorderedMap, Vector, Counter, etc.) use new_with_field_name_and_crdt_type which sets the CRDT type, FrozenStorage's new_with_field_name only calls Element::new_with_field_name which does NOT set crdt_type. This inconsistency means FrozenStorage entities won't have CRDT type metadata, which could break the merge dispatch feature that depends on crdt_type.

Suggested fix:

Update FrozenStorage and UserStorage to use `Element::new_with_field_name_and_crdt_type` with `CrdtType::FrozenStorage` and `CrdtType::UserStorage` respectively. The CrdtType enum already has these variants.

💡 Suggestion (2)

1. Reserved prefix collision possible with user-created collections

File: crates/storage/src/collections/counter.rs (line 221-243) | Consensus: 1/1 agents ✓

Counter uses __counter_internal_{field_name}_positive naming convention. While the test verifies internal consistency, a user could theoretically create a collection with field name __counter_internal_visits_positive that would have the same ID as a Counter's internal positive map for field visits. Consider documenting that __ prefixed field names are reserved, or validating field names don't start with reserved prefixes.

Suggested fix:

Add validation in `new_with_field_name` methods to reject field names starting with `__` (reserved prefix), or document clearly that such prefixes are reserved and will cause undefined behavior.

2. Element::new doesn't initialize crdt_type and field_name

File: crates/storage/src/entities.rs (line 182-195) | Consensus: 1/1 agents ✓

The existing Element::new method creates elements without crdt_type and field_name fields. While the new methods new_with_field_name and new_with_field_name_and_crdt_type are added, the PR diff shows they need to explicitly add crdt_type: None, field_name: None to the existing constructor. However, if someone calls Element::new() for a collection, the metadata won't reflect the CRDT type.

Suggested fix:

Consider whether `Element::new` should take an optional CrdtType parameter, or document clearly that `new_with_field_name_and_crdt_type` should be preferred for collection storage elements.

_{🤖 Generated by AI Code Reviewer | Review ID: review-1b27efc5}

crates/storage/src/collections.rs

crates/storage/src/entities.rs

crates/storage/src/collections/frozen.rs

crates/storage/src/collections/counter.rs

Add method to reassign collection IDs to deterministic values based on field names. This enables the #[app::state] macro to fix IDs after init() returns, ensuring all top-level collections have deterministic IDs regardless of how they were created. Changes: - Add Element::reassign_id_and_field_name() for updating ID and metadata - Add Collection::reassign_deterministic_id() and variant with crdt_type - Add reassign_deterministic_id() to all collection types: - UnorderedMap, Vector, UnorderedSet, Counter, RGA - UserStorage, FrozenStorage - Make compute_collection_id pub(crate) for UserStorage/FrozenStorage

The #[app::state] macro now generates a __assign_deterministic_ids() method that is called automatically after init() returns. This ensures all top-level collections have deterministic IDs based on their field names, regardless of how they were created in init(). This satisfies CIP Invariant I9: Deterministic Entity IDs > Given the same application code and field names, all nodes MUST > generate identical entity IDs for the same logical entities. Changes: - Add generate_assign_deterministic_ids_impl() to state.rs - Generate __assign_deterministic_ids() method on state structs - Wrap init() call in logic/method.rs to call __assign_deterministic_ids() Benefits: - Existing apps work without code changes (just recompile) - Users can use simple new() in init() without worrying about IDs - IDs are deterministic (macro ensures it) - Migrations 'just work' when recompiling

Previously UserStorage and FrozenStorage used CrdtType::Custom("...") instead of the proper enum variants CrdtType::UserStorage and CrdtType::FrozenStorage that already existed in the CrdtType enum. Changes: - UserStorage::crdt_type() returns CrdtType::UserStorage - FrozenStorage::crdt_type() returns CrdtType::FrozenStorage - new_with_field_name() and reassign_deterministic_id() now set crdt_type in the Element metadata This ensures proper CRDT type identification for merge dispatch.

Remove custom BorshSerialize/BorshDeserialize implementations for Metadata that handled old data without crdt_type and field_name fields. Breaking change: Old data without crdt_type/field_name will fail to deserialize. This is intentional - we require fresh nodes with the new data format. This simplifies the code and removes the unwrap_or(None) fallbacks that were flagged by MeroReviewer.

Add detailed documentation for each CrdtType variant explaining: - Merge semantics for each type - When/how IDs are assigned - CIP Invariant I9 compliance - Nested collection behavior Also add CrdtType::Record variant for composite structs (root app state) that merge field-by-field via the auto-generated Mergeable impl.

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 559.4s

🔴 Critical issues require attention

✅ 27 Fixed | 🆕 5 New

✅ Fixed Issues

The following issues from previous reviews have been addressed:

~~Breaking change to compute_id causes data loss for existing entries~~ (crates/storage/src/collections.rs:66)
~~Custom Borsh deserialization masks corruption errors~~ (crates/storage/src/entities.rs:495)
~~Counter internal maps have inconsistent CRDT types~~ (crates/storage/src/collections/counter.rs:218)
~~LwwRegister fields not updated to use new_with_field_name~~ (apps/state-schema-conformance/src/lib.rs:125)
~~Breaking change: adding domain separator to compute_id changes all existing entry IDs~~ (crates/storage/src/collections.rs:66)
~~Backward compatibility deserialization swallows all errors, not just EOF~~ (crates/storage/src/entities.rs:498)
~~LwwRegister lacks new_with_field_name API unlike other collection types~~ (apps/state-schema-conformance/src/lib.rs:129)
~~Counter's internal map prefix lacks runtime enforcement~~ (crates/storage/src/collections/counter.rs:226)
~~New CrdtType variants lack documentation~~ (crates/storage/src/collections/crdt_meta.rs:69)
~~Breaking change: compute_id hash computation modified~~ (crates/storage/src/collections.rs:66)
~~Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used~~ (crates/storage/src/collections/crdt_meta.rs:30)
~~Backward compatibility deserialization swallows all errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage doesn't set its CrdtType in metadata~~ (crates/storage/src/collections/frozen.rs:51)
~~Internal prefix collision possible with user-defined collections~~ (crates/storage/src/collections/counter.rs:230)
~~Breaking change: compute_id adds domain separator to existing function~~ (crates/storage/src/collections.rs:65)
~~Silent error suppression in Metadata deserialization masks corruption~~ (crates/storage/src/entities.rs:507)
~~FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing~~ (crates/storage/src/collections/frozen.rs:50)
~~Reserved prefix _counter_internal lacks enforcement~~ (crates/storage/src/collections/counter.rs:218)
~~Breaking change: compute_id now includes domain separator~~ (crates/storage/src/collections.rs:69)
~~FrozenStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/frozen.rs:53)
~~UserStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/user.rs:50)
~~Silent error swallowing in Metadata deserialization~~ (crates/storage/src/entities.rs:503)
~~Reserved prefix '_counter_internal' could collide with user collections~~ (crates/storage/src/collections/counter.rs:228)
~~Breaking change: compute_id now includes domain separator~~ (crates/storage/src/collections.rs:57)
~~Unsafe backward compatibility: unwrap_or(None) silently swallows errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage and UserStorage don't set crdt_type in metadata~~ (crates/storage/src/collections/frozen.rs:47)
~~Reserved prefix collision possible with user-created collections~~ (crates/storage/src/collections/counter.rs:221)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Adding domain separator to compute_id breaks backward compatibility

File: crates/storage/src/collections.rs (line 57-63) | Consensus: 1/1 agents ✓

The diff adds DOMAIN_SEPARATOR_ENTRY to the existing compute_id function which changes the hash computation for all existing map entries. Existing databases will have entries stored under IDs computed without the domain separator. After this change, lookups will compute different IDs and fail to find existing data, effectively corrupting existing deployments.

Suggested fix:

Either: (1) only apply domain separation to new collections created with `new_with_field_name`, keeping `compute_id` unchanged for backward compatibility, or (2) implement a migration path that rehashes existing entries, or (3) version the ID computation scheme.

🟡 Warning (3)

1. Fragile string-based type detection for collections

File: crates/sdk/macros/src/state.rs (line 473-481) | Consensus: 1/1 agents ✓

The is_collection_type function uses type_str.contains("UnorderedMap") pattern matching which is fragile and could incorrectly match user-defined types that happen to contain these substrings (e.g., MyUnorderedMapWrapper, VectorField). This could cause the macro to generate reassign_deterministic_id calls for types that don't have this method.

Suggested fix:

Consider using semantic type analysis via syn's type visitor pattern, or at minimum match on the final segment of type paths rather than substring contains. Alternatively, use a marker trait that collection types implement.

2. ID reassignment after init may orphan data written during init

File: crates/sdk/macros/src/logic/method.rs (line 162-169) | Consensus: 1/1 agents ✓

The init wrapper calls state.__assign_deterministic_ids() after the user's init() function returns. If a user performs operations on collections during init() (e.g., inserting data), those entries are stored under the original random IDs. When IDs are then reassigned, the collection's metadata changes but the child entries remain under the old parent ID, potentially causing data orphaning or lookup failures.

Suggested fix:

Consider either: (1) creating collections with deterministic IDs from the start by enhancing the macro to intercept collection creation, (2) migrating child entries during ID reassignment, or (3) documenting that `init()` should only construct empty collections.

3. Truncated diff hides critical implementation details

File: crates/storage/src/entities.rs (line 329-340) | Consensus: 1/1 agents ✓

The reassign_id_and_field_name method implementation is truncated in the diff. The PR description claims 'Custom Borsh de/serialization for Metadata handles old data gracefully' but the Metadata struct changes with field_name and crdt_type fields aren't shown with custom serialization. If the Borsh derives are standard (not custom), deserializing old data (without these fields) will fail.

Suggested fix:

Verify that Metadata has custom BorshDeserialize implementation that handles missing `field_name` and `crdt_type` fields by defaulting to None. Add migration tests that deserialize data from before this PR.

💡 Suggestion (1)

1. Reserved prefix collision not validated

File: crates/storage/src/collections/counter.rs (line 229-243) | Consensus: 1/1 agents ✓

Counter uses __counter_internal_{field_name} prefix for its internal maps, but there's no compile-time or runtime validation that user field names don't start with __counter_internal_. A user could create a counter field named internal_foo or a map named __counter_internal_visits_positive which would collide with Counter's internal naming scheme.

Suggested fix:

Add compile-time validation in the state macro to reject field names starting with reserved prefixes (`__counter_internal_`, `__user_storage_`, `__frozen_storage_`), or use a more collision-resistant scheme like UUID-based prefixes.

_{🤖 Generated by AI Code Reviewer | Review ID: review-a25ce752}

crates/storage/src/collections.rs

crates/sdk/macros/src/state.rs

crates/sdk/macros/src/logic/method.rs

crates/storage/src/entities.rs

crates/storage/src/collections/counter.rs

The init wrapper unconditionally calls __assign_deterministic_ids() on the state after init(), but this method was only generated for structs with CRDT collection fields. Apps using plain BTreeMap/Vec would fail to compile. Fix: Always generate the method, even if empty (no-op for non-CRDT apps). This fixes abi_conformance and other apps that don't use calimero-storage collections.

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 205.8s

🔴 Critical issues require attention

✅ 31 Fixed | 🆕 4 New | ⏳ 1 Open

✅ Fixed Issues

The following issues from previous reviews have been addressed:

~~Breaking change to compute_id causes data loss for existing entries~~ (crates/storage/src/collections.rs:66)
~~Custom Borsh deserialization masks corruption errors~~ (crates/storage/src/entities.rs:495)
~~Counter internal maps have inconsistent CRDT types~~ (crates/storage/src/collections/counter.rs:218)
~~LwwRegister fields not updated to use new_with_field_name~~ (apps/state-schema-conformance/src/lib.rs:125)
~~Breaking change: adding domain separator to compute_id changes all existing entry IDs~~ (crates/storage/src/collections.rs:66)
~~Backward compatibility deserialization swallows all errors, not just EOF~~ (crates/storage/src/entities.rs:498)
~~LwwRegister lacks new_with_field_name API unlike other collection types~~ (apps/state-schema-conformance/src/lib.rs:129)
~~Counter's internal map prefix lacks runtime enforcement~~ (crates/storage/src/collections/counter.rs:226)
~~New CrdtType variants lack documentation~~ (crates/storage/src/collections/crdt_meta.rs:69)
~~Breaking change: compute_id hash computation modified~~ (crates/storage/src/collections.rs:66)
~~Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used~~ (crates/storage/src/collections/crdt_meta.rs:30)
~~Backward compatibility deserialization swallows all errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage doesn't set its CrdtType in metadata~~ (crates/storage/src/collections/frozen.rs:51)
~~Internal prefix collision possible with user-defined collections~~ (crates/storage/src/collections/counter.rs:230)
~~Breaking change: compute_id adds domain separator to existing function~~ (crates/storage/src/collections.rs:65)
~~Silent error suppression in Metadata deserialization masks corruption~~ (crates/storage/src/entities.rs:507)
~~FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing~~ (crates/storage/src/collections/frozen.rs:50)
~~Reserved prefix _counter_internal lacks enforcement~~ (crates/storage/src/collections/counter.rs:218)
~~Breaking change: compute_id now includes domain separator~~ (crates/storage/src/collections.rs:69)
~~FrozenStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/frozen.rs:53)
~~UserStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/user.rs:50)
~~Silent error swallowing in Metadata deserialization~~ (crates/storage/src/entities.rs:503)
~~Reserved prefix '_counter_internal' could collide with user collections~~ (crates/storage/src/collections/counter.rs:228)
~~Unsafe backward compatibility: unwrap_or(None) silently swallows errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage and UserStorage don't set crdt_type in metadata~~ (crates/storage/src/collections/frozen.rs:47)
~~Reserved prefix collision possible with user-created collections~~ (crates/storage/src/collections/counter.rs:221)
~~Adding domain separator to compute_id breaks backward compatibility~~ (crates/storage/src/collections.rs:57)
~~Fragile string-based type detection for collections~~ (crates/sdk/macros/src/state.rs:473)
~~ID reassignment after init may orphan data written during init~~ (crates/sdk/macros/src/logic/method.rs:162)
~~Truncated diff hides critical implementation details~~ (crates/storage/src/entities.rs:329)
~~Reserved prefix collision not validated~~ (crates/storage/src/collections/counter.rs:229)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Missing backward-compatible deserialization for Metadata

File: crates/storage/src/entities.rs (line 186-195) | Consensus: 1/1 agents ✓

The PR description claims 'Custom Borsh de/serialization for Metadata handles old data gracefully', but the diff shows only adding crdt_type: None and field_name: None to constructors. The Metadata struct derives BorshDeserialize, which means deserializing old data (without these fields) will fail with a borsh parsing error. No custom deserializer is visible in the diff.

Suggested fix:

Implement custom BorshDeserialize for Metadata that handles missing fields by defaulting them to None, similar to how the PR description claims it should work.

🟡 Warning (2)

1. Fragile type detection using string matching

File: crates/sdk/macros/src/state.rs (line 470-478) | Consensus: 1/1 agents ✓

The is_collection_type function uses type_str.contains("Counter") and similar patterns to detect CRDT types. This can match unintended types like MyCustomCounter, CounterWrapper, or types from other crates. A type alias like type MyCounter = SomeOtherType; could also cause false positives or negatives.

Suggested fix:

Use more precise matching that checks for the full type path (e.g., `calimero_storage::collections::Counter`), or use a trait-based approach where collection types implement a marker trait that the macro can detect.

2. Redundant double ID assignment in init

File: apps/state-schema-conformance/src/lib.rs (line 111-130) | Consensus: 1/1 agents ✓

Collections are created with new_with_field_name("field_name") which assigns a deterministic ID, but then the macro-generated __assign_deterministic_ids() is unconditionally called which reassigns the same IDs again. While functionally correct (idempotent), this duplicates work. More importantly, if a user intentionally uses a different field name in new_with_field_name, it will be silently overwritten.

Suggested fix:

Either (1) document that `new_with_field_name` in init() is unnecessary because the macro handles it, and update the example to use `new()`, or (2) add a flag to skip reassignment if the ID is already deterministic. The example sets a precedent that may mislead other developers.

💡 Suggestion (1)

1. CrdtType should derive Hash for HashMap/HashSet usage

File: crates/storage/src/collections/crdt_meta.rs (line 15-31) | Consensus: 1/1 agents ✓

The diff adds PartialOrd, Ord, BorshSerialize, BorshDeserialize derives to CrdtType but not Hash. This prevents using CrdtType as a key in HashMap or HashSet, which may be needed for type registries or dispatch tables in future sync protocol work.

Suggested fix:

Add `Hash` to the derive list: `#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, BorshSerialize, BorshDeserialize)]`

⏳ Open Issues

These issues from previous reviews are still present:

🔴 Critical (1)

1. Breaking change: compute_id now includes domain separator

File: crates/storage/src/collections.rs (line 57-63) | Consensus: 1/1 agents ✓

The compute_id function is modified to include DOMAIN_SEPARATOR_ENTRY in the hash computation. This changes the ID for all existing map entries. Existing databases will have entries stored under old IDs (without domain separator) that cannot be found with the new ID computation. This is a silent data loss issue - the data exists but becomes inaccessible.

Suggested fix:

Add a migration path or versioned ID computation. Consider checking for entries with legacy IDs (without domain separator) as a fallback, or provide a database migration tool to rehash all entries.

_{🤖 Generated by AI Code Reviewer | Review ID: review-bb8dc2d3}

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

^{Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issues.}

cursor · 2026-02-05T13:33:06Z

crates/storage/src/collections.rs

+    pub(crate) fn reassign_deterministic_id(&mut self, field_name: &str) {
+        let new_id = compute_collection_id(None, field_name);
+        self.storage.reassign_id_and_field_name(new_id, field_name);
+    }


Storage entries orphaned when collection IDs reassigned

High Severity

When collections are created using new() (as documented to be supported), they are immediately persisted to storage via add_child_to() with their initial random ID. When reassign_deterministic_id() is later called by the macro-generated __assign_deterministic_ids(), only the in-memory ID is changed via reassign_id_and_field_name(). The old storage entry and ROOT's child reference to the original ID are never cleaned up, resulting in orphaned data and stale parent-child relationships.

Additional Locations (1)

crates/storage/src/collections.rs#L140-L157

✅ Resolved - This issue has been addressed in the latest changes.

cursor · 2026-02-05T13:33:06Z

crates/sdk/macros/src/state.rs

+            Some(quote! {
+                self.#field_name.reassign_deterministic_id(#field_name_str);
+            })
+        })


Tuple struct collection fields silently skip ID assignment

Medium Severity

The generate_assign_deterministic_ids_impl function uses field.ident.as_ref()? which returns None for tuple struct fields (e.g., struct MyState(UnorderedMap<K, V>)), silently skipping them. If a tuple struct is used as app state with collection fields created via new(), those collections would retain random IDs instead of getting deterministic ones. This violates the documented behavior that "users can use UnorderedMap::new() in init() while still getting deterministic IDs" and could cause sync failures between nodes without any compile-time or runtime warning.

✅ Resolved - This issue has been addressed in the latest changes.

cursor · 2026-02-05T13:42:46Z

Bugbot Autofix prepared fixes for 2 of the 2 bugs found in the latest run.

✅ Fixed: Storage entries orphaned when collection IDs reassigned
- Fixed by updating reassign_deterministic_id to clean up old storage entries, index entries, and parent-child references before assigning the new deterministic ID.
✅ Fixed: Tuple struct collection fields silently skip ID assignment
- Fixed by using enumerate() and handling tuple struct fields with index-based field names (0, 1, etc.) instead of silently skipping them.

Or push these changes by commenting:

@cursor push 51c1411c99

Preview (51c1411c99)

diff --git a/crates/sdk/macros/src/state.rs b/crates/sdk/macros/src/state.rs
--- a/crates/sdk/macros/src/state.rs
+++ b/crates/sdk/macros/src/state.rs
@@ -343,8 +343,8 @@
     // Only merge fields that are known CRDT types
     let merge_calls: Vec<_> = fields
         .iter()
-        .filter_map(|field| {
-            let field_name = field.ident.as_ref()?;
+        .enumerate()
+        .filter_map(|(idx, field)| {
             let field_type = &field.ty;
 
             // Check if this is a known CRDT type by examining the type path
@@ -366,21 +366,41 @@
                 return None;
             }
 
-            // Generate merge call for CRDT fields
-            Some(quote! {
-                ::calimero_storage::collections::Mergeable::merge(
-                    &mut self.#field_name,
-                    &other.#field_name
-                ).map_err(|e| {
-                    ::calimero_storage::collections::crdt_meta::MergeError::StorageError(
-                        format!(
-                            "Failed to merge field '{}': {:?}",
-                            stringify!(#field_name),
-                            e
+            // Handle both named fields and tuple struct fields
+            if let Some(field_name) = &field.ident {
+                // Named field
+                Some(quote! {
+                    ::calimero_storage::collections::Mergeable::merge(
+                        &mut self.#field_name,
+                        &other.#field_name
+                    ).map_err(|e| {
+                        ::calimero_storage::collections::crdt_meta::MergeError::StorageError(
+                            format!(
+                                "Failed to merge field '{}': {:?}",
+                                stringify!(#field_name),
+                                e
+                            )
                         )
-                    )
-                })?;
-            })
+                    })?;
+                })
+            } else {
+                // Tuple struct field
+                let field_index = syn::Index::from(idx);
+                Some(quote! {
+                    ::calimero_storage::collections::Mergeable::merge(
+                        &mut self.#field_index,
+                        &other.#field_index
+                    ).map_err(|e| {
+                        ::calimero_storage::collections::crdt_meta::MergeError::StorageError(
+                            format!(
+                                "Failed to merge field {}: {:?}",
+                                #idx,
+                                e
+                            )
+                        )
+                    })?;
+                })
+            }
         })
         .collect();
 
@@ -478,8 +498,8 @@
     // Generate reassign calls for each collection field
     let reassign_calls: Vec<_> = fields
         .iter()
-        .filter_map(|field| {
-            let field_name = field.ident.as_ref()?;
+        .enumerate()
+        .filter_map(|(idx, field)| {
             let field_type = &field.ty;
             let type_str = quote! { #field_type }.to_string();
 
@@ -487,10 +507,21 @@
                 return None;
             }
 
-            let field_name_str = field_name.to_string();
-            Some(quote! {
-                self.#field_name.reassign_deterministic_id(#field_name_str);
-            })
+            // Handle both named fields and tuple struct fields
+            if let Some(field_name) = &field.ident {
+                // Named field: use field name for both access and ID
+                let field_name_str = field_name.to_string();
+                Some(quote! {
+                    self.#field_name.reassign_deterministic_id(#field_name_str);
+                })
+            } else {
+                // Tuple struct field: use index for access, index string for ID
+                let field_index = syn::Index::from(idx);
+                let field_name_str = idx.to_string();
+                Some(quote! {
+                    self.#field_index.reassign_deterministic_id(#field_name_str);
+                })
+            }
         })
         .collect();
 

diff --git a/crates/storage/src/collections.rs b/crates/storage/src/collections.rs
--- a/crates/storage/src/collections.rs
+++ b/crates/storage/src/collections.rs
@@ -50,8 +50,9 @@
 use crate as calimero_storage;
 use crate::address::Id;
 use crate::entities::{ChildInfo, Data, Element, StorageType};
+use crate::index::Index;
 use crate::interface::{Interface, StorageError};
-use crate::store::{MainStorage, StorageAdaptor};
+use crate::store::{Key, MainStorage, StorageAdaptor};
 use crate::{AtomicUnit, Collection};
 
 /// Domain separator for map entry IDs to prevent collision with collection IDs.
@@ -221,26 +222,72 @@
     /// all top-level collections have deterministic IDs regardless of how they were
     /// created in `init()`.
     ///
+    /// This method cleans up the old storage entry and parent-child references
+    /// when moving from a random ID to a deterministic one.
+    ///
     /// # Arguments
     /// * `field_name` - The name of the struct field containing this collection
+    #[expect(clippy::expect_used, reason = "fatal error if cleanup fails")]
     pub(crate) fn reassign_deterministic_id(&mut self, field_name: &str) {
         let new_id = compute_collection_id(None, field_name);
+        let old_id = self.storage.id();
+
+        // If already has the correct ID, nothing to do
+        if old_id == new_id {
+            return;
+        }
+
+        // Clean up old storage entry and index
+        let _ignored = S::storage_remove(Key::Entry(old_id));
+        let _ignored = S::storage_remove(Key::Index(old_id));
+
+        // Remove old child reference from ROOT (without creating tombstone)
+        let _ = <Index<S>>::remove_child_reference_only(*ROOT_ID, old_id);
+
+        // Update in-memory ID and field name
         self.storage.reassign_id_and_field_name(new_id, field_name);
+
+        // Add the collection with new ID to ROOT
+        let _ = <Interface<S>>::add_child_to(*ROOT_ID, self)
+            .expect("failed to add collection with new ID");
     }
 
     /// Reassigns the collection's ID with a specific CRDT type.
     ///
+    /// This method cleans up the old storage entry and parent-child references
+    /// when moving from a random ID to a deterministic one.
+    ///
     /// # Arguments
     /// * `field_name` - The name of the struct field containing this collection
     /// * `crdt_type` - The CRDT type for merge dispatch
+    #[expect(clippy::expect_used, reason = "fatal error if cleanup fails")]
     pub(crate) fn reassign_deterministic_id_with_crdt_type(
         &mut self,
         field_name: &str,
         crdt_type: CrdtType,
     ) {
         let new_id = compute_collection_id(None, field_name);
+        let old_id = self.storage.id();
+
+        // If already has the correct ID, nothing to do
+        if old_id == new_id {
+            return;
+        }
+
+        // Clean up old storage entry and index
+        let _ignored = S::storage_remove(Key::Entry(old_id));
+        let _ignored = S::storage_remove(Key::Index(old_id));
+
+        // Remove old child reference from ROOT (without creating tombstone)
+        let _ = <Index<S>>::remove_child_reference_only(*ROOT_ID, old_id);
+
+        // Update in-memory ID, field name, and CRDT type
         self.storage.reassign_id_and_field_name(new_id, field_name);
         self.storage.metadata.crdt_type = Some(crdt_type);
+
+        // Add the collection with new ID to ROOT
+        let _ = <Interface<S>>::add_child_to(*ROOT_ID, self)
+            .expect("failed to add collection with new ID");
     }
 
     /// Inserts an item into the collection.

diff --git a/crates/storage/src/index.rs b/crates/storage/src/index.rs
--- a/crates/storage/src/index.rs
+++ b/crates/storage/src/index.rs
@@ -309,6 +309,20 @@
         Self::mark_deleted(id, time_now())
     }
 
+    /// Removes a child reference from a parent without creating a tombstone.
+    ///
+    /// Used when reassigning collection IDs - we need to remove the old child
+    /// reference from the parent but don't want to create a tombstone since
+    /// the collection is being moved, not deleted.
+    pub(crate) fn remove_child_reference_only(
+        parent_id: Id,
+        child_id: Id,
+    ) -> Result<(), StorageError> {
+        Self::update_parent_after_child_removal(parent_id, child_id)?;
+        Self::recalculate_ancestor_hashes_for(parent_id)?;
+        Ok(())
+    }
+
     /// Updates parent's children list and hash after child removal.
     ///
     /// Step 2 of deletion: Remove child from parent's index and recalculate hash.

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 368.6s

🔴 Critical issues require attention

✅ 34 Fixed | 🆕 5 New

✅ Fixed Issues

The following issues from previous reviews have been addressed:

~~Breaking change to compute_id causes data loss for existing entries~~ (crates/storage/src/collections.rs:67)
~~Custom Borsh deserialization masks corruption errors~~ (crates/storage/src/entities.rs:495)
~~Counter internal maps have inconsistent CRDT types~~ (crates/storage/src/collections/counter.rs:218)
~~LwwRegister fields not updated to use new_with_field_name~~ (apps/state-schema-conformance/src/lib.rs:125)
~~Breaking change: adding domain separator to compute_id changes all existing entry IDs~~ (crates/storage/src/collections.rs:67)
~~Backward compatibility deserialization swallows all errors, not just EOF~~ (crates/storage/src/entities.rs:498)
~~LwwRegister lacks new_with_field_name API unlike other collection types~~ (apps/state-schema-conformance/src/lib.rs:129)
~~Counter's internal map prefix lacks runtime enforcement~~ (crates/storage/src/collections/counter.rs:226)
~~New CrdtType variants lack documentation~~ (crates/storage/src/collections/crdt_meta.rs:69)
~~Breaking change: compute_id hash computation modified~~ (crates/storage/src/collections.rs:67)
~~Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used~~ (crates/storage/src/collections/crdt_meta.rs:30)
~~Backward compatibility deserialization swallows all errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage doesn't set its CrdtType in metadata~~ (crates/storage/src/collections/frozen.rs:51)
~~Internal prefix collision possible with user-defined collections~~ (crates/storage/src/collections/counter.rs:230)
~~Breaking change: compute_id adds domain separator to existing function~~ (crates/storage/src/collections.rs:66)
~~Silent error suppression in Metadata deserialization masks corruption~~ (crates/storage/src/entities.rs:507)
~~FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing~~ (crates/storage/src/collections/frozen.rs:50)
~~Reserved prefix _counter_internal lacks enforcement~~ (crates/storage/src/collections/counter.rs:218)
~~Breaking change: compute_id now includes domain separator~~ (crates/storage/src/collections.rs:70)
~~FrozenStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/frozen.rs:53)
~~UserStorage.new_with_field_name doesn't set CrdtType~~ (crates/storage/src/collections/user.rs:50)
~~Silent error swallowing in Metadata deserialization~~ (crates/storage/src/entities.rs:503)
~~Reserved prefix '_counter_internal' could collide with user collections~~ (crates/storage/src/collections/counter.rs:228)
~~Breaking change: compute_id now includes domain separator~~ (crates/storage/src/collections.rs:58)
~~Unsafe backward compatibility: unwrap_or(None) silently swallows errors~~ (crates/storage/src/entities.rs:493)
~~FrozenStorage and UserStorage don't set crdt_type in metadata~~ (crates/storage/src/collections/frozen.rs:47)
~~Reserved prefix collision possible with user-created collections~~ (crates/storage/src/collections/counter.rs:221)
~~Adding domain separator to compute_id breaks backward compatibility~~ (crates/storage/src/collections.rs:58)
~~Fragile string-based type detection for collections~~ (crates/sdk/macros/src/state.rs:493)
~~ID reassignment after init may orphan data written during init~~ (crates/sdk/macros/src/logic/method.rs:162)
~~Truncated diff hides critical implementation details~~ (crates/storage/src/entities.rs:329)
~~Reserved prefix collision not validated~~ (crates/storage/src/collections/counter.rs:229)
~~High Severity~~ (crates/storage/src/collections.rs:253)
~~Medium Severity~~ (crates/sdk/macros/src/state.rs:525)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. PR description falsely claims backward compatibility

File: crates/storage/src/entities.rs (line 495-497) | Consensus: 1/1 agents ✓

The PR description states 'Custom Borsh de/serialization for Metadata handles old data gracefully' but the actual implementation uses derive macros with a comment explicitly stating: 'Breaking change: Old data without crdt_type/field_name fields will fail to deserialize. This is intentional - we require fresh nodes with the new data format.' This is a significant mismatch between documented and actual behavior that could mislead users about migration requirements.

Suggested fix:

Update the PR description to accurately reflect that this is a breaking change requiring fresh nodes, or implement actual backward-compatible deserialization that handles missing fields with defaults.

🟡 Warning (2)

1. Fragile string-based type detection for collection types

File: crates/sdk/macros/src/state.rs (line 488-496) | Consensus: 1/1 agents ✓

The is_collection_type function uses type_str.contains() to detect collection types. This could incorrectly match user-defined types containing these substrings (e.g., a type named MyCounterWrapper would match 'Counter'). It also won't work correctly with fully qualified paths or type aliases that don't contain the original type name.

Suggested fix:

Consider using syn's type parsing to extract the actual type path segments, or at minimum check for word boundaries (e.g., type_str ends with the type name or is followed by '<').

2. Storage cleanup errors are silently ignored in reassign_deterministic_id

File: crates/storage/src/collections.rs (line 227-242) | Consensus: 1/1 agents ✓

In reassign_deterministic_id, errors from S::storage_remove(Key::Entry(old_id)) and S::storage_remove(Key::Index(old_id)) are assigned to _ignored variables, meaning failures to clean up old storage entries go unnoticed. This could lead to orphaned entries accumulating over time.

Suggested fix:

At minimum, log a warning when storage cleanup fails. Consider whether cleanup failures should propagate as errors, especially since the function uses `#[expect(clippy::expect_used)]` suggesting errors are considered fatal.

💡 Suggestion (2)

1. CrdtType::Record variant appears unused

File: crates/storage/src/collections/crdt_meta.rs (line 85-91) | Consensus: 1/1 agents ✓

The CrdtType::Record variant is added with documentation stating it's 'Used for the root application state (annotated with #[app::state])' but I don't see it being used anywhere in the implementation. The root state doesn't appear to set its crdt_type to Record.

Suggested fix:

Either implement usage of CrdtType::Record for the root state struct, or remove the variant if it's not needed for this PR.

2. FrozenStorage/UserStorage reassign doesn't clean up wrapper's own storage entry

File: crates/storage/src/collections/frozen.rs (line 60-80) | Consensus: 1/1 agents ✓

Unlike Collection::reassign_deterministic_id which removes old storage entries before adding new ones, FrozenStorage::reassign_deterministic_id (and UserStorage) only updates in-memory state and delegates to inner map. If self.storage Element is ever persisted independently, this could leave orphaned entries. The asymmetry with Collection's implementation is worth clarifying.

Suggested fix:

Document whether `FrozenStorage.storage` Element is persisted separately, and if so, add cleanup logic similar to Collection's implementation.

_{🤖 Generated by AI Code Reviewer | Review ID: review-0eff0200}

meroreviewer · 2026-02-05T13:53:03Z

crates/storage/src/entities.rs

@@ -377,6 +495,10 @@ impl Metadata {
    }


🔴 PR description falsely claims backward compatibility

The PR description states 'Custom Borsh de/serialization for Metadata handles old data gracefully' but the actual implementation uses derive macros with a comment explicitly stating: 'Breaking change: Old data without crdt_type/field_name fields will fail to deserialize. This is intentional - we require fresh nodes with the new data format.' This is a significant mismatch between documented and actual behavior that could mislead users about migration requirements.

Suggested fix:

Update the PR description to accurately reflect that this is a breaking change requiring fresh nodes, or implement actual backward-compatible deserialization that handles missing fields with defaults.

meroreviewer · 2026-02-05T13:53:05Z

crates/sdk/macros/src/state.rs

+    };
+
+    // Helper function to check if a type is a collection that needs ID assignment
+    fn is_collection_type(type_str: &str) -> bool {


🟡 Fragile string-based type detection for collection types

The is_collection_type function uses type_str.contains() to detect collection types. This could incorrectly match user-defined types containing these substrings (e.g., a type named MyCounterWrapper would match 'Counter'). It also won't work correctly with fully qualified paths or type aliases that don't contain the original type name.

Suggested fix:

Consider using syn's type parsing to extract the actual type path segments, or at minimum check for word boundaries (e.g., type_str ends with the type name or is followed by '<').

meroreviewer · 2026-02-05T13:53:07Z

crates/storage/src/collections.rs

+    ///
+    /// This method cleans up the old storage entry and parent-child references
+    /// when moving from a random ID to a deterministic one.
+    ///


🟡 Storage cleanup errors are silently ignored in reassign_deterministic_id

In reassign_deterministic_id, errors from S::storage_remove(Key::Entry(old_id)) and S::storage_remove(Key::Index(old_id)) are assigned to _ignored variables, meaning failures to clean up old storage entries go unnoticed. This could lead to orphaned entries accumulating over time.

Suggested fix:

At minimum, log a warning when storage cleanup fails. Consider whether cleanup failures should propagate as errors, especially since the function uses `#[expect(clippy::expect_used)]` suggesting errors are considered fatal.

meroreviewer · 2026-02-05T13:53:09Z

crates/storage/src/collections/crdt_meta.rs

+
+    /// Frozen Storage - write-once storage for immutable data.
+    ///
+    /// Data can be written once and never modified or deleted.


💡 CrdtType::Record variant appears unused

The CrdtType::Record variant is added with documentation stating it's 'Used for the root application state (annotated with #[app::state])' but I don't see it being used anywhere in the implementation. The root state doesn't appear to set its crdt_type to Record.

Suggested fix:

Either implement usage of CrdtType::Record for the root state struct, or remove the variant if it's not needed for this PR.

meroreviewer · 2026-02-05T13:53:11Z

crates/storage/src/collections/frozen.rs

+        let mut storage = Element::new_with_field_name(None, Some(field_name.to_string()));
+        storage.metadata.crdt_type = Some(CrdtType::FrozenStorage);
+        Self {
+            inner: UnorderedMap::new_with_field_name(&format!("__frozen_storage_{field_name}")),


💡 FrozenStorage/UserStorage reassign doesn't clean up wrapper's own storage entry

Unlike Collection::reassign_deterministic_id which removes old storage entries before adding new ones, FrozenStorage::reassign_deterministic_id (and UserStorage) only updates in-memory state and delegates to inner map. If self.storage Element is ever persisted independently, this could leave orphaned entries. The asymmetry with Collection's implementation is worth clarifying.

Suggested fix:

Document whether `FrozenStorage.storage` Element is persisted separately, and if so, add cleanup logic similar to Collection's implementation.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issue.}

cursor · 2026-02-05T13:56:17Z

crates/sdk/macros/src/state.rs

+        StructOrEnumItem::Enum(_) => {
+            // Enums don't have fields
+            return quote! {};
+        }


Enum state types missing required method causing compile failure

Medium Severity

The generate_assign_deterministic_ids_impl function returns an empty quote! {} for enum types, generating no impl block and no __assign_deterministic_ids method. However, the init wrapper in method.rs unconditionally calls state.__assign_deterministic_ids(). This inconsistency causes compile failures when enum types are used as app state with #[app::init]. The code comment on line 541-542 claims the method is "always generated (even if empty)" but this is incorrect for enums.

Additional Locations (1)

crates/sdk/macros/src/logic/method.rs#L162-L170

cursor · 2026-02-05T13:59:18Z

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

✅ Fixed: Enum state types missing required method causing compile failure
- Generated a no-op __assign_deterministic_ids method for enum state types, matching the behavior of structs without CRDT fields and preventing compile failures when enums are used with #[app::init].

Or push these changes by commenting:

@cursor push c3f70789bb

Preview (c3f70789bb)

diff --git a/crates/sdk/macros/src/state.rs b/crates/sdk/macros/src/state.rs
--- a/crates/sdk/macros/src/state.rs
+++ b/crates/sdk/macros/src/state.rs
@@ -475,12 +475,24 @@
 ) -> TokenStream {
     let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
 
-    // Extract fields from the struct
+    // Extract fields from the struct (enums don't have fields to process)
     let fields = match orig {
         StructOrEnumItem::Struct(s) => &s.fields,
         StructOrEnumItem::Enum(_) => {
-            // Enums don't have fields
-            return quote! {};
+            // Enums don't have fields, but we still need to generate the method
+            // because the init wrapper unconditionally calls it.
+            return quote! {
+                impl #impl_generics #ident #ty_generics #where_clause {
+                    /// Assigns deterministic IDs to all collection fields based on their field names.
+                    ///
+                    /// This is called automatically by the init wrapper. Users should not call this directly.
+                    /// For enum types, this is a no-op since enums don't have fields.
+                    #[doc(hidden)]
+                    pub fn __assign_deterministic_ids(&mut self) {
+                        // No-op for enum types
+                    }
+                }
+            };
         }
     };

This was referenced Feb 5, 2026

feat(storage): implement deterministic entity/collection IDs #1787

Closed

feat(storage): add CrdtType to entity metadata #1786

Closed

feat(merodb): add schema inference from database metadata #1865

Draft