Skip to content

feat(storage): combine deterministic IDs with metadata storage#1864

Open
xilosada wants to merge 14 commits intomasterfrom
feat/combined-deterministic-ids-and-metadata
Open

feat(storage): combine deterministic IDs with metadata storage#1864
xilosada wants to merge 14 commits intomasterfrom
feat/combined-deterministic-ids-and-metadata

Conversation

@xilosada
Copy link
Member

@xilosada xilosada commented Feb 5, 2026

Summary

Combines the features from PR #1787 (deterministic entity IDs) and PR #1786 (metadata storage) into a single coherent implementation.

Changes

From #1787 - Deterministic IDs

  • compute_collection_id(parent_id, field_name) for deterministic ID generation
  • new_with_field_name(parent_id, field_name) on all collection types
  • Domain separation to prevent ID collisions

From #1786 - Metadata Storage

  • Added field_name: Option<String> to Metadata struct
  • Added crdt_type: Option<CrdtType> to Metadata struct
  • Element::new_with_field_name() and new_with_field_name_and_crdt_type()
  • All collections now store their CrdtType in metadata:
    • UnorderedMap → CrdtType::UnorderedMap
    • UnorderedSet → CrdtType::UnorderedSet
    • Vector → CrdtType::Vector
    • Counter → CrdtType::Counter
    • RGA → CrdtType::Rga

Backward Compatibility

  • Custom Borsh de/serialization for Metadata handles old data gracefully
  • BorshSerialize, BorshDeserialize, Ord, PartialOrd added to CrdtType

Benefits

  1. Deterministic IDs: Same code produces identical collection IDs across nodes
  2. Schema Inference: field_name enables tools to infer schema from database
  3. CRDT Merge Dispatch: crdt_type enables type-aware merge strategies

Follow-up

A separate PR will add merodb schema inference using this metadata.

Supersedes

Implements #1769


Note

High Risk
Changes core storage identity/serialization and adds ID reassignment logic that mutates persisted state and indexes; also breaks backward compatibility for existing stored data.

Overview
Ensures deterministic, schema-identifiable storage entities by hashing collection IDs from (parent_id, field_name) with domain separation, and persisting field_name + crdt_type on every stored Element.

The #[app::logic] init path is wrapped to call an auto-generated __assign_deterministic_ids() (via #[app::state]), which reassigns top-level collection IDs after user init() runs; storage adds ID reassignment/cleanup helpers (including index reference removal without tombstones) and exposes new_with_field_name* constructors across collections. CRDT types are now explicit (UserStorage, FrozenStorage, etc.), and app/tests are updated to use field-name constructors and validate deterministic IDs/collision avoidance.

Breaking change: Metadata Borsh layout now includes crdt_type/field_name, so old on-disk data will not deserialize.

Written by Cursor Bugbot for commit 46224a8. This will update automatically on new commits. Configure here.

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 186.1s


🔴 Critical (1)

1. Breaking change to compute_id causes data loss for existing entries

File: crates/storage/src/collections.rs (line 66-73) | Consensus: 1/1 agents ✓

Adding DOMAIN_SEPARATOR_ENTRY to the existing compute_id function changes how IDs are computed for map entries. All existing map entries in databases created before this change will become inaccessible because lookups will compute different IDs. This is a silent data loss issue - existing data won't be found or modified, but won't be deleted either, leading to orphaned entries and duplicate data.

Suggested fix:

Either: (1) Provide a migration path that re-hashes existing entries, or (2) Keep the old `compute_id` for reading and only use domain separation for new entries, or (3) Version the ID computation scheme and handle both in lookups. Consider adding a database version marker to detect old vs new format.

Found by: cursor-agent

🟡 Warning (1)

1. Custom Borsh deserialization masks corruption errors

File: crates/storage/src/entities.rs (line 495-513) | Consensus: 1/1 agents ✓

The BorshDeserialize implementation uses unwrap_or(None) to handle missing fields for backward compatibility. However, this approach also silently masks actual data corruption errors. If the crdt_type field is partially corrupted (e.g., invalid enum discriminant), it will be silently treated as None instead of returning an error. This could lead to subtle bugs where corrupted data is accepted.

Suggested fix:

Check for EOF explicitly before attempting to read optional fields. For example: try to peek a byte first, or use a length-prefixed format, or read into a buffer first to distinguish between 'no more data' vs 'corrupted data'.

Found by: cursor-agent

💡 Suggestion (2)

1. Original Element::new() doesn't initialize new metadata fields

File: crates/storage/src/entities.rs (line 182-195) | Consensus: 1/1 agents ✓

The original Element::new() method doesn't set crdt_type and field_name fields. While this maintains backward compatibility, it means entities created via the old API won't have CRDT type information, potentially causing merge dispatch to fall back to LWW semantics even for built-in CRDT types.

Suggested fix:

Document this limitation clearly, or consider deprecating `Element::new()` in favor of the new constructors that accept field_name and crdt_type.

Found by: cursor-agent

2. Counter internal maps have inconsistent CRDT types

File: crates/storage/src/collections/counter.rs (line 207-232) | Consensus: 1/1 agents ✓

The positive map is created with CrdtType::Counter while the negative map is created via new_with_field_name_internal which sets CrdtType::UnorderedMap. This asymmetry could cause confusion during CRDT merge dispatch - the negative map will be treated as a generic UnorderedMap rather than as part of a Counter.

Suggested fix:

Consider creating both internal maps with a consistent CRDT type (either both as Counter or introduce a new internal marker type like `CrdtType::CounterInternal`), or document why the asymmetry is intentional.

Found by: cursor-agent

📝 Nitpick (1)

1. LwwRegister fields not updated to use new_with_field_name

File: apps/state-schema-conformance/src/lib.rs (line 125-127) | Consensus: 1/1 agents ✓

The LwwRegister fields (status, user_id, counter) still use the old LwwRegister::new() method instead of a hypothetical new_with_field_name. This is inconsistent with the collection types and means these fields won't have deterministic IDs or stored field names. While LwwRegister may not need this (as it's a simple register), the inconsistency could be confusing.

Suggested fix:

Either add `new_with_field_name` to LwwRegister for consistency, or document why LwwRegister doesn't need deterministic IDs (since it's stored inline, not as a separate collection).

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-d4cf6b50

@github-actions
Copy link

github-actions bot commented Feb 5, 2026

Merobox Workflows Failed

The following workflow(s) failed after retries:

  • blobs/workflows/blobs-example.yml
  • blobs/workflows/blobs.yml
  • collaborative-editor/workflows/collaborative-editor.yml
  • kv-store-init/workflows/kv-store-init.yml
  • kv-store-with-handlers/workflows/kv-store-with-handlers.yml
  • kv-store-with-user-and-frozen-storage/workflows/test_frozen_storage.yml
  • kv-store-with-user-and-frozen-storage/workflows/test_user_storage.yml
  • kv-store/workflows/bundle-invitation-test.yml
  • nested-crdt-test/workflows/nested-crdt-test.yml
  • private_data/workflows/example.yml
  • private_data/workflows/private-data.yml
  • team-metrics-custom/workflows/team-metrics-custom.yml
  • team-metrics-macro/workflows/team-metrics-macro.yml
  • xcall-example/workflows/xcall-example.yml

Please check the workflow logs for more details.

- Add new_with_field_name() method to Collection that generates deterministic IDs
  using SHA256 hash of parent_id + field_name
- Add new_with_field_name() to all collection types:
  - Counter (handles positive/negative maps with deterministic IDs)
  - UnorderedMap
  - UnorderedSet
  - Vector
  - ReplicatedGrowableArray
- Update #[app::state] macro to generate Default implementation that uses
  new_with_field_name() for collection fields
- Add tests to verify determinism:
  - Same field names produce same IDs
  - Different field names produce different IDs
  - Parent ID affects child collection IDs

Implements #1769
CIP Section: Protocol Invariants
Invariant: I9 (Deterministic Entity IDs)

Acceptance Criteria:
✅ Same code on two nodes produces identical collection IDs
✅ Nested collections derive IDs correctly (parent + field)
✅ Existing apps continue to work (backward compatibility)
✅ Unit tests verify determinism
The macro was generating a Default implementation that required all fields
to implement Default, which breaks apps with non-Default types like enums.
Users should manually implement Default or use new_with_field_name() in
their init functions for deterministic IDs.
Counter's internal maps were using '{field_name}_positive' and
'{field_name}_negative' as field names, which could silently collide
with user-created collections. For example, a Counter with field name
'visits' would create internal maps with IDs derived from 'visits_positive',
which would collide with a user-created UnorderedMap with field name
'visits_positive'.

Fix by using a reserved internal prefix '__counter_internal_' for
Counter's internal maps. This ensures:
- Counter('visits') creates maps with IDs from '__counter_internal_visits_positive'
- User collections named 'visits_positive' use IDs from 'visits_positive'
- No collision possible

Added test to verify no collision occurs.

Fixes collision issue in crates/storage/src/collections/counter.rs:216-228
The compute_id and compute_collection_id functions both compute
SHA256(parent_bytes + name_bytes) without domain separation, which
can cause collisions. For example:
- A nested collection with field name 'key' creates ID from
  SHA256(parent_id + 'key')
- A map entry with key 'key' creates ID from SHA256(parent_id + 'key')
- Both get identical IDs, causing data corruption

Fix by adding domain separators:
- compute_id uses '__calimero_entry__' separator
- compute_collection_id uses '__calimero_collection__' separator

This ensures map entries and nested collections never collide even
with the same parent and name.

Added test to verify no collision occurs.

Fixes collision issue in:
- crates/storage/src/collections.rs:66-74 (compute_collection_id)
- crates/storage/src/collections.rs:57-63 (compute_id)
…ections

Counter::new_with_field_name was defined in a generic impl block for any
StorageAdaptor, while other collections (UnorderedMap, UnorderedSet, Vector,
RGA) only expose new_with_field_name for MainStorage. This created an API
inconsistency.

Fix by moving new_with_field_name to the MainStorage-only impl block,
matching the pattern used by all other collection types.

Fixes API inconsistency in crates/storage/src/collections/counter.rs:188-201
Combined feature from PR #1786 and #1787:

- Add field_name and crdt_type to Metadata struct
- Add Element::new_with_field_name and new_with_field_name_and_crdt_type
- Update Collection::new_with_field_name to store metadata
- Update all collection types to pass their CrdtType:
  - UnorderedMap -> CrdtType::UnorderedMap
  - UnorderedSet -> CrdtType::UnorderedSet
  - Vector -> CrdtType::Vector
  - Counter -> CrdtType::Counter
  - RGA -> CrdtType::Rga
- Add BorshSerialize/Deserialize/Ord/PartialOrd to CrdtType
- Add UserStorage and FrozenStorage to CrdtType enum
- Custom Borsh de/serialization for backward compatibility

This enables:
- Deterministic collection IDs (from #1787)
- Schema inference from database metadata (from #1786)
- CRDT type-aware merge dispatch
@xilosada xilosada force-pushed the feat/combined-deterministic-ids-and-metadata branch from afeb8cb to b046227 Compare February 5, 2026 08:15
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 127.6s

🔴 Critical issues require attention

4 Fixed | 🆕 5 New


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Breaking change to compute_id causes data loss for existing entries (crates/storage/src/collections.rs:66)
  2. Custom Borsh deserialization masks corruption errors (crates/storage/src/entities.rs:495)
  3. Counter internal maps have inconsistent CRDT types (crates/storage/src/collections/counter.rs:207)
  4. LwwRegister fields not updated to use new_with_field_name (apps/state-schema-conformance/src/lib.rs:125)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Breaking change: adding domain separator to compute_id changes all existing entry IDs

File: crates/storage/src/collections.rs (line 66-71) | Consensus: 1/1 agents ✓

The compute_id function now includes DOMAIN_SEPARATOR_ENTRY in the hash computation. This changes the ID calculation for ALL existing map entries, meaning existing data stored before this change will become inaccessible because lookups will compute different IDs than what was used when the data was stored. This is a silent data loss scenario.

Suggested fix:

Either: (1) Add a migration mechanism to rehash existing entries, (2) Version the storage format and handle old entries without the domain separator, or (3) Only apply domain separation to new collections created with `new_with_field_name` while keeping backward compatibility for `new()`.

Found by: cursor-agent

🟡 Warning (1)

1. Backward compatibility deserialization swallows all errors, not just EOF

File: crates/storage/src/entities.rs (line 498-502) | Consensus: 1/1 agents ✓

The custom BorshDeserialize for Metadata uses unwrap_or(None) when reading crdt_type and field_name. While this handles missing fields in old data correctly, it also silently converts actual deserialization errors (like corrupted data or invalid discriminants) to None, potentially masking data corruption issues.

Suggested fix:

Differentiate between EOF (expected for old data) and other errors (potential corruption). Consider checking the error kind: `match Option::<CrdtType>::deserialize_reader(reader) { Ok(v) => v, Err(e) if matches!(e.kind(), std::io::ErrorKind::UnexpectedEof) => None, Err(e) => return Err(e) }`

Found by: cursor-agent

💡 Suggestion (2)

1. LwwRegister lacks new_with_field_name API unlike other collection types

File: apps/state-schema-conformance/src/lib.rs (line 129-131) | Consensus: 1/1 agents ✓

All collection types (UnorderedMap, UnorderedSet, Vector, Counter, RGA) received new_with_field_name methods, but LwwRegister still only has new(). The state-schema-conformance app uses LwwRegister::new(...) for status, user_id, counter, name, and active fields. This creates an API inconsistency where some CRDT types support deterministic IDs and metadata storage while others don't.

Suggested fix:

Add `new_with_field_name` to `LwwRegister` for API consistency, or document why LwwRegister intentionally doesn't need this (e.g., if registers don't have their own storage IDs).

Found by: cursor-agent

2. Counter's internal map prefix lacks runtime enforcement

File: crates/storage/src/collections/counter.rs (line 215-229) | Consensus: 1/1 agents ✓

The counter uses __counter_internal_ prefix to namespace its internal positive/negative maps. While well-documented in comments, there's no runtime check preventing a user from accidentally creating a collection with a field name like __counter_internal_visits_positive, which could collide with a Counter named visits.

Suggested fix:

Consider adding a runtime check in `compute_collection_id` or `new_with_field_name` that rejects field names starting with reserved prefixes like `__counter_internal_` and logs a warning, or document this as a reserved namespace in public API docs.

Found by: cursor-agent

📝 Nitpick (1)

1. New CrdtType variants lack documentation

File: crates/storage/src/collections/crdt_meta.rs (line 27-34) | Consensus: 1/1 agents ✓

The UserStorage and FrozenStorage variants were added to CrdtType but have minimal doc comments compared to other variants. It would be helpful to explain when these types are used and their merge semantics.

Suggested fix:

Add documentation similar to other variants, e.g., `/// User-specific isolated storage (no cross-user merging)` for UserStorage.

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-4f002b0e

- new_with_field_name() now takes only field_name (no parent_id)
- new() generates random IDs (for nested collections in maps)
- Deterministic IDs derived from field_name only
- Update apps and tests to use simplified API

This follows the POC branch approach where:
- Top-level state fields use new_with_field_name('field')
- Nested map values use new() - merge happens by key, not ID
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 236.7s

🔴 Critical issues require attention

9 Fixed | 🆕 5 New


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Breaking change to compute_id causes data loss for existing entries (crates/storage/src/collections.rs:66)
  2. Custom Borsh deserialization masks corruption errors (crates/storage/src/entities.rs:495)
  3. Counter internal maps have inconsistent CRDT types (crates/storage/src/collections/counter.rs:218)
  4. LwwRegister fields not updated to use new_with_field_name (apps/state-schema-conformance/src/lib.rs:125)
  5. Breaking change: adding domain separator to compute_id changes all existing entry IDs (crates/storage/src/collections.rs:66)
  6. Backward compatibility deserialization swallows all errors, not just EOF (crates/storage/src/entities.rs:498)
  7. LwwRegister lacks new_with_field_name API unlike other collection types (apps/state-schema-conformance/src/lib.rs:129)
  8. Counter's internal map prefix lacks runtime enforcement (crates/storage/src/collections/counter.rs:226)
  9. New CrdtType variants lack documentation (crates/storage/src/collections/crdt_meta.rs:27)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Breaking change: compute_id hash computation modified

File: crates/storage/src/collections.rs (line 66-72) | Consensus: 1/1 agents ✓

Adding DOMAIN_SEPARATOR_ENTRY to the existing compute_id function changes the hash computation for ALL existing map entries. Any data stored with the old algorithm (without domain separator) will have different computed IDs, causing lookups to fail and effectively making existing data inaccessible. This is a breaking change for any production deployment.

Suggested fix:

Either: (1) Add a migration path that re-hashes existing entries, (2) Create a separate function for new entries while keeping backward compatibility for old lookups, or (3) Version the ID computation and try both algorithms during lookup.

Found by: cursor-agent

🟡 Warning (3)

1. Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used

File: crates/storage/src/collections/crdt_meta.rs (line 30-33) | Consensus: 1/1 agents ✓

The UserStorage and FrozenStorage variants are added to CrdtType enum but are never assigned anywhere in the codebase. The FrozenStorage::new_with_field_name and UserStorage::new_with_field_name methods use Element::new_with_field_name which sets crdt_type: None, not the corresponding variant. This violates the no-dead-code rule and the stated goal of storing CRDT types in metadata.

Suggested fix:

Either use these variants in `FrozenStorage::new_with_field_name` and `UserStorage::new_with_field_name` by calling `Element::new_with_field_name_and_crdt_type(..., CrdtType::FrozenStorage)`, or remove these unused variants.

Found by: cursor-agent

2. Backward compatibility deserialization swallows all errors

File: crates/storage/src/entities.rs (line 493-498) | Consensus: 1/1 agents ✓

The custom BorshDeserialize implementation uses unwrap_or(None) which catches ANY deserialization error, not just EOF/missing data. If the crdt_type or field_name bytes are corrupted (e.g., partial write, invalid enum variant), the error is silently ignored and None is returned. This could mask data corruption.

Suggested fix:

Check specifically for EOF/unexpected end of input errors and only return None for those cases. Propagate other errors (invalid data, corruption) to the caller. For example: `match Option::<CrdtType>::deserialize_reader(reader) { Ok(v) => v, Err(e) if e.kind() == ErrorKind::UnexpectedEof => None, Err(e) => return Err(e) }`

Found by: cursor-agent

3. FrozenStorage doesn't set its CrdtType in metadata

File: crates/storage/src/collections/frozen.rs (line 51-56) | Consensus: 1/1 agents ✓

The new_with_field_name method creates an Element with crdt_type: None instead of CrdtType::FrozenStorage. This is inconsistent with other collections (UnorderedMap, Vector, etc.) that properly set their CRDT type. The inner UnorderedMap gets CrdtType::UnorderedMap, but the outer FrozenStorage wrapper has no type information.

Suggested fix:

Change `Element::new_with_field_name(...)` to `Element::new_with_field_name_and_crdt_type(..., CrdtType::FrozenStorage)` and similarly for UserStorage.

Found by: cursor-agent

💡 Suggestion (1)

1. Internal prefix collision possible with user-defined collections

File: crates/storage/src/collections/counter.rs (line 230-245) | Consensus: 1/1 agents ✓

The reserved prefix __counter_internal_ is documented but not enforced. A user could create a collection with field name __counter_internal_visits_positive which would collide with a Counter's internal map. While unlikely, this could cause subtle bugs in a production system.

Suggested fix:

Consider validating field names in `new_with_field_name` methods to reject names starting with reserved prefixes like `__counter_internal_`, `__frozen_storage_`, `__user_storage_`, etc. Alternatively, document this as a reserved namespace in public API documentation.

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-7e88dbc7

@xilosada xilosada force-pushed the feat/combined-deterministic-ids-and-metadata branch from 2a21420 to a09960b Compare February 5, 2026 08:50
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 632.0s

🔴 Critical issues require attention

23 Fixed | 🆕 5 New


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Breaking change to compute_id causes data loss for existing entries (crates/storage/src/collections.rs:66)
  2. Custom Borsh deserialization masks corruption errors (crates/storage/src/entities.rs:495)
  3. Counter internal maps have inconsistent CRDT types (crates/storage/src/collections/counter.rs:218)
  4. LwwRegister fields not updated to use new_with_field_name (apps/state-schema-conformance/src/lib.rs:125)
  5. Breaking change: adding domain separator to compute_id changes all existing entry IDs (crates/storage/src/collections.rs:66)
  6. Backward compatibility deserialization swallows all errors, not just EOF (crates/storage/src/entities.rs:498)
  7. LwwRegister lacks new_with_field_name API unlike other collection types (apps/state-schema-conformance/src/lib.rs:129)
  8. Counter's internal map prefix lacks runtime enforcement (crates/storage/src/collections/counter.rs:226)
  9. New CrdtType variants lack documentation (crates/storage/src/collections/crdt_meta.rs:27)
  10. Breaking change: compute_id hash computation modified (crates/storage/src/collections.rs:66)
  11. Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used (crates/storage/src/collections/crdt_meta.rs:30)
  12. Backward compatibility deserialization swallows all errors (crates/storage/src/entities.rs:493)
  13. FrozenStorage doesn't set its CrdtType in metadata (crates/storage/src/collections/frozen.rs:51)
  14. Internal prefix collision possible with user-defined collections (crates/storage/src/collections/counter.rs:230)
  15. Breaking change: compute_id adds domain separator to existing function (crates/storage/src/collections.rs:65)
  16. Silent error suppression in Metadata deserialization masks corruption (crates/storage/src/entities.rs:507)
  17. FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing (crates/storage/src/collections/frozen.rs:50)
  18. Reserved prefix _counter_internal lacks enforcement (crates/storage/src/collections/counter.rs:218)
  19. Breaking change: compute_id now includes domain separator (crates/storage/src/collections.rs:69)
  20. FrozenStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/frozen.rs:53)
  21. UserStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/user.rs:50)
  22. Silent error swallowing in Metadata deserialization (crates/storage/src/entities.rs:503)
  23. Reserved prefix '_counter_internal' could collide with user collections (crates/storage/src/collections/counter.rs:228)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Breaking change: compute_id now includes domain separator

File: crates/storage/src/collections.rs (line 57-63) | Consensus: 1/1 agents ✓

Adding DOMAIN_SEPARATOR_ENTRY to compute_id() changes the hash computation for ALL existing map entries. Existing data stored with the old compute_id(parent, key) will become inaccessible because lookups will now use compute_id(parent, DOMAIN_SEPARATOR_ENTRY + key), producing different IDs. This is a silent data loss scenario for any existing deployments.

Suggested fix:

Either: (1) Add a migration path that rehashes existing entries, (2) Keep backward compatibility by trying both old and new ID computation on lookup, or (3) Document this as a breaking change requiring data migration and bump the major version.

Found by: cursor-agent

🟡 Warning (2)

1. Unsafe backward compatibility: unwrap_or(None) silently swallows errors

File: crates/storage/src/entities.rs (line 493-517) | Consensus: 1/1 agents ✓

The custom BorshDeserialize for Metadata uses unwrap_or(None) when reading crdt_type and field_name. This conflates three distinct cases: (1) EOF/no more data (expected for old data), (2) valid None value, and (3) corrupted/malformed data. Corrupted data will be silently treated as None instead of surfacing the error, potentially masking data integrity issues.

Suggested fix:

Check for EOF explicitly before deserializing optional fields. If data exists but deserialization fails, propagate the error rather than defaulting to None. Consider using a version byte prefix for the Metadata format to handle schema evolution safely.

Found by: cursor-agent

2. FrozenStorage and UserStorage don't set crdt_type in metadata

File: crates/storage/src/collections/frozen.rs (line 47-56) | Consensus: 1/1 agents ✓

While other collections (UnorderedMap, Vector, Counter, etc.) use new_with_field_name_and_crdt_type which sets the CRDT type, FrozenStorage's new_with_field_name only calls Element::new_with_field_name which does NOT set crdt_type. This inconsistency means FrozenStorage entities won't have CRDT type metadata, which could break the merge dispatch feature that depends on crdt_type.

Suggested fix:

Update FrozenStorage and UserStorage to use `Element::new_with_field_name_and_crdt_type` with `CrdtType::FrozenStorage` and `CrdtType::UserStorage` respectively. The CrdtType enum already has these variants.

Found by: cursor-agent

💡 Suggestion (2)

1. Reserved prefix collision possible with user-created collections

File: crates/storage/src/collections/counter.rs (line 221-243) | Consensus: 1/1 agents ✓

Counter uses __counter_internal_{field_name}_positive naming convention. While the test verifies internal consistency, a user could theoretically create a collection with field name __counter_internal_visits_positive that would have the same ID as a Counter's internal positive map for field visits. Consider documenting that __ prefixed field names are reserved, or validating field names don't start with reserved prefixes.

Suggested fix:

Add validation in `new_with_field_name` methods to reject field names starting with `__` (reserved prefix), or document clearly that such prefixes are reserved and will cause undefined behavior.

Found by: cursor-agent

2. Element::new doesn't initialize crdt_type and field_name

File: crates/storage/src/entities.rs (line 182-195) | Consensus: 1/1 agents ✓

The existing Element::new method creates elements without crdt_type and field_name fields. While the new methods new_with_field_name and new_with_field_name_and_crdt_type are added, the PR diff shows they need to explicitly add crdt_type: None, field_name: None to the existing constructor. However, if someone calls Element::new() for a collection, the metadata won't reflect the CRDT type.

Suggested fix:

Consider whether `Element::new` should take an optional CrdtType parameter, or document clearly that `new_with_field_name_and_crdt_type` should be preferred for collection storage elements.

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-1b27efc5

Add method to reassign collection IDs to deterministic values based on
field names. This enables the #[app::state] macro to fix IDs after init()
returns, ensuring all top-level collections have deterministic IDs
regardless of how they were created.

Changes:
- Add Element::reassign_id_and_field_name() for updating ID and metadata
- Add Collection::reassign_deterministic_id() and variant with crdt_type
- Add reassign_deterministic_id() to all collection types:
  - UnorderedMap, Vector, UnorderedSet, Counter, RGA
  - UserStorage, FrozenStorage
- Make compute_collection_id pub(crate) for UserStorage/FrozenStorage
The #[app::state] macro now generates a __assign_deterministic_ids()
method that is called automatically after init() returns. This ensures
all top-level collections have deterministic IDs based on their field
names, regardless of how they were created in init().

This satisfies CIP Invariant I9: Deterministic Entity IDs
> Given the same application code and field names, all nodes MUST
> generate identical entity IDs for the same logical entities.

Changes:
- Add generate_assign_deterministic_ids_impl() to state.rs
- Generate __assign_deterministic_ids() method on state structs
- Wrap init() call in logic/method.rs to call __assign_deterministic_ids()

Benefits:
- Existing apps work without code changes (just recompile)
- Users can use simple new() in init() without worrying about IDs
- IDs are deterministic (macro ensures it)
- Migrations 'just work' when recompiling
Previously UserStorage and FrozenStorage used CrdtType::Custom("...")
instead of the proper enum variants CrdtType::UserStorage and
CrdtType::FrozenStorage that already existed in the CrdtType enum.

Changes:
- UserStorage::crdt_type() returns CrdtType::UserStorage
- FrozenStorage::crdt_type() returns CrdtType::FrozenStorage
- new_with_field_name() and reassign_deterministic_id() now set
  crdt_type in the Element metadata

This ensures proper CRDT type identification for merge dispatch.
Remove custom BorshSerialize/BorshDeserialize implementations for Metadata
that handled old data without crdt_type and field_name fields.

Breaking change: Old data without crdt_type/field_name will fail to deserialize.
This is intentional - we require fresh nodes with the new data format.

This simplifies the code and removes the unwrap_or(None) fallbacks that
were flagged by MeroReviewer.
Add detailed documentation for each CrdtType variant explaining:
- Merge semantics for each type
- When/how IDs are assigned
- CIP Invariant I9 compliance
- Nested collection behavior

Also add CrdtType::Record variant for composite structs (root app state)
that merge field-by-field via the auto-generated Mergeable impl.
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 559.4s

🔴 Critical issues require attention

27 Fixed | 🆕 5 New


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Breaking change to compute_id causes data loss for existing entries (crates/storage/src/collections.rs:66)
  2. Custom Borsh deserialization masks corruption errors (crates/storage/src/entities.rs:495)
  3. Counter internal maps have inconsistent CRDT types (crates/storage/src/collections/counter.rs:218)
  4. LwwRegister fields not updated to use new_with_field_name (apps/state-schema-conformance/src/lib.rs:125)
  5. Breaking change: adding domain separator to compute_id changes all existing entry IDs (crates/storage/src/collections.rs:66)
  6. Backward compatibility deserialization swallows all errors, not just EOF (crates/storage/src/entities.rs:498)
  7. LwwRegister lacks new_with_field_name API unlike other collection types (apps/state-schema-conformance/src/lib.rs:129)
  8. Counter's internal map prefix lacks runtime enforcement (crates/storage/src/collections/counter.rs:226)
  9. New CrdtType variants lack documentation (crates/storage/src/collections/crdt_meta.rs:69)
  10. Breaking change: compute_id hash computation modified (crates/storage/src/collections.rs:66)
  11. Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used (crates/storage/src/collections/crdt_meta.rs:30)
  12. Backward compatibility deserialization swallows all errors (crates/storage/src/entities.rs:493)
  13. FrozenStorage doesn't set its CrdtType in metadata (crates/storage/src/collections/frozen.rs:51)
  14. Internal prefix collision possible with user-defined collections (crates/storage/src/collections/counter.rs:230)
  15. Breaking change: compute_id adds domain separator to existing function (crates/storage/src/collections.rs:65)
  16. Silent error suppression in Metadata deserialization masks corruption (crates/storage/src/entities.rs:507)
  17. FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing (crates/storage/src/collections/frozen.rs:50)
  18. Reserved prefix _counter_internal lacks enforcement (crates/storage/src/collections/counter.rs:218)
  19. Breaking change: compute_id now includes domain separator (crates/storage/src/collections.rs:69)
  20. FrozenStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/frozen.rs:53)
  21. UserStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/user.rs:50)
  22. Silent error swallowing in Metadata deserialization (crates/storage/src/entities.rs:503)
  23. Reserved prefix '_counter_internal' could collide with user collections (crates/storage/src/collections/counter.rs:228)
  24. Breaking change: compute_id now includes domain separator (crates/storage/src/collections.rs:57)
  25. Unsafe backward compatibility: unwrap_or(None) silently swallows errors (crates/storage/src/entities.rs:493)
  26. FrozenStorage and UserStorage don't set crdt_type in metadata (crates/storage/src/collections/frozen.rs:47)
  27. Reserved prefix collision possible with user-created collections (crates/storage/src/collections/counter.rs:221)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Adding domain separator to compute_id breaks backward compatibility

File: crates/storage/src/collections.rs (line 57-63) | Consensus: 1/1 agents ✓

The diff adds DOMAIN_SEPARATOR_ENTRY to the existing compute_id function which changes the hash computation for all existing map entries. Existing databases will have entries stored under IDs computed without the domain separator. After this change, lookups will compute different IDs and fail to find existing data, effectively corrupting existing deployments.

Suggested fix:

Either: (1) only apply domain separation to new collections created with `new_with_field_name`, keeping `compute_id` unchanged for backward compatibility, or (2) implement a migration path that rehashes existing entries, or (3) version the ID computation scheme.

Found by: cursor-agent

🟡 Warning (3)

1. Fragile string-based type detection for collections

File: crates/sdk/macros/src/state.rs (line 473-481) | Consensus: 1/1 agents ✓

The is_collection_type function uses type_str.contains("UnorderedMap") pattern matching which is fragile and could incorrectly match user-defined types that happen to contain these substrings (e.g., MyUnorderedMapWrapper, VectorField). This could cause the macro to generate reassign_deterministic_id calls for types that don't have this method.

Suggested fix:

Consider using semantic type analysis via syn's type visitor pattern, or at minimum match on the final segment of type paths rather than substring contains. Alternatively, use a marker trait that collection types implement.

Found by: cursor-agent

2. ID reassignment after init may orphan data written during init

File: crates/sdk/macros/src/logic/method.rs (line 162-169) | Consensus: 1/1 agents ✓

The init wrapper calls state.__assign_deterministic_ids() after the user's init() function returns. If a user performs operations on collections during init() (e.g., inserting data), those entries are stored under the original random IDs. When IDs are then reassigned, the collection's metadata changes but the child entries remain under the old parent ID, potentially causing data orphaning or lookup failures.

Suggested fix:

Consider either: (1) creating collections with deterministic IDs from the start by enhancing the macro to intercept collection creation, (2) migrating child entries during ID reassignment, or (3) documenting that `init()` should only construct empty collections.

Found by: cursor-agent

3. Truncated diff hides critical implementation details

File: crates/storage/src/entities.rs (line 329-340) | Consensus: 1/1 agents ✓

The reassign_id_and_field_name method implementation is truncated in the diff. The PR description claims 'Custom Borsh de/serialization for Metadata handles old data gracefully' but the Metadata struct changes with field_name and crdt_type fields aren't shown with custom serialization. If the Borsh derives are standard (not custom), deserializing old data (without these fields) will fail.

Suggested fix:

Verify that Metadata has custom BorshDeserialize implementation that handles missing `field_name` and `crdt_type` fields by defaulting to None. Add migration tests that deserialize data from before this PR.

Found by: cursor-agent

💡 Suggestion (1)

1. Reserved prefix collision not validated

File: crates/storage/src/collections/counter.rs (line 229-243) | Consensus: 1/1 agents ✓

Counter uses __counter_internal_{field_name} prefix for its internal maps, but there's no compile-time or runtime validation that user field names don't start with __counter_internal_. A user could create a counter field named internal_foo or a map named __counter_internal_visits_positive which would collide with Counter's internal naming scheme.

Suggested fix:

Add compile-time validation in the state macro to reject field names starting with reserved prefixes (`__counter_internal_`, `__user_storage_`, `__frozen_storage_`), or use a more collision-resistant scheme like UUID-based prefixes.

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-a25ce752

The init wrapper unconditionally calls __assign_deterministic_ids() on the
state after init(), but this method was only generated for structs with
CRDT collection fields. Apps using plain BTreeMap/Vec would fail to compile.

Fix: Always generate the method, even if empty (no-op for non-CRDT apps).

This fixes abi_conformance and other apps that don't use calimero-storage
collections.
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 85% | Review time: 205.8s

🔴 Critical issues require attention

31 Fixed | 🆕 4 New | ⏳ 1 Open


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Breaking change to compute_id causes data loss for existing entries (crates/storage/src/collections.rs:66)
  2. Custom Borsh deserialization masks corruption errors (crates/storage/src/entities.rs:495)
  3. Counter internal maps have inconsistent CRDT types (crates/storage/src/collections/counter.rs:218)
  4. LwwRegister fields not updated to use new_with_field_name (apps/state-schema-conformance/src/lib.rs:125)
  5. Breaking change: adding domain separator to compute_id changes all existing entry IDs (crates/storage/src/collections.rs:66)
  6. Backward compatibility deserialization swallows all errors, not just EOF (crates/storage/src/entities.rs:498)
  7. LwwRegister lacks new_with_field_name API unlike other collection types (apps/state-schema-conformance/src/lib.rs:129)
  8. Counter's internal map prefix lacks runtime enforcement (crates/storage/src/collections/counter.rs:226)
  9. New CrdtType variants lack documentation (crates/storage/src/collections/crdt_meta.rs:69)
  10. Breaking change: compute_id hash computation modified (crates/storage/src/collections.rs:66)
  11. Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used (crates/storage/src/collections/crdt_meta.rs:30)
  12. Backward compatibility deserialization swallows all errors (crates/storage/src/entities.rs:493)
  13. FrozenStorage doesn't set its CrdtType in metadata (crates/storage/src/collections/frozen.rs:51)
  14. Internal prefix collision possible with user-defined collections (crates/storage/src/collections/counter.rs:230)
  15. Breaking change: compute_id adds domain separator to existing function (crates/storage/src/collections.rs:65)
  16. Silent error suppression in Metadata deserialization masks corruption (crates/storage/src/entities.rs:507)
  17. FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing (crates/storage/src/collections/frozen.rs:50)
  18. Reserved prefix _counter_internal lacks enforcement (crates/storage/src/collections/counter.rs:218)
  19. Breaking change: compute_id now includes domain separator (crates/storage/src/collections.rs:69)
  20. FrozenStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/frozen.rs:53)
  21. UserStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/user.rs:50)
  22. Silent error swallowing in Metadata deserialization (crates/storage/src/entities.rs:503)
  23. Reserved prefix '_counter_internal' could collide with user collections (crates/storage/src/collections/counter.rs:228)
  24. Unsafe backward compatibility: unwrap_or(None) silently swallows errors (crates/storage/src/entities.rs:493)
  25. FrozenStorage and UserStorage don't set crdt_type in metadata (crates/storage/src/collections/frozen.rs:47)
  26. Reserved prefix collision possible with user-created collections (crates/storage/src/collections/counter.rs:221)
  27. Adding domain separator to compute_id breaks backward compatibility (crates/storage/src/collections.rs:57)
  28. Fragile string-based type detection for collections (crates/sdk/macros/src/state.rs:473)
  29. ID reassignment after init may orphan data written during init (crates/sdk/macros/src/logic/method.rs:162)
  30. Truncated diff hides critical implementation details (crates/storage/src/entities.rs:329)
  31. Reserved prefix collision not validated (crates/storage/src/collections/counter.rs:229)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. Missing backward-compatible deserialization for Metadata

File: crates/storage/src/entities.rs (line 186-195) | Consensus: 1/1 agents ✓

The PR description claims 'Custom Borsh de/serialization for Metadata handles old data gracefully', but the diff shows only adding crdt_type: None and field_name: None to constructors. The Metadata struct derives BorshDeserialize, which means deserializing old data (without these fields) will fail with a borsh parsing error. No custom deserializer is visible in the diff.

Suggested fix:

Implement custom BorshDeserialize for Metadata that handles missing fields by defaulting them to None, similar to how the PR description claims it should work.

Found by: cursor-agent

🟡 Warning (2)

1. Fragile type detection using string matching

File: crates/sdk/macros/src/state.rs (line 470-478) | Consensus: 1/1 agents ✓

The is_collection_type function uses type_str.contains("Counter") and similar patterns to detect CRDT types. This can match unintended types like MyCustomCounter, CounterWrapper, or types from other crates. A type alias like type MyCounter = SomeOtherType; could also cause false positives or negatives.

Suggested fix:

Use more precise matching that checks for the full type path (e.g., `calimero_storage::collections::Counter`), or use a trait-based approach where collection types implement a marker trait that the macro can detect.

Found by: cursor-agent

2. Redundant double ID assignment in init

File: apps/state-schema-conformance/src/lib.rs (line 111-130) | Consensus: 1/1 agents ✓

Collections are created with new_with_field_name("field_name") which assigns a deterministic ID, but then the macro-generated __assign_deterministic_ids() is unconditionally called which reassigns the same IDs again. While functionally correct (idempotent), this duplicates work. More importantly, if a user intentionally uses a different field name in new_with_field_name, it will be silently overwritten.

Suggested fix:

Either (1) document that `new_with_field_name` in init() is unnecessary because the macro handles it, and update the example to use `new()`, or (2) add a flag to skip reassignment if the ID is already deterministic. The example sets a precedent that may mislead other developers.

Found by: cursor-agent

💡 Suggestion (1)

1. CrdtType should derive Hash for HashMap/HashSet usage

File: crates/storage/src/collections/crdt_meta.rs (line 15-31) | Consensus: 1/1 agents ✓

The diff adds PartialOrd, Ord, BorshSerialize, BorshDeserialize derives to CrdtType but not Hash. This prevents using CrdtType as a key in HashMap or HashSet, which may be needed for type registries or dispatch tables in future sync protocol work.

Suggested fix:

Add `Hash` to the derive list: `#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, BorshSerialize, BorshDeserialize)]`

Found by: cursor-agent

⏳ Open Issues

These issues from previous reviews are still present:

🔴 Critical (1)

1. Breaking change: compute_id now includes domain separator

File: crates/storage/src/collections.rs (line 57-63) | Consensus: 1/1 agents ✓

The compute_id function is modified to include DOMAIN_SEPARATOR_ENTRY in the hash computation. This changes the ID for all existing map entries. Existing databases will have entries stored under old IDs (without domain separator) that cannot be found with the new ID computation. This is a silent data loss issue - the data exists but becomes inaccessible.

Suggested fix:

Add a migration path or versioned ID computation. Consider checking for entries with legacy IDs (without domain separator) as a fallback, or provide a database migration tool to rehash all entries.

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-bb8dc2d3

@xilosada xilosada marked this pull request as ready for review February 5, 2026 13:22
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issues.

pub(crate) fn reassign_deterministic_id(&mut self, field_name: &str) {
let new_id = compute_collection_id(None, field_name);
self.storage.reassign_id_and_field_name(new_id, field_name);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storage entries orphaned when collection IDs reassigned

High Severity

When collections are created using new() (as documented to be supported), they are immediately persisted to storage via add_child_to() with their initial random ID. When reassign_deterministic_id() is later called by the macro-generated __assign_deterministic_ids(), only the in-memory ID is changed via reassign_id_and_field_name(). The old storage entry and ROOT's child reference to the original ID are never cleaned up, resulting in orphaned data and stale parent-child relationships.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved - This issue has been addressed in the latest changes.

Some(quote! {
self.#field_name.reassign_deterministic_id(#field_name_str);
})
})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tuple struct collection fields silently skip ID assignment

Medium Severity

The generate_assign_deterministic_ids_impl function uses field.ident.as_ref()? which returns None for tuple struct fields (e.g., struct MyState(UnorderedMap<K, V>)), silently skipping them. If a tuple struct is used as app state with collection fields created via new(), those collections would retain random IDs instead of getting deterministic ones. This violates the documented behavior that "users can use UnorderedMap::new() in init() while still getting deterministic IDs" and could cause sync failures between nodes without any compile-time or runtime warning.

Fix in Cursor Fix in Web

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved - This issue has been addressed in the latest changes.

@cursor
Copy link

cursor bot commented Feb 5, 2026

Bugbot Autofix prepared fixes for 2 of the 2 bugs found in the latest run.

  • ✅ Fixed: Storage entries orphaned when collection IDs reassigned
    • Fixed by updating reassign_deterministic_id to clean up old storage entries, index entries, and parent-child references before assigning the new deterministic ID.
  • ✅ Fixed: Tuple struct collection fields silently skip ID assignment
    • Fixed by using enumerate() and handling tuple struct fields with index-based field names (0, 1, etc.) instead of silently skipping them.

View PR

Or push these changes by commenting:

@cursor push 51c1411c99
Preview (51c1411c99)
diff --git a/crates/sdk/macros/src/state.rs b/crates/sdk/macros/src/state.rs
--- a/crates/sdk/macros/src/state.rs
+++ b/crates/sdk/macros/src/state.rs
@@ -343,8 +343,8 @@
     // Only merge fields that are known CRDT types
     let merge_calls: Vec<_> = fields
         .iter()
-        .filter_map(|field| {
-            let field_name = field.ident.as_ref()?;
+        .enumerate()
+        .filter_map(|(idx, field)| {
             let field_type = &field.ty;
 
             // Check if this is a known CRDT type by examining the type path
@@ -366,21 +366,41 @@
                 return None;
             }
 
-            // Generate merge call for CRDT fields
-            Some(quote! {
-                ::calimero_storage::collections::Mergeable::merge(
-                    &mut self.#field_name,
-                    &other.#field_name
-                ).map_err(|e| {
-                    ::calimero_storage::collections::crdt_meta::MergeError::StorageError(
-                        format!(
-                            "Failed to merge field '{}': {:?}",
-                            stringify!(#field_name),
-                            e
+            // Handle both named fields and tuple struct fields
+            if let Some(field_name) = &field.ident {
+                // Named field
+                Some(quote! {
+                    ::calimero_storage::collections::Mergeable::merge(
+                        &mut self.#field_name,
+                        &other.#field_name
+                    ).map_err(|e| {
+                        ::calimero_storage::collections::crdt_meta::MergeError::StorageError(
+                            format!(
+                                "Failed to merge field '{}': {:?}",
+                                stringify!(#field_name),
+                                e
+                            )
                         )
-                    )
-                })?;
-            })
+                    })?;
+                })
+            } else {
+                // Tuple struct field
+                let field_index = syn::Index::from(idx);
+                Some(quote! {
+                    ::calimero_storage::collections::Mergeable::merge(
+                        &mut self.#field_index,
+                        &other.#field_index
+                    ).map_err(|e| {
+                        ::calimero_storage::collections::crdt_meta::MergeError::StorageError(
+                            format!(
+                                "Failed to merge field {}: {:?}",
+                                #idx,
+                                e
+                            )
+                        )
+                    })?;
+                })
+            }
         })
         .collect();
 
@@ -478,8 +498,8 @@
     // Generate reassign calls for each collection field
     let reassign_calls: Vec<_> = fields
         .iter()
-        .filter_map(|field| {
-            let field_name = field.ident.as_ref()?;
+        .enumerate()
+        .filter_map(|(idx, field)| {
             let field_type = &field.ty;
             let type_str = quote! { #field_type }.to_string();
 
@@ -487,10 +507,21 @@
                 return None;
             }
 
-            let field_name_str = field_name.to_string();
-            Some(quote! {
-                self.#field_name.reassign_deterministic_id(#field_name_str);
-            })
+            // Handle both named fields and tuple struct fields
+            if let Some(field_name) = &field.ident {
+                // Named field: use field name for both access and ID
+                let field_name_str = field_name.to_string();
+                Some(quote! {
+                    self.#field_name.reassign_deterministic_id(#field_name_str);
+                })
+            } else {
+                // Tuple struct field: use index for access, index string for ID
+                let field_index = syn::Index::from(idx);
+                let field_name_str = idx.to_string();
+                Some(quote! {
+                    self.#field_index.reassign_deterministic_id(#field_name_str);
+                })
+            }
         })
         .collect();
 

diff --git a/crates/storage/src/collections.rs b/crates/storage/src/collections.rs
--- a/crates/storage/src/collections.rs
+++ b/crates/storage/src/collections.rs
@@ -50,8 +50,9 @@
 use crate as calimero_storage;
 use crate::address::Id;
 use crate::entities::{ChildInfo, Data, Element, StorageType};
+use crate::index::Index;
 use crate::interface::{Interface, StorageError};
-use crate::store::{MainStorage, StorageAdaptor};
+use crate::store::{Key, MainStorage, StorageAdaptor};
 use crate::{AtomicUnit, Collection};
 
 /// Domain separator for map entry IDs to prevent collision with collection IDs.
@@ -221,26 +222,72 @@
     /// all top-level collections have deterministic IDs regardless of how they were
     /// created in `init()`.
     ///
+    /// This method cleans up the old storage entry and parent-child references
+    /// when moving from a random ID to a deterministic one.
+    ///
     /// # Arguments
     /// * `field_name` - The name of the struct field containing this collection
+    #[expect(clippy::expect_used, reason = "fatal error if cleanup fails")]
     pub(crate) fn reassign_deterministic_id(&mut self, field_name: &str) {
         let new_id = compute_collection_id(None, field_name);
+        let old_id = self.storage.id();
+
+        // If already has the correct ID, nothing to do
+        if old_id == new_id {
+            return;
+        }
+
+        // Clean up old storage entry and index
+        let _ignored = S::storage_remove(Key::Entry(old_id));
+        let _ignored = S::storage_remove(Key::Index(old_id));
+
+        // Remove old child reference from ROOT (without creating tombstone)
+        let _ = <Index<S>>::remove_child_reference_only(*ROOT_ID, old_id);
+
+        // Update in-memory ID and field name
         self.storage.reassign_id_and_field_name(new_id, field_name);
+
+        // Add the collection with new ID to ROOT
+        let _ = <Interface<S>>::add_child_to(*ROOT_ID, self)
+            .expect("failed to add collection with new ID");
     }
 
     /// Reassigns the collection's ID with a specific CRDT type.
     ///
+    /// This method cleans up the old storage entry and parent-child references
+    /// when moving from a random ID to a deterministic one.
+    ///
     /// # Arguments
     /// * `field_name` - The name of the struct field containing this collection
     /// * `crdt_type` - The CRDT type for merge dispatch
+    #[expect(clippy::expect_used, reason = "fatal error if cleanup fails")]
     pub(crate) fn reassign_deterministic_id_with_crdt_type(
         &mut self,
         field_name: &str,
         crdt_type: CrdtType,
     ) {
         let new_id = compute_collection_id(None, field_name);
+        let old_id = self.storage.id();
+
+        // If already has the correct ID, nothing to do
+        if old_id == new_id {
+            return;
+        }
+
+        // Clean up old storage entry and index
+        let _ignored = S::storage_remove(Key::Entry(old_id));
+        let _ignored = S::storage_remove(Key::Index(old_id));
+
+        // Remove old child reference from ROOT (without creating tombstone)
+        let _ = <Index<S>>::remove_child_reference_only(*ROOT_ID, old_id);
+
+        // Update in-memory ID, field name, and CRDT type
         self.storage.reassign_id_and_field_name(new_id, field_name);
         self.storage.metadata.crdt_type = Some(crdt_type);
+
+        // Add the collection with new ID to ROOT
+        let _ = <Interface<S>>::add_child_to(*ROOT_ID, self)
+            .expect("failed to add collection with new ID");
     }
 
     /// Inserts an item into the collection.

diff --git a/crates/storage/src/index.rs b/crates/storage/src/index.rs
--- a/crates/storage/src/index.rs
+++ b/crates/storage/src/index.rs
@@ -309,6 +309,20 @@
         Self::mark_deleted(id, time_now())
     }
 
+    /// Removes a child reference from a parent without creating a tombstone.
+    ///
+    /// Used when reassigning collection IDs - we need to remove the old child
+    /// reference from the parent but don't want to create a tombstone since
+    /// the collection is being moved, not deleted.
+    pub(crate) fn remove_child_reference_only(
+        parent_id: Id,
+        child_id: Id,
+    ) -> Result<(), StorageError> {
+        Self::update_parent_after_child_removal(parent_id, child_id)?;
+        Self::recalculate_ancestor_hashes_for(parent_id)?;
+        Ok(())
+    }
+
     /// Updates parent's children list and hash after child removal.
     ///
     /// Step 2 of deletion: Remove child from parent's index and recalculate hash.

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 368.6s

🔴 Critical issues require attention

34 Fixed | 🆕 5 New


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Breaking change to compute_id causes data loss for existing entries (crates/storage/src/collections.rs:67)
  2. Custom Borsh deserialization masks corruption errors (crates/storage/src/entities.rs:495)
  3. Counter internal maps have inconsistent CRDT types (crates/storage/src/collections/counter.rs:218)
  4. LwwRegister fields not updated to use new_with_field_name (apps/state-schema-conformance/src/lib.rs:125)
  5. Breaking change: adding domain separator to compute_id changes all existing entry IDs (crates/storage/src/collections.rs:67)
  6. Backward compatibility deserialization swallows all errors, not just EOF (crates/storage/src/entities.rs:498)
  7. LwwRegister lacks new_with_field_name API unlike other collection types (apps/state-schema-conformance/src/lib.rs:129)
  8. Counter's internal map prefix lacks runtime enforcement (crates/storage/src/collections/counter.rs:226)
  9. New CrdtType variants lack documentation (crates/storage/src/collections/crdt_meta.rs:69)
  10. Breaking change: compute_id hash computation modified (crates/storage/src/collections.rs:67)
  11. Dead code: CrdtType::UserStorage and CrdtType::FrozenStorage never used (crates/storage/src/collections/crdt_meta.rs:30)
  12. Backward compatibility deserialization swallows all errors (crates/storage/src/entities.rs:493)
  13. FrozenStorage doesn't set its CrdtType in metadata (crates/storage/src/collections/frozen.rs:51)
  14. Internal prefix collision possible with user-defined collections (crates/storage/src/collections/counter.rs:230)
  15. Breaking change: compute_id adds domain separator to existing function (crates/storage/src/collections.rs:66)
  16. Silent error suppression in Metadata deserialization masks corruption (crates/storage/src/entities.rs:507)
  17. FrozenStorage new_with_field_name doesn't set CrdtType despite variant existing (crates/storage/src/collections/frozen.rs:50)
  18. Reserved prefix _counter_internal lacks enforcement (crates/storage/src/collections/counter.rs:218)
  19. Breaking change: compute_id now includes domain separator (crates/storage/src/collections.rs:70)
  20. FrozenStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/frozen.rs:53)
  21. UserStorage.new_with_field_name doesn't set CrdtType (crates/storage/src/collections/user.rs:50)
  22. Silent error swallowing in Metadata deserialization (crates/storage/src/entities.rs:503)
  23. Reserved prefix '_counter_internal' could collide with user collections (crates/storage/src/collections/counter.rs:228)
  24. Breaking change: compute_id now includes domain separator (crates/storage/src/collections.rs:58)
  25. Unsafe backward compatibility: unwrap_or(None) silently swallows errors (crates/storage/src/entities.rs:493)
  26. FrozenStorage and UserStorage don't set crdt_type in metadata (crates/storage/src/collections/frozen.rs:47)
  27. Reserved prefix collision possible with user-created collections (crates/storage/src/collections/counter.rs:221)
  28. Adding domain separator to compute_id breaks backward compatibility (crates/storage/src/collections.rs:58)
  29. Fragile string-based type detection for collections (crates/sdk/macros/src/state.rs:493)
  30. ID reassignment after init may orphan data written during init (crates/sdk/macros/src/logic/method.rs:162)
  31. Truncated diff hides critical implementation details (crates/storage/src/entities.rs:329)
  32. Reserved prefix collision not validated (crates/storage/src/collections/counter.rs:229)
  33. High Severity (crates/storage/src/collections.rs:253)
  34. Medium Severity (crates/sdk/macros/src/state.rs:525)

🆕 New Issues

These issues were found in the latest changes:

🔴 Critical (1)

1. PR description falsely claims backward compatibility

File: crates/storage/src/entities.rs (line 495-497) | Consensus: 1/1 agents ✓

The PR description states 'Custom Borsh de/serialization for Metadata handles old data gracefully' but the actual implementation uses derive macros with a comment explicitly stating: 'Breaking change: Old data without crdt_type/field_name fields will fail to deserialize. This is intentional - we require fresh nodes with the new data format.' This is a significant mismatch between documented and actual behavior that could mislead users about migration requirements.

Suggested fix:

Update the PR description to accurately reflect that this is a breaking change requiring fresh nodes, or implement actual backward-compatible deserialization that handles missing fields with defaults.

Found by: cursor-agent

🟡 Warning (2)

1. Fragile string-based type detection for collection types

File: crates/sdk/macros/src/state.rs (line 488-496) | Consensus: 1/1 agents ✓

The is_collection_type function uses type_str.contains() to detect collection types. This could incorrectly match user-defined types containing these substrings (e.g., a type named MyCounterWrapper would match 'Counter'). It also won't work correctly with fully qualified paths or type aliases that don't contain the original type name.

Suggested fix:

Consider using syn's type parsing to extract the actual type path segments, or at minimum check for word boundaries (e.g., type_str ends with the type name or is followed by '<').

Found by: cursor-agent

2. Storage cleanup errors are silently ignored in reassign_deterministic_id

File: crates/storage/src/collections.rs (line 227-242) | Consensus: 1/1 agents ✓

In reassign_deterministic_id, errors from S::storage_remove(Key::Entry(old_id)) and S::storage_remove(Key::Index(old_id)) are assigned to _ignored variables, meaning failures to clean up old storage entries go unnoticed. This could lead to orphaned entries accumulating over time.

Suggested fix:

At minimum, log a warning when storage cleanup fails. Consider whether cleanup failures should propagate as errors, especially since the function uses `#[expect(clippy::expect_used)]` suggesting errors are considered fatal.

Found by: cursor-agent

💡 Suggestion (2)

1. CrdtType::Record variant appears unused

File: crates/storage/src/collections/crdt_meta.rs (line 85-91) | Consensus: 1/1 agents ✓

The CrdtType::Record variant is added with documentation stating it's 'Used for the root application state (annotated with #[app::state])' but I don't see it being used anywhere in the implementation. The root state doesn't appear to set its crdt_type to Record.

Suggested fix:

Either implement usage of CrdtType::Record for the root state struct, or remove the variant if it's not needed for this PR.

Found by: cursor-agent

2. FrozenStorage/UserStorage reassign doesn't clean up wrapper's own storage entry

File: crates/storage/src/collections/frozen.rs (line 60-80) | Consensus: 1/1 agents ✓

Unlike Collection::reassign_deterministic_id which removes old storage entries before adding new ones, FrozenStorage::reassign_deterministic_id (and UserStorage) only updates in-memory state and delegates to inner map. If self.storage Element is ever persisted independently, this could leave orphaned entries. The asymmetry with Collection's implementation is worth clarifying.

Suggested fix:

Document whether `FrozenStorage.storage` Element is persisted separately, and if so, add cleanup logic similar to Collection's implementation.

Found by: cursor-agent


🤖 Generated by AI Code Reviewer | Review ID: review-0eff0200

@@ -377,6 +495,10 @@ impl Metadata {
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 PR description falsely claims backward compatibility

The PR description states 'Custom Borsh de/serialization for Metadata handles old data gracefully' but the actual implementation uses derive macros with a comment explicitly stating: 'Breaking change: Old data without crdt_type/field_name fields will fail to deserialize. This is intentional - we require fresh nodes with the new data format.' This is a significant mismatch between documented and actual behavior that could mislead users about migration requirements.

Suggested fix:

Update the PR description to accurately reflect that this is a breaking change requiring fresh nodes, or implement actual backward-compatible deserialization that handles missing fields with defaults.

};

// Helper function to check if a type is a collection that needs ID assignment
fn is_collection_type(type_str: &str) -> bool {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Fragile string-based type detection for collection types

The is_collection_type function uses type_str.contains() to detect collection types. This could incorrectly match user-defined types containing these substrings (e.g., a type named MyCounterWrapper would match 'Counter'). It also won't work correctly with fully qualified paths or type aliases that don't contain the original type name.

Suggested fix:

Consider using syn's type parsing to extract the actual type path segments, or at minimum check for word boundaries (e.g., type_str ends with the type name or is followed by '<').

///
/// This method cleans up the old storage entry and parent-child references
/// when moving from a random ID to a deterministic one.
///
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Storage cleanup errors are silently ignored in reassign_deterministic_id

In reassign_deterministic_id, errors from S::storage_remove(Key::Entry(old_id)) and S::storage_remove(Key::Index(old_id)) are assigned to _ignored variables, meaning failures to clean up old storage entries go unnoticed. This could lead to orphaned entries accumulating over time.

Suggested fix:

At minimum, log a warning when storage cleanup fails. Consider whether cleanup failures should propagate as errors, especially since the function uses `#[expect(clippy::expect_used)]` suggesting errors are considered fatal.


/// Frozen Storage - write-once storage for immutable data.
///
/// Data can be written once and never modified or deleted.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 CrdtType::Record variant appears unused

The CrdtType::Record variant is added with documentation stating it's 'Used for the root application state (annotated with #[app::state])' but I don't see it being used anywhere in the implementation. The root state doesn't appear to set its crdt_type to Record.

Suggested fix:

Either implement usage of CrdtType::Record for the root state struct, or remove the variant if it's not needed for this PR.

let mut storage = Element::new_with_field_name(None, Some(field_name.to_string()));
storage.metadata.crdt_type = Some(CrdtType::FrozenStorage);
Self {
inner: UnorderedMap::new_with_field_name(&format!("__frozen_storage_{field_name}")),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 FrozenStorage/UserStorage reassign doesn't clean up wrapper's own storage entry

Unlike Collection::reassign_deterministic_id which removes old storage entries before adding new ones, FrozenStorage::reassign_deterministic_id (and UserStorage) only updates in-memory state and delegates to inner map. If self.storage Element is ever persisted independently, this could leave orphaned entries. The asymmetry with Collection's implementation is worth clarifying.

Suggested fix:

Document whether `FrozenStorage.storage` Element is persisted separately, and if so, add cleanup logic similar to Collection's implementation.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issue.

StructOrEnumItem::Enum(_) => {
// Enums don't have fields
return quote! {};
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum state types missing required method causing compile failure

Medium Severity

The generate_assign_deterministic_ids_impl function returns an empty quote! {} for enum types, generating no impl block and no __assign_deterministic_ids method. However, the init wrapper in method.rs unconditionally calls state.__assign_deterministic_ids(). This inconsistency causes compile failures when enum types are used as app state with #[app::init]. The code comment on line 541-542 claims the method is "always generated (even if empty)" but this is incorrect for enums.

Additional Locations (1)

Fix in Cursor Fix in Web

@cursor
Copy link

cursor bot commented Feb 5, 2026

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

  • ✅ Fixed: Enum state types missing required method causing compile failure
    • Generated a no-op __assign_deterministic_ids method for enum state types, matching the behavior of structs without CRDT fields and preventing compile failures when enums are used with #[app::init].

View PR

Or push these changes by commenting:

@cursor push c3f70789bb
Preview (c3f70789bb)
diff --git a/crates/sdk/macros/src/state.rs b/crates/sdk/macros/src/state.rs
--- a/crates/sdk/macros/src/state.rs
+++ b/crates/sdk/macros/src/state.rs
@@ -475,12 +475,24 @@
 ) -> TokenStream {
     let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
 
-    // Extract fields from the struct
+    // Extract fields from the struct (enums don't have fields to process)
     let fields = match orig {
         StructOrEnumItem::Struct(s) => &s.fields,
         StructOrEnumItem::Enum(_) => {
-            // Enums don't have fields
-            return quote! {};
+            // Enums don't have fields, but we still need to generate the method
+            // because the init wrapper unconditionally calls it.
+            return quote! {
+                impl #impl_generics #ident #ty_generics #where_clause {
+                    /// Assigns deterministic IDs to all collection fields based on their field names.
+                    ///
+                    /// This is called automatically by the init wrapper. Users should not call this directly.
+                    /// For enum types, this is a no-op since enums don't have fields.
+                    #[doc(hidden)]
+                    pub fn __assign_deterministic_ids(&mut self) {
+                        // No-op for enum types
+                    }
+                }
+            };
         }
     };

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant