feat(storage): add Batch trait abstraction for atomic write operations #231
Conversation
📝 Walkthrough
Adds a batch abstraction (public …)
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant Client
    participant Redis
    participant Batch as Batch (RocksBatch/BinlogBatch)
    participant RocksDB
    Client->>Redis: write command (e.g., HSET/ZADD/LPUSH)
    Redis->>Redis: prepare encoded keys & metadata
    Redis->>Batch: create_batch()
    Redis->>Batch: put/delete per CF (Meta/Data/Score ...)
    Redis->>Batch: commit()
    alt RocksBatch (standalone)
        Batch->>RocksDB: db.write_opt(WriteBatch, WriteOptions)
        RocksDB-->>Batch: write result
    else BinlogBatch (cluster)
        Batch-->>Redis: (TODO: serialize/send via Raft/log)
    end
    Batch-->>Redis: commit result
    Redis-->>Client: response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/storage/src/redis_sets.rs (1)
2182-2252: `SDIFFSTORE`/`*STORE` operations are no longer atomic (two commits). At Line 2239-2247 (and similarly Line 2362-2375), the code commits a "clear destination" batch, then calls `self.sadd(...)`, which commits again. That creates an observable intermediate state (destination empty/partially written) and breaks the typical atomic semantics expected from `*STORE`.
Suggestion: build one batch that (1) deletes old destination meta + members and (2) writes the new meta + members, then commit once. If you want to reuse `sadd`, consider adding an internal helper like `sadd_into_batch(&mut dyn Batch, ...)` that doesn't commit.
Also applies to: 2309-2377
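A minimal sketch of what such a helper could look like, assuming the `MemberDataKey`/`BaseDataValue` encoders and CF names quoted elsewhere in this review; the destination meta update is left to the caller:

```rust
// Hypothetical internal helper: stage member writes into an existing batch without
// committing, so *STORE commands can clear + repopulate the destination in one commit.
fn sadd_into_batch(
    &self,
    batch: &mut dyn Batch,
    key: &[u8],
    version: u64,
    members: &[&[u8]],
) -> Result<()> {
    for member in members {
        let data_key = MemberDataKey::new(key, version, *member);
        // Set members carry an empty payload here (assumption for the sketch).
        let data_value = BaseDataValue::new(Vec::new());
        batch.put(
            ColumnFamilyIndex::SetsDataCF,
            &data_key.encode()?,
            &data_value.encode(),
        )?;
    }
    Ok(())
}
```

With a helper like this, `sdiffstore`/`sinterstore`/`sunionstore` could stage the destination clear, the new meta, and the new members into one batch and call `commit()` once.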
🤖 Fix all issues with AI agents
In @src/storage/src/batch.rs:
- Around line 47-79: The Batch trait currently exposes put/delete as infallible
and RocksBatch implementation uses assert!/expect! which can panic; change the
trait signatures for put and delete to return Result<(), ErrorType> (or make
them return Result<()> using the crate's common error type), update
RocksBatch::{put, delete} to validate column family lookup without assert/expect
and return Err(...) on invalid CF or other failures, and propagate/store any
internal errors so that commit(self: Box<Self>) returns those errors instead of
panicking; update all call sites to handle the new Result return values and
ensure commit still returns Result<()> with any accumulated error.
- Around line 38-45: Run rustfmt on the new module to resolve the formatting
warning: run `cargo fmt` (or apply rustfmt) for src/storage/src/batch.rs so the
use/import block is properly ordered and spaced (std::sync::Arc;
rocksdb::{BoundColumnFamily, WriteBatch, WriteOptions}; snafu::ResultExt;
crate::error::{Result, RocksSnafu}; crate::ColumnFamilyIndex; engine::Engine).
Ensure no extra blank lines or misaligned imports remain so CI formatting check
passes.
- Around line 166-224: BinlogBatch::commit currently returns Ok(()) while doing
nothing; change it to return an explicit not-implemented error (e.g.,
Err(Error::unimplemented or a suitable crate::error::Error variant) from the
commit method) so callers (including create_batch when it may return
BinlogBatch) cannot acknowledge writes that aren’t persisted; update the commit
implementation in the BinlogBatch impl to construct and return that explicit
error and keep the method body otherwise unchanged until Raft append logic is
implemented.
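Illustrating the first prompt above (fallible `put`/`delete`), a hedged sketch of a `RocksBatch::put` that validates the column family instead of panicking; the `wb`/`cf_handles` field names and the `BatchSnafu` selector are assumptions based on snippets quoted later in this review:

```rust
fn put(&mut self, cf_idx: ColumnFamilyIndex, key: &[u8], value: &[u8]) -> Result<()> {
    let idx = cf_idx as usize;
    // Return an error on an invalid CF index rather than panicking via assert!/expect.
    let cf = self.cf_handles.get(idx).ok_or_else(|| {
        BatchSnafu {
            message: format!(
                "Column family index {} out of bounds (max: {})",
                idx,
                self.cf_handles.len().saturating_sub(1)
            ),
        }
        .build()
    })?;
    self.wb.put_cf(cf, key, value);
    Ok(())
}
```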
🧹 Nitpick comments (5)
src/storage/src/redis_strings.rs (1)
2107-2151: Potentially unbounded in-memory key collection for `DEL`/`FLUSHDB` paths. At Line 2110-2151 and 2226-2262, keys are collected into a `Vec` and then deleted via one batch commit. For large DBs this can spike memory and produce very large `WriteBatch`es. Consider chunking: delete/commit every N keys (or stream deletes directly into a batch and commit when `batch.count()` reaches a threshold).
Also applies to: 2226-2262
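A rough sketch of that chunked variant, assuming the `Batch` trait from this PR; the chunk size, the key source, and the CF choice are placeholders:

```rust
// Hypothetical chunked deletion: commit and start a fresh batch every CHUNK keys
// instead of accumulating one huge WriteBatch in memory.
const CHUNK: u32 = 10_000;

fn delete_keys_chunked<I>(redis: &Redis, keys: I) -> Result<()>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let mut batch = redis.create_batch()?;
    for key in keys {
        // MetaCF is illustrative; real callers would pick the CF per key type.
        batch.delete(ColumnFamilyIndex::MetaCF, &key)?;
        if batch.count() >= CHUNK {
            batch.commit()?; // flush this chunk
            batch = redis.create_batch()?;
        }
    }
    batch.commit() // flush the tail
}
```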
src/storage/src/redis_lists.rs (2)
319-349: `lpop`/`rpop` now apply deletes + meta update in one batch — good. The new `keys_to_delete` collection and single batch commit (Line 341-349, 412-420, 483-491) is consistent.
Also applies to: 390-420, 461-491
754-806: Batch delete/put loops are correct; consider writing directly into batch to reduce allocations. Several paths build `Vec<Vec<u8>>` and `Vec<(Vec<u8>, Vec<u8>)>` first (e.g., Line 754-806, 887-937, 1032-1073). Where feasible, you can push operations directly into `batch` as you compute them to avoid duplicating key/value buffers.
Also applies to: 887-937, 1032-1073
src/storage/src/redis_hashes.rs (2)
112-118: Formatting issue flagged by CI. The batch operations are correct, but the CI cargo fmt check indicates formatting differences. Run `cargo fmt` to fix.
🧹 Run formatter: `cargo fmt --all`
289-303: Helper closure pattern works but creates some duplication. The `create_new_hash` closure is duplicated across `hset`, `hmset`, `hsetnx`, `hincrby`, and `hincrbyfloat`. While the closures capture a method-specific `base_meta_key`, consider extracting to a shared method that takes `base_meta_key` as a parameter to reduce duplication. This is a nice-to-have refactor for the future.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- src/storage/src/batch.rs
- src/storage/src/lib.rs
- src/storage/src/redis.rs
- src/storage/src/redis_hashes.rs
- src/storage/src/redis_lists.rs
- src/storage/src/redis_sets.rs
- src/storage/src/redis_strings.rs
- src/storage/src/redis_zsets.rs
🧰 Additional context used
🧬 Code graph analysis (5)
src/storage/src/redis_sets.rs (1)
- src/storage/src/batch.rs (2): new (102-114), new (186-188)
src/storage/src/redis_strings.rs (1)
- src/storage/src/batch.rs (2): new (102-114), new (186-188)
src/storage/src/redis_lists.rs (2)
- src/storage/src/batch.rs (1): new (102-114)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)
src/storage/src/redis_hashes.rs (1)
- src/storage/src/batch.rs (2): new (102-114), new (186-188)
src/storage/src/redis_zsets.rs (1)
- src/storage/src/batch.rs (2): new (102-114), new (186-188)
🪛 GitHub Check: cargo fmt (ubuntu-latest)
src/storage/src/batch.rs
[warning] 40-40:
Diff in /home/runner/work/kiwi/kiwi/src/storage/src/batch.rs
src/storage/src/redis_hashes.rs
[warning] 113-113:
Diff in /home/runner/work/kiwi/kiwi/src/storage/src/redis_hashes.rs
[warning] 297-297:
Diff in /home/runner/work/kiwi/kiwi/src/storage/src/redis_hashes.rs
[warning] 622-622:
Diff in /home/runner/work/kiwi/kiwi/src/storage/src/redis_hashes.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: cargo clippy (macos-latest)
- GitHub Check: cargo clippy (windows-latest)
- GitHub Check: build and test (macos-latest)
- GitHub Check: build and test (windows-latest)
- GitHub Check: build and test (ubuntu-latest)
- GitHub Check: cargo clippy (ubuntu-latest)
- GitHub Check: lint
- GitHub Check: test-macos
- GitHub Check: test-windows
- GitHub Check: test
🔇 Additional comments (19)
src/storage/src/lib.rs (1)
42-43: Batch module wiring + re-export looks good.
`mod batch;` (Line 42) and `pub use batch::{Batch, BinlogBatch, RocksBatch};` (Line 69) are consistent with the new abstraction.
Also applies to: 69-70
src/storage/src/redis_strings.rs (1)
318-325: Batch refactor for string writes is consistent and improves atomicity. The `create_batch() -> put/delete -> commit()` pattern applied across setters (e.g., Line 581-588, 635-642, 1063-1079) is consistent and keeps write paths uniform.
Also applies to: 389-396, 581-588, 635-642, 698-705, 759-766, 831-838, 1063-1079
src/storage/src/redis_sets.rs (1)
151-166: Batch usage in set mutators (sadd/srem/spop/smove) looks coherent. The conversions to `create_batch()` with `MetaCF` + `SetsDataCF` operations and a single `commit()` (e.g., Line 151-166, 494-512, 783-813) match the intended abstraction.
Also applies to: 494-512, 627-647, 783-813
src/storage/src/redis_lists.rs (1)
105-108: List `push_core()` batching approach is clean and keeps meta+data updates in one commit. Collecting deletes/puts and committing them together (Line 236-246) aligns well with the new Batch abstraction and should preserve atomicity for each list mutation.
Also applies to: 236-246
src/storage/src/redis_zsets.rs (9)
30-30: Import cleanup looks good. The removal of `WriteBatch` from the import is consistent with the migration to the new batch abstraction.
99-166: Batch abstraction correctly applied for existing zset updates. The migration from direct `WriteBatch` to the new batch abstraction maintains atomicity - all score key deletions, member/score insertions, and meta updates are committed together.
169-189: New zset creation path correctly uses batch abstraction. The batch lifecycle (create → put → commit) is properly implemented for the new zset creation path.
357-376: Score increment batch operations are correct. The batch properly handles the atomic update of: delete old score key → put new member value → put new score key → commit.
791-843: Member removal batch operations are well-structured. The batch correctly handles multi-member deletion with proper meta update logic (delete meta if count reaches zero, otherwise update count). Statistics update is appropriately placed after successful commit.
1510-1528: Lex range removal uses correct batch pattern. The collect-then-batch approach is efficient, and all deletions plus meta updates are atomic.
1629-1647: Rank range removal batch operations are correct. Consistent pattern with other range removal methods.
1728-1746: Score range removal batch operations are correct. Consistent and correct batch usage pattern.
1177-1240: Two-phase batch approach for store operations is acceptable. The destination cleanup uses one batch (lines 1197-1227), then results are added via `zadd`, which uses its own batch. This maintains atomicity within each phase. The lock is correctly released before calling `zadd` to prevent deadlock. Note: in edge cases where the first batch commits but `zadd` fails, the destination would be cleared but not populated. This matches Redis behavior where a failure partway through leaves partial state.
src/storage/src/redis_hashes.rs (6)
23-23: Import cleanup is consistent with batch migration. Removed the `WriteBatch` import as expected for the batch abstraction migration.
335-339: Batch operations for stale hash reinit are correct. The batch properly updates both meta and data in a single atomic commit.
696-704: Conditional meta update is correct but could be simplified. The meta is only updated when `new_fields_count > 0`, which is correct. However, the batch is always created and committed even when no new fields are added (just updates to existing fields). This is fine for correctness but creates a batch commit for updates-only scenarios.
903-914: Integer overflow protection and batch operation are correct. The `checked_add` prevents overflow, and the batch commit properly persists the incremented value.
1032-1046: Float overflow protection and batch operation are correct. The `is_finite()` check properly guards against NaN/Infinity results before committing.
774-778: HSETNX batch operations are correct. Atomic commit of meta and data for stale hash reinitialization.
0531ae0 to b86f898
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/storage/src/redis_sets.rs (1)
2206-2276: Atomicity regression risk in `*STORE`: destination clear and repopulate happen in multiple commits.
`sdiffstore()` (and `clear_and_store_set()` used by `sinterstore`/`sunionstore`) currently:
- deletes existing destination keys and commits
- calls `sadd()`, which commits again
This opens race windows (and intermediate visibility) and can violate "single command is atomic" expectations, especially since reads typically don't take locks.
Suggested fix: do clear + repopulate in one batch commit
```diff
-        // Write the batch to clear destination first
-        let mut batch = self.create_batch()?;
-        ...
-        batch.commit()?;
-
-        // Now add the new members
-        let added = self.sadd(destination, &member_refs)?;
-        Ok(added)
+        // Clear + write new meta + write new members in ONE batch
+        let mut batch = self.create_batch()?;
+        for key_to_del in &keys_to_delete {
+            batch.delete(ColumnFamilyIndex::SetsDataCF, key_to_del)?;
+        }
+        if should_delete_meta {
+            batch.delete(ColumnFamilyIndex::MetaCF, &dest_base_meta_key)?;
+        }
+
+        // Build and write destination meta + members here (avoid calling sadd())
+        // - create new BaseMetaValue(DataType::Set) with count=members.len()
+        // - get version from ParsedSetsMetaValue::initial_meta_value()
+        // - for each member: MemberDataKey::new(destination, version, member).encode() and batch.put(...)
+
+        batch.commit()?;
+        Ok(members.len() as i32)
```
Also applies to: 2333-2401
🤖 Fix all issues with AI agents
In @src/storage/src/redis.rs:
- Around line 320-360: The code relies on positional indexing of self.handles
(via get_cf_handle/ColumnFamilyIndex) but initialization used filter() which can
remove missing CFs and shift positions, breaking the mapping; change the
implementation to use a stable name-based mapping instead: during DB open
populate a fixed-size Vec<Option<Arc<rocksdb::BoundColumnFamily<'_>>>> or a
HashMap<ColumnFamilyIndex, Arc<...>> keyed by ColumnFamilyIndex (or CF name) so
missing CFs are represented as None rather than removed, update get_cf_handle to
return the handle by lookup (not by index into a filtered Vec), and ensure
create_batch collects handles by calling the new get_cf_handle for each
ColumnFamilyIndex (MetaCF, HashesDataCF, SetsDataCF, ListsDataCF, ZsetsDataCF,
ZsetsScoreCF) so ordering/invariants are preserved.
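A hedged sketch of the index-keyed lookup this prompt describes (the type and field names here are illustrative, not the actual `Redis` fields):

```rust
use std::sync::Arc;

// One slot per ColumnFamilyIndex discriminant; a CF that failed to open stays None
// instead of shifting every later handle's position.
struct CfHandles<'db> {
    slots: Vec<Option<Arc<rocksdb::BoundColumnFamily<'db>>>>,
}

impl<'db> CfHandles<'db> {
    fn get(&self, idx: ColumnFamilyIndex) -> Option<&Arc<rocksdb::BoundColumnFamily<'db>>> {
        self.slots.get(idx as usize).and_then(|slot| slot.as_ref())
    }
}
```

`create_batch()` can then collect handles by calling this lookup for each `ColumnFamilyIndex` variant, so a missing CF surfaces as an error instead of a silently shifted mapping.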
🧹 Nitpick comments (4)
src/storage/src/batch.rs (1)
95-199: RocksBatch CF index validation is good; consider tightening diagnostics + avoiding repeated custom Location building.
- The "max" in the bounds error reads like an index but is currently `len`; consider `len.saturating_sub(1)`.
- You can simplify the error location to `snafu::location!()` (and keep the message).
src/storage/src/redis_lists.rs (1)
718-806: Nice consolidation to single-commit mutations; consider avoiding pre-staging when not needed. Several paths build `Vec`s of keys/puts first and then replay into `batch.*`. If borrow/lifetime constraints permit, writing directly into `batch` as you compute keys would save memory and copies on large lists.
Also applies to: 887-937, 1032-1073
src/storage/src/redis_strings.rs (1)
2110-2151: Potential memory spike: del_key/flush_db collect all keys before deleting. For large datasets, `keys_to_delete` can become huge. Consider chunked deletion (e.g., commit every N ops and `batch.clear()`), or implement a purpose-built "delete range/prefix" API in the engine later.
Also applies to: 2226-2261
src/storage/src/redis_hashes.rs (1)
294-314: Consider extracting repeated helper pattern. The `create_new_hash` helper closure appears with minor variations across multiple methods (also in `hsetnx` at lines 780-800, `hincrby` at lines 905-926, and `hincrbyfloat` at lines 1061-1082). While the current implementation works correctly and maintains atomicity, extracting this to a shared generic helper method could reduce duplication.
Example consolidation approach
A shared method could accept a value encoder:
```rust
fn create_new_hash_with_field<F>(
    &self,
    key: &[u8],
    base_meta_key: &[u8],
    field: &[u8],
    encode_value: F,
) -> Result<()>
where
    F: FnOnce() -> Vec<u8>,
{
    let mut hashes_meta = HashesMetaValue::new(Bytes::copy_from_slice(&1u64.to_le_bytes()));
    hashes_meta.inner.data_type = DataType::Hash;
    let version = hashes_meta.update_version();
    let data_key = MemberDataKey::new(key, version, field);
    let data_value = BaseDataValue::new(encode_value());
    let mut batch = self.create_batch()?;
    batch.put(ColumnFamilyIndex::MetaCF, base_meta_key, &hashes_meta.encode())?;
    batch.put(
        ColumnFamilyIndex::HashesDataCF,
        &data_key.encode()?,
        &data_value.encode(),
    )?;
    batch.commit()
}
```
Then call sites would become:
```rust
create_new_hash_with_field(key, &base_meta_key, field, || value.to_string().into_bytes())?
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- src/storage/src/batch.rs
- src/storage/src/lib.rs
- src/storage/src/redis.rs
- src/storage/src/redis_hashes.rs
- src/storage/src/redis_lists.rs
- src/storage/src/redis_sets.rs
- src/storage/src/redis_strings.rs
- src/storage/src/redis_zsets.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/storage/src/lib.rs
- src/storage/src/redis_zsets.rs
🧰 Additional context used
🧬 Code graph analysis (5)
src/storage/src/redis_sets.rs (1)
- src/storage/src/batch.rs (2): new (113-125), new (221-223)
src/storage/src/redis_lists.rs (2)
- src/storage/src/batch.rs (1): new (113-125)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)
src/storage/src/redis_strings.rs (1)
- src/storage/src/batch.rs (2): new (113-125), new (221-223)
src/storage/src/redis_hashes.rs (1)
- src/storage/src/batch.rs (2): new (113-125), new (221-223)
src/storage/src/batch.rs (2)
- src/storage/src/redis.rs (1): new (115-149)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test-windows
- GitHub Check: cargo clippy (macos-latest)
🔇 Additional comments (13)
src/storage/src/batch.rs (2)
47-90: Batch trait shape looks good for “commit consumes batch” flow.
`commit(self: Box<Self>)` aligns with `Redis::create_batch() -> Box<dyn Batch>` and avoids post-commit reuse footguns.
201-265: BinlogBatch correctly fails commit to avoid silent data loss. Returning an error from `commit()` is the right default until Raft integration lands.
src/storage/src/redis_lists.rs (2)
105-246: push_core batching integration looks consistent (data CF ops + meta update in one commit). Order of operations (deletes, puts, meta put, commit) is coherent and keeps the metadata update in the same atomic commit.
317-349: Pop paths correctly batch element deletions with metadata update. Keeping meta written even when empty matches the comment/intent and avoids breaking lpushx/rpushx flows.
Also applies to: 388-420, 459-491
src/storage/src/redis_strings.rs (1)
318-325: Batch-based writes are consistently applied across string mutations. These conversions preserve "single logical operation → single commit" and fit the new Batch abstraction well.
Also applies to: 389-396, 581-588, 635-642, 698-705, 759-766, 831-838, 1063-1079, 1137-1153, 1212-1219, 1284-1291, 1836-1884, 1938-1945, 1979-1986
src/storage/src/redis_sets.rs (1)
151-169: Core set mutations migrated cleanly to Batch (single commit). These paths keep metadata + member updates together and look correct.
Also applies to: 472-520, 634-659, 795-837
src/storage/src/redis_hashes.rs (7)
86-122: LGTM! Efficient collect-then-batch pattern. The refactoring correctly implements the batch abstraction with a two-phase approach: first collecting keys that need deletion (lines 86-99), then performing all operations in a single atomic batch (lines 112-121). This ensures consistency between the metadata count and actual deletions.
346-407: LGTM! Batch operations correctly handle all branches. The batch abstraction is properly applied across all three logical branches:
- Stale/empty hash initialization (lines 346-357)
- Existing field updates (lines 376-382)
- New field additions (lines 396-407)
Each branch maintains atomicity guarantees and correctly coordinates metadata updates with data changes.
648-664: LGTM! Efficient multi-field batch pattern. The helper correctly consolidates all field-value pairs into a single batch operation, writing the metadata once followed by all data entries. This is more efficient than per-field batches and maintains atomicity for the entire multi-set operation.
716-755: LGTM! Two-phase batch strategy for existing hashes. The collect-then-batch pattern (lines 716-730 collect, lines 743-754 batch) correctly:
- Checks field existence before batching to determine new field count
- Accumulates all writes before committing
- Conditionally updates metadata only when new fields are added
- Commits everything atomically in a single batch
832-876: LGTM! Batch operations correctly implement set-if-not-exists semantics. Both the stale hash reinitialization (lines 832-843) and new field addition (lines 865-876) paths correctly use atomic batch operations to coordinate metadata and data updates.
959-1032: LGTM! Atomic increment operations with proper overflow handling. The batch-based implementation correctly handles:
- Integer overflow detection before committing (line 993)
- Atomic updates for existing fields (lines 1000-1006)
- Coordinated meta+data writes for new fields (lines 1021-1032)
All branches maintain transactional consistency through the batch abstraction.
1115-1191: LGTM! Atomic float increment with proper validation. The implementation correctly:
- Validates float results are finite before committing (lines 1148-1154)
- Uses atomic batch updates for existing fields (lines 1159-1165)
- Coordinates meta+data writes for new fields (lines 1180-1191)
The batch abstraction maintains consistency across all paths.
b86f898 to 5b92f55
arana-db#222) This commit introduces a Batch trait to abstract batch write operations, enabling support for both standalone (RocksDB) and future cluster (Raft) modes.
Changes:
- Add batch.rs module with Batch trait, RocksBatch, and BinlogBatch implementations
- Add create_batch() method to Redis struct for creating batch instances
- Refactor redis_strings.rs to use Batch trait (2 places)
- Refactor redis_hashes.rs to use Batch trait (6 places)
- Refactor redis_lists.rs to use Batch trait (7 places)
- Refactor redis_sets.rs to use Batch trait (6 places)
- Refactor redis_zsets.rs to use Batch trait (8 places)
- Add explicit error handling for invalid column family index
This abstraction allows seamless switching between direct RocksDB writes and Raft consensus-based writes in the future.
5b92f55 to 0fd2c70
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
src/storage/src/redis_zsets.rs (2)
59-65: Fix score equality check in `zadd` (currently compares the wrong value). This condition is wrong: it checks `abs(sm.score)` instead of `abs(existing_score - sm.score)`, so it will "skip" updates incorrectly when `sm.score` is near 0.
Proposed fix
```diff
-                    if (existing_score, sm.score).1.abs() < f64::EPSILON {
+                    if (existing_score - sm.score).abs() < f64::EPSILON {
                         // Score is the same, skip
                         continue;
                     } else {
```
Also applies to: 99-180, 188-213
1234-1265: `zset_store_operation`: delete+recreate is not atomic vs concurrent writers (lock gap + two commits). You delete destination in one commit, then (after releasing the lock) call `zadd()`, which does a second commit. A concurrent writer can observe/modify the destination in-between. Consider an internal "zadd without lock" (called while holding the destination lock) or building the destination contents in the same batch/commit.
src/storage/src/redis_sets.rs (3)
472-520: Use `saturating_sub` when decrementing set counts. If metadata is inconsistent/corrupted, `set_meta.count() - removed_count` can underflow. Other modules already use `saturating_sub`.
Proposed fix
```diff
-        let new_count = set_meta.count() - removed_count as u64;
+        let new_count = set_meta.count().saturating_sub(removed_count as u64);
```
2206-2271: `sdiffstore`: destination is cleared without a destination lock (race vs concurrent ops). You clear destination keys/meta via batch without holding the destination key lock, then later `sadd()` takes the lock. This breaks the per-key lock contract used elsewhere (read-modify-write ops won't synchronize with the clear).
Sketch fix (lock destination during clear phase)
```diff
 pub fn sdiffstore(&self, destination: &[u8], keys: &[&[u8]]) -> Result<i32> {
+    let dest_str = String::from_utf8_lossy(destination).to_string();
+    let _dest_lock = ScopeRecordLock::new(self.lock_mgr.as_ref(), &dest_str);
     ...
 }
```
2333-2401: `clear_and_store_set`: Fix self-deadlock caused by nested lock acquisition. `clear_and_store_set` holds a `ScopeRecordLock` on the destination key, then calls `self.sadd(destination, &member_refs)`, which attempts to acquire the same lock again. Since `parking_lot::Mutex` is not re-entrant, this causes a self-deadlock. Refactor to either:
- Extract the core `sadd` logic into a separate internal method that does not acquire the lock, then call it from both `sadd` and `clear_and_store_set` while holding the lock once
- Perform the `sadd` operations directly within `clear_and_store_set` without calling `sadd`
src/storage/src/redis_strings.rs (3)
1106-1155: `msetnx` existence check is not type-agnostic (can overwrite live non-string keys). The loop checks only `BaseKey` (string) entries in MetaCF. If a key exists as hash/set/zset/list (stored under `BaseMetaKey`), `msetnx` can incorrectly proceed and overwrite it—contradicting the comment "any live key (any type) blocks the batch".
One possible fix: reuse `get_key_type` for existence
```diff
 pub fn msetnx(&self, kvs: &[(Vec<u8>, Vec<u8>)]) -> Result<bool> {
-    let db = self.db.as_ref().context(OptionNoneSnafu {
-        message: "db is not initialized".to_string(),
-    })?;
-
-    let _cf = self
-        .get_cf_handle(ColumnFamilyIndex::MetaCF)
-        .context(OptionNoneSnafu {
-            message: "cf is not initialized".to_string(),
-        })?;
-
-    // Check if any key exists and is not expired
-    for (key, _) in kvs {
-        let string_key = BaseKey::new(key);
-
-        match db
-            .get_opt(&string_key.encode()?, &self.read_options)
-            .context(RocksSnafu)?
-        {
-            Some(val) => {
-                let string_value = ParsedStringsValue::new(&val[..])?;
-                if !string_value.is_stale() {
-                    return Ok(false);
-                }
-            }
-            None => {}
-        }
-    }
+    // Check if any *live* key exists (any type blocks MSETNX, Redis-compatible).
+    for (key, _) in kvs {
+        if self.get_key_type(key).is_ok() {
+            return Ok(false);
+        }
+    }
```
2226-2261: `flush_db` accumulates unbounded batches on large databases; implement chunked deletes (commit every N keys) or add `delete_range` support to the Batch trait. The current implementation iterates all keys across multiple column families, collects them into a single `Vec`, then adds all delete operations to a single `WriteBatch` before committing. On large databases, this causes:
- Unnecessary memory allocation for the entire key vector
- Single massive write batch with thousands/millions of operations
- Increased WAL overhead and compaction pressure
RocksDB 0.23.0 provides `delete_range_cf()` for efficient range deletes, but it's not exposed by the `Batch` trait. Chunking deletes into smaller batches (commit every 10k–100k keys) would be a low-risk fix; alternatively, extend the `Batch` trait to support `delete_range()`.
2110-2151: Fix prefix-scan logic and limit batch accumulation in `del_key`. The current implementation has two issues:
Incorrect prefix matching: `BaseKey::encode()` includes the 16-byte `reserve2` suffix, but composite keys (HashesDataCF, SetsDataCF, etc.) store `version` bytes immediately after the encoded user key. The check `k.starts_with(&encoded)` will not match these keys because the byte sequence after the user key is `[version][data]`, not `[reserve2]`. Use `encode_seek_key()` or a prefix without `reserve2` instead (see `redis_hashes.rs:225` for the correct pattern).
Unbounded memory accumulation: Collecting all matched keys into `keys_to_delete` before deletion can exhaust memory for large composite types. Build and commit the batch incrementally instead (see `redis_multi.rs:462-468` for the pattern).
src/storage/src/redis_hashes.rs (1)
86-122: Fix clippy/lint failures: avoid `&temp_vec()` / needless borrow-to-deref in `batch.put(...)`. CI is failing with "this expression creates a reference which is immediately dereferenced" at Line 119/350/400/698/751/836/869/963/1025/1119. The typical trigger here is passing `&some_vec` (or `&some_fn_returning_vec()`) where `&[u8]` is expected. Bind encoded values and pass slices.
Concrete fix pattern (apply to all reported lines)
```diff
-        batch.put(
-            ColumnFamilyIndex::MetaCF,
-            &base_meta_key,
-            &meta_val.encoded(),
-        )?;
+        let encoded_meta = meta_val.encoded();
+        batch.put(
+            ColumnFamilyIndex::MetaCF,
+            base_meta_key.as_slice(),
+            encoded_meta.as_slice(),
+        )?;
-        batch.put(
-            ColumnFamilyIndex::MetaCF,
-            &base_meta_key,
-            &hashes_meta.encode(),
-        )?;
+        let encoded_meta = hashes_meta.encode();
+        batch.put(
+            ColumnFamilyIndex::MetaCF,
+            base_meta_key.as_slice(),
+            encoded_meta.as_slice(),
+        )?;
-        batch.put(
-            ColumnFamilyIndex::HashesDataCF,
-            &data_key.encode()?,
-            &data_value.encode(),
-        )?;
+        let encoded_key = data_key.encode()?;
+        let encoded_val = data_value.encode();
+        batch.put(
+            ColumnFamilyIndex::HashesDataCF,
+            encoded_key.as_slice(),
+            encoded_val.as_slice(),
+        )?;
```
Also applies to: 293-314, 346-358, 376-383, 395-408, 637-755, 779-877, 904-1033, 1060-1192
src/storage/src/redis_lists.rs (1)
1080-1154: `rpoplpush` is not crash-atomic (two commits); consider a single batch commit spanning both keys. Even with both key locks held, `rpop_internal` commits first and `push_core` commits second; a crash between them can lose/misplace an element. With the new Batch abstraction, it should be possible to accumulate deletes/puts for both source+destination into one batch and commit once.
Also applies to: 364-429, 43-255
🤖 Fix all issues with AI agents
In @src/storage/src/batch.rs:
- Around line 56-90: The out-of-bounds error messages for ColumnFamilyIndex use
self.cf_handles.len() as the "max index" but the true maximum valid index is
self.cf_handles.len() - 1; update the error construction in the code paths that
validate cf_idx (referencing cf_handles and the Redis::create_batch() usage) to
report the maximum as self.cf_handles.len().saturating_sub(1) (or handle empty
cf_handles explicitly) so the message shows the correct highest valid index
instead of the length.
- Around line 201-264: The code relies on commented-out selection logic and can
accidentally return a BinlogBatch; add an explicit guard in Redis and
create_batch(): add a bool field like `cluster_mode` to the `Redis` struct (or
use a feature flag), then update `create_batch()` to only construct/return
`BinlogBatch` when `cluster_mode` is true and the required Raft wiring (e.g.,
append log callback) is present; otherwise always return `RocksBatch` and if
`cluster_mode` is true but append-log wiring is missing, return an immediate
error or panic rather than creating a BinlogBatch (leave `BinlogBatch::commit()`
intentional error as a secondary safeguard).
In @src/storage/src/redis_lists.rs:
- Around line 341-349: The code inconsistently handles empty lists:
lpop/rpop/rpop_internal currently preserve list metadata when the list becomes
empty while ltrim/lrem delete it, causing differing "key exists" and
lpushx/rpushx behavior; make this consistent by changing the ltrim and lrem code
paths to preserve metadata instead of deleting the MetaCF entry when a list
becomes empty (or alternatively change lpop/rpop to delete if you prefer that
semantics), updating the batch operations in the affected functions (ltrim,
lrem, and their internal helpers) so they write the parsed_meta.value() to
ColumnFamilyIndex::MetaCF rather than issuing batch.delete, and ensure the same
approach is applied to all referenced code paths (including the rpop_internal
and any other empty-list branches) so key-exists semantics match across
operations.
In @src/storage/src/redis_strings.rs:
- Around line 1063-1079: The mset implementation currently batches puts and
commits without acquiring per-key locks, which can race with ops that rely on
ScopeRecordLock (e.g., incr_decr); modify the mset function to first collect and
sort the keys (use BaseKey::new(key).encode() or the key strings), acquire a
ScopeRecordLock for each key in that sorted order (holding all locks), then
perform create_batch()/batch.put(...) for each kv and finally batch.commit(),
releasing locks after commit; ensure lock acquisition is exception-safe (release
on error) and reference ScopeRecordLock, mset, create_batch, batch.put, and
batch.commit in your change.
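A hedged sketch of that flow; `ScopeRecordLock`/`lock_mgr` follow the names quoted in this review, while the value encoding and exact signatures are assumptions:

```rust
pub fn mset(&self, kvs: &[(Vec<u8>, Vec<u8>)]) -> Result<()> {
    // Sort (and dedup) lock keys so every command acquires per-key locks in the
    // same order, avoiding lock-order inversion with incr_decr and friends.
    let mut lock_keys: Vec<String> = kvs
        .iter()
        .map(|(k, _)| String::from_utf8_lossy(k).to_string())
        .collect();
    lock_keys.sort();
    lock_keys.dedup();

    // Guards release on drop, including on any error path below.
    let _guards: Vec<_> = lock_keys
        .iter()
        .map(|k| ScopeRecordLock::new(self.lock_mgr.as_ref(), k))
        .collect();

    let mut batch = self.create_batch()?;
    for (key, value) in kvs {
        let string_key = BaseKey::new(key);
        // Value encoding (StringsValue with TTL metadata) elided; raw bytes stand in here.
        batch.put(ColumnFamilyIndex::MetaCF, &string_key.encode()?, value)?;
    }
    batch.commit()
}
```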
🧹 Nitpick comments (3)
src/raft/src/network.rs (1)
386-397: Extract duplicated message type computation into a helper method. The match expression that computes `msg_type` is duplicated between `add_authentication` and `verify_authentication`. This violates the DRY principle and creates a maintenance risk—if the message type representation needs to change, both locations must be updated consistently.
♻️ Proposed refactor to eliminate duplication
Add a private helper method to `MessageEnvelope`:
```rust
impl MessageEnvelope {
    /// Get the message type representation for HMAC computation
    fn message_type_for_hmac(&self) -> String {
        match &self.message {
            RaftMessage::AppendEntries(_) => "AppendEntries".to_string(),
            RaftMessage::AppendEntriesResponse(_) => "AppendEntriesResponse".to_string(),
            RaftMessage::Vote(_) => "Vote".to_string(),
            RaftMessage::VoteResponse(_) => "VoteResponse".to_string(),
            RaftMessage::InstallSnapshot(_) => "InstallSnapshot".to_string(),
            RaftMessage::InstallSnapshotResponse(_) => "InstallSnapshotResponse".to_string(),
            RaftMessage::Heartbeat { from, term } => format!("Heartbeat:{}:{}", from, term),
            RaftMessage::HeartbeatResponse { from, success } => {
                format!("HeartbeatResponse:{}:{}", from, success)
            }
        }
    }
}
```
Then update both methods to use the helper:
```diff
 pub fn add_authentication(&mut self, auth: &NodeAuth) -> RaftResult<()> {
-    // Use a simplified message representation for HMAC
-    let msg_type = match &self.message {
-        RaftMessage::AppendEntries(_) => "AppendEntries".to_string(),
-        RaftMessage::AppendEntriesResponse(_) => "AppendEntriesResponse".to_string(),
-        RaftMessage::Vote(_) => "Vote".to_string(),
-        RaftMessage::VoteResponse(_) => "VoteResponse".to_string(),
-        RaftMessage::InstallSnapshot(_) => "InstallSnapshot".to_string(),
-        RaftMessage::InstallSnapshotResponse(_) => "InstallSnapshotResponse".to_string(),
-        RaftMessage::Heartbeat { from, term } => format!("Heartbeat:{}:{}", from, term),
-        RaftMessage::HeartbeatResponse { from, success } => {
-            format!("HeartbeatResponse:{}:{}", from, success)
-        }
-    };
+    let msg_type = self.message_type_for_hmac();
     let data_for_hmac = format!(
         "{}:{}:{}:{}:{}",
         self.message_id, self.from, self.to, self.timestamp, msg_type
```
```diff
 pub fn verify_authentication(&self, auth: &NodeAuth) -> bool {
     if let Some(ref expected_hmac) = self.hmac {
-        // Recreate the same data format used for HMAC generation
-        let msg_type = match &self.message {
-            RaftMessage::AppendEntries(_) => "AppendEntries".to_string(),
-            RaftMessage::AppendEntriesResponse(_) => "AppendEntriesResponse".to_string(),
-            RaftMessage::Vote(_) => "Vote".to_string(),
-            RaftMessage::VoteResponse(_) => "VoteResponse".to_string(),
-            RaftMessage::InstallSnapshot(_) => "InstallSnapshot".to_string(),
-            RaftMessage::InstallSnapshotResponse(_) => "InstallSnapshotResponse".to_string(),
-            RaftMessage::Heartbeat { from, term } => format!("Heartbeat:{}:{}", from, term),
-            RaftMessage::HeartbeatResponse { from, success } => {
-                format!("HeartbeatResponse:{}:{}", from, success)
-            }
-        };
+        let msg_type = self.message_type_for_hmac();
         let data_for_hmac = format!(
             "{}:{}:{}:{}:{}",
             self.message_id, self.from, self.to, self.timestamp, msg_type
```
Also applies to: 411-422
src/storage/src/lib.rs (1)
42-43: Public API expansion: consider feature-gating or sealing `BinlogBatch` until it's real. Re-exporting `BinlogBatch` makes it part of the stable surface; if it's intentionally a placeholder, consider hiding it behind a `cluster` feature or documenting it as "unstable/experimental" to avoid long-term API constraints.
Also applies to: 69-70
src/storage/src/redis_hashes.rs (1)
293-314: Consider de-duplicating the repeated `create_new_hash` closures. The same "init meta + write 1+ fields via batch" logic is repeated in `hset`, `hmset`, `hsetnx`, `hincrby`, `hincrbyfloat`. A small private helper (or a generic helper taking an iterator of `(field, value_bytes)`) would cut repetition and reduce risk of divergence.
Also applies to: 637-665, 779-800, 904-927, 1060-1082
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- src/raft/src/network.rs
- src/storage/src/batch.rs
- src/storage/src/lib.rs
- src/storage/src/redis.rs
- src/storage/src/redis_hashes.rs
- src/storage/src/redis_lists.rs
- src/storage/src/redis_sets.rs
- src/storage/src/redis_strings.rs
- src/storage/src/redis_zsets.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/storage/src/redis.rs
🧰 Additional context used
🧬 Code graph analysis (4)
src/storage/src/redis_lists.rs (3)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)
- src/storage/src/storage_impl.rs (1): key (150-150)
- src/storage/src/base_key_format.rs (1): key (88-90)
src/storage/src/redis_zsets.rs (1)
- src/storage/src/batch.rs (2): new (113-125), new (221-223)
src/storage/src/batch.rs (2)
- src/storage/src/redis.rs (1): new (115-149)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)
src/storage/src/redis_strings.rs (1)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)
🪛 GitHub Check: cargo clippy (ubuntu-latest)
src/storage/src/redis_hashes.rs
[failure] 119-119:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 350-350:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 400-400:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 698-698:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 751-751:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 836-836:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 869-869:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 963-963:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 1025-1025:
this expression creates a reference which is immediately dereferenced by the compiler
[failure] 1119-1119:
this expression creates a reference which is immediately dereferenced by the compiler
🪛 GitHub Check: lint
src/storage/src/redis_hashes.rs
Same 10 failures as the cargo clippy check above (lines 119, 350, 400, 698, 751, 836, 869, 963, 1025, 1119).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: test-macos
- GitHub Check: test
- GitHub Check: test-windows
- GitHub Check: build and test (macos-latest)
- GitHub Check: build and test (windows-latest)
🔇 Additional comments (8)
src/storage/src/redis_zsets.rs (4)
29-31: Import change looks fine.
321-327: `zincrby` batch migration looks correct (delete old score key + write new pair + commit).
Also applies to: 383-410
792-877: `zrem` batching/commit-on-change looks good.
1473-1574: Range-removal paths migrated cleanly to batch+commit.
Also applies to: 1670-1697, 1712-1800
src/storage/src/redis_sets.rs (3)
25-26: Import change looks fine.
151-169: `sadd` batch refactor looks correct (single commit for meta + members).
635-659: `spop`/`smove` batching looks good; multi-key lock ordering in `smove` is a nice touch.
Also applies to: 795-837
src/storage/src/redis_strings.rs (1)
318-325: Single-key write paths migrated cleanly to batch+commit.
Also applies to: 389-396, 581-588, 635-642, 698-705, 759-766, 831-838, 1212-1219, 1284-1291, 1836-1884, 1938-1945, 1979-1986
```rust
pub trait Batch: Send {
    /// Add a put operation to the batch.
    ///
    /// # Arguments
    /// * `cf_idx` - The column family index to write to
    /// * `key` - The key to write
    /// * `value` - The value to write
    ///
    /// # Errors
    /// Returns an error if the column family index is invalid.
    fn put(&mut self, cf_idx: ColumnFamilyIndex, key: &[u8], value: &[u8]) -> Result<()>;

    /// Add a delete operation to the batch.
    ///
    /// # Arguments
    /// * `cf_idx` - The column family index to delete from
    /// * `key` - The key to delete
    ///
    /// # Errors
    /// Returns an error if the column family index is invalid.
    fn delete(&mut self, cf_idx: ColumnFamilyIndex, key: &[u8]) -> Result<()>;

    /// Commit all operations in the batch atomically.
    ///
    /// # Returns
    /// * `Ok(())` - if all operations were committed successfully
    /// * `Err(_)` - if the commit failed
    fn commit(self: Box<Self>) -> Result<()>;

    /// Get the number of operations in the batch.
    fn count(&self) -> u32;

    /// Clear all operations from the batch.
    fn clear(&mut self);
}
```
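For reference, the call-site shape this trait implies (a sketch only; the helper name and the concrete keys/values are illustrative, but the `create_batch()` → `put`/`delete` → `commit()` flow matches how the refactored modules use it):

```rust
fn write_meta_and_member_sketch(
    redis: &Redis,
    meta_key: &[u8],
    meta_value: &[u8],
    member_key: &[u8],
    member_value: &[u8],
) -> Result<()> {
    // One logical command -> one batch -> one atomic commit.
    let mut batch = redis.create_batch()?; // Box<dyn Batch>
    batch.put(ColumnFamilyIndex::MetaCF, meta_key, meta_value)?;
    batch.put(ColumnFamilyIndex::HashesDataCF, member_key, member_value)?;
    batch.commit() // consumes the boxed batch; no reuse after commit
}
```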
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Find ColumnFamilyIndex enum definition
echo "=== Looking for ColumnFamilyIndex enum definition ==="
rg -n "enum\s+ColumnFamilyIndex" -A 10
# 2) Find Redis::create_batch implementation
echo -e "\n=== Looking for create_batch implementation ==="
rg -n "fn\s+create_batch" -A 15
# 3) Check the error message in batch.rs
echo -e "\n=== Checking error messages in batch.rs ==="
rg -n "out of bounds" src/storage/src/batch.rs -B 2 -A 2
Repository: arana-db/kiwi
Length of output: 2513
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Get the complete create_batch implementation
echo "=== Full create_batch implementation ==="
sed -n '335,365p' src/storage/src/redis.rs
Repository: arana-db/kiwi
Length of output: 1266
Fix error message to accurately report the maximum valid index.
The ColumnFamilyIndex → cf_handles mapping is stable: the enum defines contiguous discriminants (0–5) and Redis::create_batch() populates handles in matching order. However, the out-of-bounds error messages at lines 134 and 163 report self.cf_handles.len() as the max index, when the maximum valid index is actually len - 1.
Error message fix
```diff
 "Column family index {} out of bounds (max: {})",
 idx,
-self.cf_handles.len()
+self.cf_handles.len().saturating_sub(1)
```
+ self.cf_handles.len().saturating_sub(1)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @src/storage/src/batch.rs around lines 56 - 90, The out-of-bounds error
messages for ColumnFamilyIndex use self.cf_handles.len() as the "max index" but
the true maximum valid index is self.cf_handles.len() - 1; update the error
construction in the code paths that validate cf_idx (referencing cf_handles and
the Redis::create_batch() usage) to report the maximum as
self.cf_handles.len().saturating_sub(1) (or handle empty cf_handles explicitly)
so the message shows the correct highest valid index instead of the length.
```rust
/// Binlog batch implementation for cluster (Raft) mode.
///
/// This implementation serializes operations to a binlog format and commits
/// through the Raft consensus layer.
///
/// TODO: Implement when Raft integration is ready.
#[allow(dead_code)]
pub struct BinlogBatch {
    // TODO: Add binlog entries
    // entries: Vec<BinlogEntry>,
    // append_log_fn: AppendLogFunction,
    count: u32,
}

#[allow(dead_code)]
impl BinlogBatch {
    /// Create a new BinlogBatch.
    ///
    /// # Arguments
    /// * `append_log_fn` - Function to append log entries to Raft
    pub fn new() -> Self {
        Self { count: 0 }
    }
}

impl Default for BinlogBatch {
    fn default() -> Self {
        Self::new()
    }
}

impl Batch for BinlogBatch {
    fn put(&mut self, _cf_idx: ColumnFamilyIndex, _key: &[u8], _value: &[u8]) -> Result<()> {
        // TODO: Implement when Raft integration is ready
        // Create binlog entry and add to entries
        self.count += 1;
        Ok(())
    }

    fn delete(&mut self, _cf_idx: ColumnFamilyIndex, _key: &[u8]) -> Result<()> {
        // TODO: Implement when Raft integration is ready
        // Create binlog entry and add to entries
        self.count += 1;
        Ok(())
    }

    fn commit(self: Box<Self>) -> Result<()> {
        // BinlogBatch commit is not yet implemented.
        // Return an error to prevent silent data loss.
        BatchSnafu {
            message: "BinlogBatch commit is not implemented - Raft integration pending".to_string(),
        }
        .fail()
    }

    fn count(&self) -> u32 {
        self.count
    }

    fn clear(&mut self) {
        // TODO: Clear entries
        self.count = 0;
    }
}
```
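For orientation, one possible shape for the TODO fields above, purely hypothetical until the Raft wiring exists:

```rust
// Hypothetical types matching the commented-out fields (entries / append_log_fn).
pub enum BinlogEntry {
    Put { cf: u8, key: Vec<u8>, value: Vec<u8> },
    Delete { cf: u8, key: Vec<u8> },
}

pub type AppendLogFunction = Box<dyn Fn(Vec<BinlogEntry>) -> Result<()> + Send + Sync>;

// In that design, put()/delete() would push a BinlogEntry instead of only counting,
// and commit() would hand the serialized entries to append_log_fn, returning Ok(())
// only after the Raft log entry is acknowledged.
```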
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find all references to BinlogBatch
echo "=== References to BinlogBatch ==="
rg -n "BinlogBatch" src/storage/src --type rust -A 2 -B 2
echo -e "\n=== Looking for cluster/raft mode logic ==="
rg -n "cluster|raft|Raft|mode" src/storage/src --type rust -l | head -20
Repository: arana-db/kiwi
Length of output: 4282
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the overall structure of batch.rs
echo "=== Structure of batch.rs ==="
wc -l src/storage/src/batch.rs
# Look at the entire batch.rs to understand the architecture
echo -e "\n=== Full batch.rs content ==="
cat -n src/storage/src/batch.rs
Repository: arana-db/kiwi
Length of output: 10560
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for where batches are created and selected
echo "=== Looking for batch creation patterns ==="
rg -n "new()" src/storage/src --type rust -B 3 -A 1 | grep -E "(BinlogBatch|StorageBatch|fn new|pub fn|impl.*Batch)" -A 2 -B 2 || true
echo -e "\n=== Looking for mode-based selection ==="
rg -n "mode|Mode|select" src/storage/src --type rust -C 2 | head -50 || true
Repository: arana-db/kiwi
Length of output: 21390
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if BinlogBatch is instantiated anywhere outside of tests
echo "=== Direct BinlogBatch instantiation (excluding tests) ==="
rg -n "BinlogBatch\s*{|BinlogBatch::new|BinlogBatch::default" src/storage/src --type rust | grep -v "test_" || echo "No direct instantiations found outside of tests"
# Check cluster_storage.rs and related modules
echo -e "\n=== cluster_storage.rs content ==="
cat -n src/storage/src/cluster_storage.rs
echo -e "\n=== raft_integration.rs content ==="
cat -n src/storage/src/raft_integration.rs
Repository: arana-db/kiwi
Length of output: 13220
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for any mode or cluster detection in Redis initialization
echo "=== Redis initialization and mode handling ==="
sed -n '100,200p' src/storage/src/redis.rs | cat -n
echo -e "\n=== Full create_batch() context ==="
sed -n '320,350p' src/storage/src/redis.rs | cat -n
Repository: arana-db/kiwi
Length of output: 6110
Address the fragile guard: BinlogBatch selection should be explicitly guarded, not rely on commented code.
Currently, create_batch() always returns RocksBatch because the cluster-mode selection code (lines 336-339 in redis.rs) is commented out and the required append_log_fn field doesn't exist. While this prevents runtime usage today, the safeguard depends on commented code rather than explicit checks.
When Raft integration resumes, add a cluster-mode flag to the Redis struct and update create_batch() to gate BinlogBatch selection with an explicit check—or use a feature flag—to prevent accidental activation before Raft wiring is complete. The intentional error in BinlogBatch::commit() is good, but an early error or panic at batch creation time (if cluster mode is improperly enabled) would be stronger.
🤖 Prompt for AI Agents
In @src/storage/src/batch.rs around lines 201 - 264, The code relies on
commented-out selection logic and can accidentally return a BinlogBatch; add an
explicit guard in Redis and create_batch(): add a bool field like `cluster_mode`
to the `Redis` struct (or use a feature flag), then update `create_batch()` to
only construct/return `BinlogBatch` when `cluster_mode` is true and the required
Raft wiring (e.g., append log callback) is present; otherwise always return
`RocksBatch` and if `cluster_mode` is true but append-log wiring is missing,
return an immediate error or panic rather than creating a BinlogBatch (leave
`BinlogBatch::commit()` intentional error as a secondary safeguard).
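A hedged sketch of that guard; the `cluster_mode` and `append_log_fn` fields (and a `BinlogBatch::new` that takes a callback) are hypothetical until the Raft integration lands, and `RocksBatch::new(self)` stands in for however the batch is actually constructed:

```rust
pub fn create_batch(&self) -> Result<Box<dyn Batch>> {
    if self.cluster_mode {
        // Refuse to hand out a BinlogBatch until the append-log callback is wired up.
        // (append_log_fn is assumed to be an Arc-wrapped callback so it can be cloned.)
        let append_log_fn = self.append_log_fn.as_ref().ok_or_else(|| {
            BatchSnafu {
                message: "cluster mode enabled but Raft append-log is not configured".to_string(),
            }
            .build()
        })?;
        return Ok(Box::new(BinlogBatch::new(append_log_fn.clone())));
    }
    // Standalone mode: plain RocksDB write batch.
    Ok(Box::new(RocksBatch::new(self)?))
}
```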
```rust
        // Commit batch
        let mut batch = self.create_batch()?;
        for key_to_del in deletes {
            batch.delete(ColumnFamilyIndex::ListsDataCF, &key_to_del)?;
        }
        for (k, v) in puts {
            batch.put(ColumnFamilyIndex::ListsDataCF, &k, &v)?;
        }
        batch.put(ColumnFamilyIndex::MetaCF, &meta_key, &parsed_meta.value())?;
        batch.commit()?;
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1. Find and inspect the get_db_and_cfs macro definition
echo "=== Searching for get_db_and_cfs macro ==="
rg -n "macro_rules!\s*get_db_and_cfs" -A 15 src/
# 2. Check db.get() calls in redis_lists.rs (how meta is read)
echo -e "\n=== db.get calls in redis_lists.rs ==="
rg -n "db\.get\(" src/storage/src/redis_lists.rs -B 1 -A 1
# 3. Check MetaCF references (how meta is written)
echo -e "\n=== MetaCF references in redis_lists.rs ==="
rg -n "MetaCF|ColumnFamilyIndex::Meta" src/storage/src/redis_lists.rs | head -30
Repository: arana-db/kiwi
Length of output: 3801
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's a get_cf method or how db is defined in context
echo "=== Checking Engine trait / db definition ==="
fd -t f "engine.rs|db.rs" src/storage/src/ | head -5
# Look for Engine trait definition and get methods
echo -e "\n=== Looking for Engine get/get_cf definitions ==="
rg -n "fn get\(" src/storage/src/ -B 2 -A 5 | head -40
rg -n "fn get_cf\(" src/storage/src/ -B 2 -A 5 | head -40
Repository: arana-db/kiwi
Length of output: 1215
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check around the specific line 236-246 to understand the context
echo "=== Context around lines 236-246 in redis_lists.rs ==="
sed -n '225,260p' src/storage/src/redis_lists.rs
# Check other locations mentioned
echo -e "\n=== Context around lines 341-349 ==="
sed -n '330,360p' src/storage/src/redis_lists.rs
Repository: arana-db/kiwi
Length of output: 2521
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the type of db field - check the struct definition
echo "=== Checking db field type in redis_lists.rs ==="
rg -n "db\s*:" src/storage/src/redis_lists.rs -B 5 -A 2 | head -30
# Check the struct definition
echo -e "\n=== Looking for RedisLists/RedisImpl struct ==="
rg -n "^pub struct|^struct" src/storage/src/redis_lists.rs -A 10 | head -40
# Look at get method signature on db
echo -e "\n=== Check how db.get is called in context ==="
sed -n '65,85p' src/storage/src/redis_lists.rs
Repository: arana-db/kiwi
Length of output: 105
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the actual DB wrapper type - check imports and trait bounds
echo "=== Imports and use statements in redis_lists.rs ==="
head -50 src/storage/src/redis_lists.rs
# Check what RocksdbStorage or similar is
echo -e "\n=== Search for storage/db wrapper ==="
rg -n "type DB|struct.*DB|impl.*DB" src/storage/src/ -B 2 -A 5 | head -50
Repository: arana-db/kiwi
Length of output: 2655
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# The key question: is db.get() CF-aware or always reads default CF?
# Check if there's a method that takes CF as parameter
echo "=== Checking db methods available ==="
rg -n "fn get|\.get\(" src/storage/src/redis_lists.rs | head -20
# Look for the underlying DB implementation
echo -e "\n=== Check struct definition for db field ==="
sed -n '1,100p' src/storage/src/redis_lists.rs | grep -A 50 "pub struct RedisImpl"
Repository: arana-db/kiwi
Length of output: 1065
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find Engine trait definition
echo "=== Engine trait definition ==="
rg -n "trait Engine|pub fn get" --type rust -B 2 -A 5 | grep -A 20 "trait Engine"
# Check kstd or where Engine is defined
echo -e "\n=== Looking for Engine trait (may be external) ==="
rg -n "impl Engine|trait Engine" src/ -B 2 -A 10 | head -60
# Simpler: look at the actual db.get signature in context around line 513
echo -e "\n=== Looking at line 511-520 to understand db type ==="
sed -n '509,525p' src/storage/src/redis_lists.rs
Repository: arana-db/kiwi
Length of output: 3566
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find how db is obtained from get_db_and_cfs macro result
echo "=== Expand get_db_and_cfs usage context ==="
sed -n '509,520p' src/storage/src/redis_lists.rs
# The key insight: check what db.get does - is there a get_cf or similar?
echo -e "\n=== Check Redis impl for methods that call db.get ==="
rg -n "db\.get\(|db\.get_cf\(" src/storage/src/ | head -20
# Look at how metadata CF handle is used after get_db_and_cfs
echo -e "\n=== Check if cfs from macro is ever used for reads ==="
rg -n "let.*cfs.*=.*get_db_and_cfs" src/storage/src/redis_lists.rs -A 5
Repository: arana-db/kiwi
Length of output: 5662
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Critical finding: redis_sets.rs uses db.get_cf() but redis_lists.rs uses db.get()
# Check if Engine trait has a get_cf method
echo "=== Full Engine trait signature ==="
sed -n '27,100p' src/engine/src/engine.rs
# Check if get_cf is in RocksdbEngine impl
echo -e "\n=== RocksdbEngine methods ==="
sed -n '47,150p' src/engine/src/rocksdb_engine.rs
Repository: arana-db/kiwi
Length of output: 5322
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# The mismatch: redis_sets uses get_cf, redis_lists does not
# Let's check one example from sets to compare
echo "=== Example from redis_sets.rs (correct pattern) ==="
sed -n '225,235p' src/storage/src/redis_sets.rs
echo -e "\n=== Corresponding example from redis_lists.rs (problematic) ==="
sed -n '800,810p' src/storage/src/redis_lists.rs
Repository: arana-db/kiwi
Length of output: 110
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Let's check the specific locations to see the pattern confirmed
echo "=== Line 511 (get meta from default CF) vs batch writes to MetaCF ==="
sed -n '509,525p' src/storage/src/redis_lists.rs
echo -e "\n=== Verify: line 68 also uses db.get without CF ==="
sed -n '65,75p' src/storage/src/redis_lists.rs
echo -e "\n=== Compare with redis_sets.rs which uses get_cf correctly ==="
sed -n '226,235p' src/storage/src/redis_sets.rs
Repository: arana-db/kiwi
Length of output: 1677
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Quick verification: count affected locations in redis_lists.rs
echo "=== Total count of db.get calls reading metadata ==="
rg -c "db\.get\(&meta_key\)" src/storage/src/redis_lists.rs
# Verify the macro properly returns CF handle for MetaCF case
echo -e "\n=== Confirming CF is available in macro return ==="
sed -n '59,65p' src/storage/src/redis_lists.rs
# Show one corrected example pattern from sets
echo -e "\n=== Correct pattern from redis_sets.rs ==="
sed -n '227,232p' src/storage/src/redis_sets.rs

Repository: arana-db/kiwi
Length of output: 716
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Final confirmation: show side-by-side of the bug in one function
echo "=== lpush_core function: writes to MetaCF but reads from default CF ==="
sed -n '80,90p' src/storage/src/redis_lists.rs
echo "..."
sed -n '240,250p' src/storage/src/redis_lists.rs
# Verify this pattern applies across multiple functions
echo -e "\n=== lpop: same bug pattern ==="
sed -n '300,310p' src/storage/src/redis_lists.rs
echo "..."
sed -n '343,350p' src/storage/src/redis_lists.rs
# Show the correct way should be
echo -e "\n=== What it should look like (from redis_sets.rs) ==="
sed -n '209,212p' src/storage/src/redis_sets.rs

Repository: arana-db/kiwi
Length of output: 1978
Critical: Metadata reads and writes target different column families.
All metadata reads use db.get(&meta_key) (default CF) while writes use batch.put(ColumnFamilyIndex::MetaCF, ...) (MetaCF). This breaks all list operations—written metadata will never be read back.
The macro obtains the MetaCF handle but it's discarded (_cfs). Follow the pattern from redis_sets.rs: obtain the CF handle from the macro and use db.get_cf(&cf_meta, &meta_key) for all metadata reads.
Affected locations (12 total reads, all functions):

- Line 68: `push_core` (existence check for empty values)
- Line 83: `push_core` (main metadata read)
- Line 302: `lpop_internal`
- Line 373: `rpop_internal`
- Line 444: `rpop_internal` (internal call)
- Line 513: `llen`
- Line 535: `lindex`
- Line 588: `lrange`
- Line 655: `ltrim`
- Line 724: `lrem`
- Line 823: `linsert`
- Line 968: `rpoplpush`
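A minimal sketch of the corrected read path, assuming the `get_db_and_cfs` macro can hand back the MetaCF handle the way `redis_sets.rs` obtains it (the macro invocation and variable names here are illustrative, not the actual code):

```rust
// Illustrative fix: metadata reads must go through the same CF the batch writes to.
let (db, cfs) = get_db_and_cfs!(self, [ColumnFamilyIndex::MetaCF]); // assumed macro shape
let cf_meta = &cfs[0];

// Old (buggy): reads the default CF, so it never sees what batch.put(MetaCF, ..) wrote.
// let raw_meta = db.get(&meta_key).context(RocksSnafu)?;

// Fixed: read from MetaCF, mirroring the redis_sets.rs pattern quoted above.
let raw_meta = db.get_cf(cf_meta, &meta_key).context(RocksSnafu)?;
```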
```diff
  // Commit batch
  let mut batch = self.create_batch()?;
  for key_to_del in keys_to_delete {
      batch.delete(ColumnFamilyIndex::ListsDataCF, &key_to_del)?;
  }
  // Always preserve the metadata for empty lists to allow lpushx/rpushx operations
- batch.put(&meta_key, parsed_meta.value());
- db.write(batch).context(RocksSnafu)?;
+ batch.put(ColumnFamilyIndex::MetaCF, &meta_key, &parsed_meta.value())?;
+ batch.commit()?;
```
Empty-list handling is inconsistent (preserve meta in lpop/rpop vs delete meta elsewhere).
lpop/rpop/rpop_internal always write metadata even when the list becomes empty (Line 346 / Line 418 / Line 489), but ltrim/lrem may delete meta when empty. This can change “key exists” behavior and lpushx/rpushx semantics. Consider making empty-list deletion/preservation consistent across list ops (and aligned with expected Redis behavior for empty lists).
Also applies to: 412-420, 483-491, 794-806, 923-937
🤖 Prompt for AI Agents
In @src/storage/src/redis_lists.rs around lines 341 - 349, The code
inconsistently handles empty lists: lpop/rpop/rpop_internal currently preserve
list metadata when the list becomes empty while ltrim/lrem delete it, causing
differing "key exists" and lpushx/rpushx behavior; make this consistent by
changing the ltrim and lrem code paths to preserve metadata instead of deleting
the MetaCF entry when a list becomes empty (or alternatively change lpop/rpop to
delete if you prefer that semantics), updating the batch operations in the
affected functions (ltrim, lrem, and their internal helpers) so they write the
parsed_meta.value() to ColumnFamilyIndex::MetaCF rather than issuing
batch.delete, and ensure the same approach is applied to all referenced code
paths (including the rpop_internal and any other empty-list branches) so
key-exists semantics match across operations.
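If the "preserve metadata" direction is taken, the empty-list branches in `ltrim`/`lrem` would switch from a meta delete to a meta rewrite. A rough sketch using the names from the hunks above (`remaining_count` and the count setter are assumed helpers, not the actual API):

```rust
// Sketch: keep key-exists semantics identical to lpop/rpop when a list drains.
if remaining_count == 0 {
    // Instead of: batch.delete(ColumnFamilyIndex::MetaCF, &meta_key)?;
    // keep the (now empty) metadata so lpushx/rpushx still treat the key as existing.
    parsed_meta.set_count(0); // assumed setter; the real meta type may expose this differently
}
batch.put(ColumnFamilyIndex::MetaCF, &meta_key, &parsed_meta.value())?;
batch.commit()?;
```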
```diff
  // Use Batch for atomic operation
  let mut batch = self.create_batch()?;

  // Process all key-value pairs
  for (key, value) in kvs {
      let string_key = BaseKey::new(key);
      let string_value = StringValue::new(value.to_owned());
-     batch.put_cf(&cf, string_key.encode()?, string_value.encode());
+     batch.put(
+         ColumnFamilyIndex::MetaCF,
+         &string_key.encode()?,
+         &string_value.encode(),
+     )?;
  }

- // Atomic write of all key-value pairs
- db.write_opt(batch, &self.write_options)
-     .context(RocksSnafu)?;
-
- Ok(())
+ batch.commit()
  }
```
mset should lock keys (otherwise it bypasses the lock contract used by read-modify-write ops).
mset writes atomically at RocksDB level, but it can interleave with other commands that rely on per-key locks for correctness (e.g., incr_decr). Consider sorting keys and acquiring ScopeRecordLocks for all of them before the batch write.
🤖 Prompt for AI Agents
In @src/storage/src/redis_strings.rs around lines 1063 - 1079, The mset
implementation currently batches puts and commits without acquiring per-key
locks, which can race with ops that rely on ScopeRecordLock (e.g., incr_decr);
modify the mset function to first collect and sort the keys (use
BaseKey::new(key).encode() or the key strings), acquire a ScopeRecordLock for
each key in that sorted order (holding all locks), then perform
create_batch()/batch.put(...) for each kv and finally batch.commit(), releasing
locks after commit; ensure lock acquisition is exception-safe (release on error)
and reference ScopeRecordLock, mset, create_batch, batch.put, and batch.commit
in your change.
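A rough sketch of the suggested shape, assuming `kvs` is a slice of `(&[u8], &[u8])` pairs; the `ScopeRecordLock::new` constructor and the `lock_mgr` field are assumptions about this codebase, while the key/value encoding lines mirror the hunk above:

```rust
// Sketch: take per-key locks in a canonical (sorted, deduplicated) order before
// building the batch, so mset cannot interleave with lock-based ops like incr_decr.
let mut lock_keys: Vec<&[u8]> = kvs.iter().map(|(k, _)| *k).collect();
lock_keys.sort_unstable();
lock_keys.dedup();

// Hold every guard until after commit; dropping the Vec releases the locks.
let _guards: Vec<ScopeRecordLock> = lock_keys
    .iter()
    .map(|k| ScopeRecordLock::new(&self.lock_mgr, k)) // assumed constructor
    .collect();

let mut batch = self.create_batch()?;
for (key, value) in kvs {
    let string_key = BaseKey::new(key);
    let string_value = StringValue::new(value.to_owned());
    batch.put(
        ColumnFamilyIndex::MetaCF,
        &string_key.encode()?,
        &string_value.encode(),
    )?;
}
batch.commit()
```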
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
src/raft/src/network.rs (1)
383-405: Critical: HMAC payload doesn't cover message body (integrity bypass).

Right now `msg_type` is the only commitment to `self.message` for most variants, so an attacker can alter (e.g.) `AppendEntries` fields without changing `data_for_hmac`, and the HMAC would still verify. Please include the full `RaftMessage` (or a digest of it) in the authenticated bytes, and centralize the construction to avoid drift.

Proposed fix (commit HMAC to full message content; remove duplicated match)

```diff
 impl MessageEnvelope {
+    fn hmac_payload(&self) -> RaftResult<Vec<u8>> {
+        // IMPORTANT: commit to the full message content (not just its variant name)
+        bincode::serialize(&(self.message_id, self.from, self.to, self.timestamp, &self.message))
+            .map_err(|e| {
+                RaftError::Serialization(serde_json::Error::io(std::io::Error::new(
+                    std::io::ErrorKind::InvalidData,
+                    e.to_string(),
+                )))
+            })
+    }
+
     /// Add authentication to the message
     pub fn add_authentication(&mut self, auth: &NodeAuth) -> RaftResult<()> {
-        // Create serializable data without HMAC for authentication
-        // Use a simplified message representation for HMAC
-        let msg_type = match &self.message {
-            RaftMessage::AppendEntries(_) => "AppendEntries".to_string(),
-            RaftMessage::AppendEntriesResponse(_) => "AppendEntriesResponse".to_string(),
-            RaftMessage::Vote(_) => "Vote".to_string(),
-            RaftMessage::VoteResponse(_) => "VoteResponse".to_string(),
-            RaftMessage::InstallSnapshot(_) => "InstallSnapshot".to_string(),
-            RaftMessage::InstallSnapshotResponse(_) => "InstallSnapshotResponse".to_string(),
-            RaftMessage::Heartbeat { from, term } => format!("Heartbeat:{}:{}", from, term),
-            RaftMessage::HeartbeatResponse { from, success } => {
-                format!("HeartbeatResponse:{}:{}", from, success)
-            }
-        };
-        let data_for_hmac = format!(
-            "{}:{}:{}:{}:{}",
-            self.message_id, self.from, self.to, self.timestamp, msg_type
-        );
-
-        self.hmac = Some(auth.generate_hmac(data_for_hmac.as_bytes())?);
+        let payload = self.hmac_payload()?;
+        self.hmac = Some(auth.generate_hmac(&payload)?);
         Ok(())
     }

     /// Verify message authentication
     pub fn verify_authentication(&self, auth: &NodeAuth) -> bool {
         if let Some(ref expected_hmac) = self.hmac {
-            // Recreate the same data format used for HMAC generation
-            let msg_type = match &self.message {
-                RaftMessage::AppendEntries(_) => "AppendEntries".to_string(),
-                RaftMessage::AppendEntriesResponse(_) => "AppendEntriesResponse".to_string(),
-                RaftMessage::Vote(_) => "Vote".to_string(),
-                RaftMessage::VoteResponse(_) => "VoteResponse".to_string(),
-                RaftMessage::InstallSnapshot(_) => "InstallSnapshot".to_string(),
-                RaftMessage::InstallSnapshotResponse(_) => "InstallSnapshotResponse".to_string(),
-                RaftMessage::Heartbeat { from, term } => format!("Heartbeat:{}:{}", from, term),
-                RaftMessage::HeartbeatResponse { from, success } => {
-                    format!("HeartbeatResponse:{}:{}", from, success)
-                }
-            };
-            let data_for_hmac = format!(
-                "{}:{}:{}:{}:{}",
-                self.message_id, self.from, self.to, self.timestamp, msg_type
-            );
-
-            auth.verify_hmac(data_for_hmac.as_bytes(), expected_hmac)
+            let payload = match self.hmac_payload() {
+                Ok(v) => v,
+                Err(_) => return false,
+            };
+            auth.verify_hmac(&payload, expected_hmac)
         } else {
             false
         }
     }
 }
```

Also applies to: 408-432
src/storage/src/redis_sets.rs (1)
2375-2398: Potential atomicity issue: two separate batch commits.

Similar to `sdiffstore`, this helper method performs two sequential batch commits: clearing the destination (lines 2375-2383 or 2387-2394) and adding members via `sadd` (line 2398). This breaks atomicity for `sinterstore` and `sunionstore` operations that depend on this helper.

Both clearing and adding operations should be combined into a single atomic batch to prevent partial state if the second operation fails.
src/storage/src/redis_zsets.rs (3)
59-65: Bug: score equality check in `zadd` compares the wrong value.

`if (existing_score, sm.score).1.abs() < f64::EPSILON` evaluates `sm.score.abs()`, not `|existing_score - sm.score|`. This makes "no-op when score unchanged" behave incorrectly and triggers unnecessary delete+put.

Proposed fix

```diff
-            if (existing_score, sm.score).1.abs() < f64::EPSILON {
+            if (existing_score - sm.score).abs() < f64::EPSILON {
                 // Score is the same, skip
                 continue;
             } else {
```

Also applies to: 101-180, 188-213
321-327: `zincrby`: batch update is good, but lock-dropping + re-calling `zadd` can break atomicity under concurrency.

When meta is invalid/missing, the function drops `_lock` then calls `zadd`, allowing another writer to interleave. Consider an internal `zadd_no_lock` (or a shared helper that accepts "lock already held") to keep the whole operation linearizable.

Also applies to: 383-410
1234-1265: `zinterstore`/`zunionstore`: destination delete + later `zadd` is not atomic.

You delete destination under a lock, release the lock, then later `zadd` repopulates. Another operation on `destination` can slip in between. If strict Redis semantics matter here, this should be refactored to a single locked write plan.

src/storage/src/redis_lists.rs (1)
319-334: Metadata decrement should match actual keys found; add validation or assertions when data keys are missing.
`keys_to_delete` is populated only when `get_cf` returns a value, but `pop_count` unconditionally decrements the metadata count and indices. If any expected data key is absent, the count will drift even though fewer items were actually popped. The same issue exists in RPOP, RPOPLPUSH, and LREM. Either assert that all keys exist (if guaranteed by design) or adjust the count decrement to match `keys_to_delete.len()` or `result.len()`.

Also applies to: 341-349, 390-405, 412-420, 461-476, 483-491
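A hedged sketch of the second option, reusing the variable names quoted in this comment; the metadata accessors are assumptions about the actual meta value type:

```rust
// Sketch: decrement by what was actually found instead of the requested pop_count.
let actually_popped = keys_to_delete.len() as u64;
debug_assert_eq!(
    actually_popped, pop_count,
    "list data key missing; metadata count would drift"
);
// `count()` / `set_count()` are placeholders for the real meta-value accessors.
parsed_meta.set_count(parsed_meta.count() - actually_popped);
batch.put(ColumnFamilyIndex::MetaCF, &meta_key, &parsed_meta.value())?;
```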
🧹 Nitpick comments (7)
src/storage/src/redis_strings.rs (3)
581-588: Batch migration for single-key writes looks consistent (put/delete + commit).

Nice mechanical refactor: each mutation now commits through the Batch abstraction, aligning with the PR goal (standalone vs cluster).
Consider a tiny helper to reduce repetition/error-surface (create batch → put MetaCF → commit), since this pattern appears many times in this file.
Also applies to: 635-642, 698-705, 759-766, 831-838, 1212-1219, 1284-1291, 1429-1436, 1836-1843, 1856-1863, 1877-1884, 1938-1945, 1979-1986
2110-2151: `del_key`: good "collect then batch delete", but watch memory/latency for large composite keys.

Collecting all derived keys across CFs into `keys_to_delete` can be large and then written as one huge batch; consider chunked commits (e.g., commit every N deletions) to cap memory and avoid oversized WriteBatch.

2226-2261: `flush_db`: same batching concern; consider chunking or CF-level primitives.

For large DBs, collecting all keys into memory is costly. If RocksDB drop/truncate CF isn't an option here, chunked batch commits would at least bound memory.
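A chunked variant could look roughly like this (the chunk size, the single-CF loop, and the error handling are illustrative; note that chunking gives up all-or-nothing atomicity for the whole delete):

```rust
// Sketch: bound the size of any single batch by committing every CHUNK deletions.
const CHUNK: usize = 10_000; // illustrative threshold

let mut batch = self.create_batch()?;
let mut pending = 0usize;
for key_to_del in &keys_to_delete {
    // The real code picks the CF per derived key; MetaCF is just a placeholder here.
    batch.delete(ColumnFamilyIndex::MetaCF, key_to_del)?;
    pending += 1;
    if pending == CHUNK {
        batch.commit()?;
        batch = self.create_batch()?;
        pending = 0;
    }
}
if pending > 0 {
    batch.commit()?;
}
```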
src/storage/src/redis_lists.rs (2)
718-719: `ltrim`: meta delete vs update is clear; consider chunking deletes for big lists.

Logic is readable and the batch write is clean. For very large lists, the delete-key accumulation can be big; chunking commits would reduce peak memory.
Also applies to: 754-757, 764-767, 773-774, 781-782, 791-792, 794-806
888-892: `lrem`: batch rewrite is fine; watch O(N) memory for large lists.

Reads all elements and rewrites the whole list; if that's already the intended approach, the batch refactor is consistent. If not, this is a hotspot to optimize later.
Also applies to: 898-904, 914-915, 923-937
src/storage/src/redis_hashes.rs (2)
293-314: `hset`: batch writes look good; consider deduplicating the "create_new_hash" logic across methods.

The inline closure pattern repeats in `hset`, `hmset`, `hsetnx`, `hincrby`, `hincrbyfloat`. A small private helper (e.g., `fn init_hash_with_fields(&self, ...)`) would reduce drift risk.

Also applies to: 346-358, 376-383, 395-408, 414-416
904-926: Error messages for numeric overflow/underflow should be consistent across string and hash operations.

There are inconsistencies in error messaging:

- Integer overflow: hashes report `"integer overflow"` (line 994) while strings report `"increment or decrement would overflow"` (strings.rs:1203)
- Non-integer parse errors: hashes report `"hash value is not an integer"` (line 987) while strings report `"value is not an integer or out of range"` (strings.rs:1191)
- Float parse errors and NaN/Infinity checks are already aligned
For consistency with Redis semantics and client expectations, align these messages across both data types.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- src/raft/src/network.rs
- src/storage/src/batch.rs
- src/storage/src/lib.rs
- src/storage/src/redis.rs
- src/storage/src/redis_hashes.rs
- src/storage/src/redis_lists.rs
- src/storage/src/redis_sets.rs
- src/storage/src/redis_strings.rs
- src/storage/src/redis_zsets.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- src/storage/src/lib.rs
- src/storage/src/redis.rs
- src/storage/src/batch.rs
🧰 Additional context used
🧬 Code graph analysis (3)
src/storage/src/redis_strings.rs (2)
- src/storage/src/redis.rs (1): new (115-149)
- src/engine/src/rocksdb_engine.rs (1): db (38-40)

src/storage/src/redis_hashes.rs (3)
- src/storage/src/base_meta_value_format.rs (2): new (56-63), new (103-147)
- src/storage/src/base_key_format.rs (1): key (88-90)
- src/storage/src/redis_multi.rs (1): exists (130-152)

src/storage/src/redis_zsets.rs (1)
- src/storage/src/batch.rs (2): new (113-125), new (221-223)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cargo clippy (macos-latest)
- GitHub Check: build and test (macos-latest)
- GitHub Check: build and test (windows-latest)
🔇 Additional comments (15)
src/storage/src/redis_sets.rs (5)
25-25: LGTM! Clean import refactoring.

The removal of `WriteBatch` from the imports is correct and aligns with the batch abstraction introduced in this PR.

151-168: LGTM! Atomic batch operations correctly implemented.

The refactoring properly uses the batch abstraction with a single atomic commit for all metadata and member updates.

472-523: LGTM! Proper batch accumulation pattern.

The code correctly collects keys to delete, then applies all deletions and metadata updates in a single atomic batch commit.

635-659: LGTM! Atomic batch operations for member removal.

The batch correctly accumulates all member deletions and metadata updates before committing atomically.

795-837: LGTM! Atomic cross-set move operation.

The batch correctly ensures atomicity for the move operation across both source and destination sets with a single commit.
src/storage/src/redis_lists.rs (2)
105-108: `push_core`: delete/put ordering and meta update in the same commit looks right.

Collecting `deletes` then `puts`, and finally writing meta in the same batch should preserve the intended atomicity for list mutations (especially when reorganizing and bumping version).

Also applies to: 125-126, 156-157, 172-173, 184-185, 208-209, 224-225, 236-246

1032-1035: `linsert`: version bump + delete old version keys + write new version keys is a solid pattern.

This avoids in-place shifting complexity and keeps atomicity via a single batch commit.
Also applies to: 1041-1042, 1056-1057, 1063-1073
src/storage/src/redis_zsets.rs (3)
30-30: Import change only.
792-798: `zrem`: batch delete + meta update in one commit is correct.

Nice that meta removal when count hits zero is handled in the same batch.
Also applies to: 824-877
1473-1479: Range removals: batched member+score deletions + meta update are consistent.

Pattern is uniform and easy to reason about after the refactor.
Also applies to: 1547-1570, 1670-1693, 1712-1718, 1773-1796
src/storage/src/redis_hashes.rs (4)
23-23: Import change only.
86-122: `hdel`: collect-then-batch delete is solid.

Good: existence checks are done on the snapshot, then deletes + meta update are committed together via Batch.

637-665: `hmset`: good batching and correct "update meta only if new fields added" optimization.

`data_to_write` + conditional meta put keeps writes minimal while still committing atomically.

Also applies to: 694-711, 715-755, 758-759

779-800: `hsetnx`: batch path matches expected semantics.

The new-field path updates meta + data in one commit; existing field returns early without writes.
Also applies to: 832-844, 865-877, 883-885
src/storage/src/redis_strings.rs (1)
318-325: No issue detected. MetaCF is explicitly the default RocksDB column family (mapped to "default" in the CF name function). Reads via `db.get_opt()` and writes via `batch.put(ColumnFamilyIndex::MetaCF, ...)` both target the same CF, so there is no read/write inconsistency.
```diff
     let mut batch = self.create_batch()?;
     for key_to_del in &keys_to_delete {
         batch.delete(ColumnFamilyIndex::SetsDataCF, key_to_del)?;
     }
     if should_delete_meta {
         batch.delete(ColumnFamilyIndex::MetaCF, &dest_base_meta_key)?;
     }
     batch.commit()?;
     return Ok(0);
 }

 // Add all difference members to the destination set
 let member_refs: Vec<&[u8]> = diff_members.iter().map(|s| s.as_bytes()).collect();

 // Write the batch to clear destination first
-db.write(batch).context(RocksSnafu)?;
+let mut batch = self.create_batch()?;
+for key_to_del in &keys_to_delete {
+    batch.delete(ColumnFamilyIndex::SetsDataCF, key_to_del)?;
+}
+if should_delete_meta {
+    batch.delete(ColumnFamilyIndex::MetaCF, &dest_base_meta_key)?;
+}
+batch.commit()?;

 // Now add the new members
 let added = self.sadd(destination, &member_refs)?;
```
Potential atomicity issue: two separate batch commits.
The operation performs two sequential batch commits: (1) clearing the destination set (lines 2248-2255 and 2263-2270), and (2) adding new members via sadd (line 2273). If the second operation fails or the process crashes between commits, the destination set will be left empty instead of being restored to its original state or containing the expected difference result.
Consider refactoring to combine all operations (clear + add) into a single batch commit to ensure atomicity. This would require inlining the add logic instead of calling sadd, or extending sadd to accept an optional existing batch.
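A single-commit shape might look roughly like this; `sadd_into_batch` is a hypothetical helper (it does not exist in this PR), standing in for either inlining the add logic or letting `sadd` stage into a caller-provided batch:

```rust
// Sketch: clear the old destination and stage the new members in ONE batch,
// so a failure between steps can never leave the destination half-written.
let mut batch = self.create_batch()?;

// 1) Stage removal of the existing destination contents.
for key_to_del in &keys_to_delete {
    batch.delete(ColumnFamilyIndex::SetsDataCF, key_to_del)?;
}
batch.delete(ColumnFamilyIndex::MetaCF, &dest_base_meta_key)?;

// 2) Stage the new meta + member writes (hypothetical helper; today this logic
//    lives inside `sadd`, which commits a batch of its own).
self.sadd_into_batch(&mut batch, destination, &member_refs)?;

// 3) One atomic commit for the whole *STORE operation.
batch.commit()?;
```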
```diff
  // Use Batch for atomic operation
  let mut batch = self.create_batch()?;

  // Process all key-value pairs
  for (key, value) in kvs {
      let string_key = BaseKey::new(key);
      let string_value = StringValue::new(value.to_owned());
-     batch.put_cf(&cf, string_key.encode()?, string_value.encode());
+     batch.put(
+         ColumnFamilyIndex::MetaCF,
+         &string_key.encode()?,
+         &string_value.encode(),
+     )?;
  }

- // Atomic write of all key-value pairs
- db.write_opt(batch, &self.write_options)
-     .context(RocksSnafu)?;
-
- Ok(())
+ batch.commit()
  }
```
MSET/MSETNX: batch commit achieves storage-level atomicity; verify semantics under concurrency.
MSET is fine as “all-or-nothing write” at RocksDB batch level. For MSETNX, the existence check happens before the batch write; without multi-key locking or a snapshot+CAS-like guarantee, concurrent writers can violate Redis’s “set none if any exists” semantics. If that’s an accepted limitation today, it’s worth documenting near msetnx.
#!/bin/bash
set -euo pipefail
# See whether multi-key ops have a canonical lock-order helper (or are known best-effort).
rg -n "LockMgr|ScopeRecordLock|lock order|deadlock|multi[-_ ]key" -S src/storage/srcAlso applies to: 1106-1155
Related Issue

Closes #222

Background

The storage layer currently uses RocksDB's WriteBatch directly for batched writes. That approach is tightly coupled to the underlying storage engine and cannot support the upcoming cluster mode, where writes must go through Raft consensus.

Solution

Introduce a Batch trait abstraction layer that decouples batched write operations from the concrete implementation. A rough sketch of the shape follows.
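The sketch below approximates the trait as it is used at the call sites in this PR; the exact signatures live in `src/storage/src/batch.rs` and may differ in detail:

```rust
// Approximation of the Batch abstraction, inferred from how the migrated
// call sites use it; treat the exact types and error handling as assumptions.
pub trait Batch {
    fn put(&mut self, cf: ColumnFamilyIndex, key: &[u8], value: &[u8]) -> Result<()>;
    fn delete(&mut self, cf: ColumnFamilyIndex, key: &[u8]) -> Result<()>;
    fn commit(self: Box<Self>) -> Result<()>;
}

// Call-site pattern repeated across the migrated files:
//     let mut batch = self.create_batch()?;   // factory picks the implementation
//     batch.put(ColumnFamilyIndex::MetaCF, &meta_key, &meta_value)?;
//     batch.delete(ColumnFamilyIndex::SetsDataCF, &member_key)?;
//     batch.commit()?;                         // one atomic write
```

In standalone mode the batch is backed by a RocksDB WriteBatch; in cluster mode the same call sites can be served by a log/Raft-backed implementation without changing command code.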
Main Changes

- Added a `batch.rs` module containing the Batch trait and two implementations
- Added a `create_batch()` factory method
- Migrated call sites to the Batch API: `redis_strings.rs` (2 sites), `redis_hashes.rs` (6 sites), `redis_lists.rs` (7 sites), `redis_sets.rs` (6 sites), `redis_zsets.rs` (8 sites)

Testing
Checklist
Summary by CodeRabbit
New Features
Refactor
Tests & Docs