A replicated, log-structured storage engine with Raft consensus, written in C++17.
- Log-Structured Storage: Append-only segments with in-memory indexing
- Segment Compaction: Background compaction removes deleted keys and old versions
- Raft Consensus: Leader election and log replication for fault tolerance
- High Performance: Optimized for throughput with batching and mmap reads
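
To make the compaction feature concrete, here is a minimal sketch of merging segments so that only the latest live value per key survives. The `Record`, `Segment`, and `compact` names are hypothetical and are not taken from this repository; the real compactor works on on-disk segment files rather than in-memory vectors.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record: an empty value marks a deleted key (a tombstone).
struct Record {
    std::string key;
    std::optional<std::string> value;
};

// Hypothetical in-memory stand-in for an append-only segment file.
using Segment = std::vector<Record>;

// Merges segments given in oldest-to-newest order into one segment that
// keeps only the most recent live value for each key.
Segment compact(const std::vector<Segment>& segments) {
    std::map<std::string, std::optional<std::string>> latest;
    for (const auto& segment : segments) {
        for (const auto& record : segment) {
            latest[record.key] = record.value;  // later records overwrite earlier ones
        }
    }

    Segment merged;
    for (const auto& [key, value] : latest) {
        if (value.has_value()) {  // drop keys whose latest record is a tombstone
            merged.push_back(Record{key, value});
        }
    }
    return merged;
}
```

Dropping tombstones like this is only safe when every older segment that might still hold the key is part of the same merge; otherwise the tombstone has to be carried forward.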
┌─────────────────────────────────────────────────────────────┐
│                         Client API                          │
├─────────────────────────────────────────────────────────────┤
│                       Raft Consensus                        │
│    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│    │   Leader    │    │  Follower   │    │  Candidate  │    │
│    │  Election   │    │ Replication │    │   Voting    │    │
│    └─────────────┘    └─────────────┘    └─────────────┘    │
├─────────────────────────────────────────────────────────────┤
│                     Network Layer (TCP)                     │
├─────────────────────────────────────────────────────────────┤
│                   Log-Structured Storage                    │
│    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│    │  Segments   │    │    Index    │    │  Compactor  │    │
│    │(Append-only)│    │ (In-memory) │    │ (Background)│    │
│    └─────────────┘    └─────────────┘    └─────────────┘    │
└─────────────────────────────────────────────────────────────┘
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)   # on macOS, nproc may be missing: use make -j$(sysctl -n hw.ncpu)

Start a 3-node cluster:
# Terminal 1
./dist_log_engine --id 1 --port 9001 --peers "localhost:9002,localhost:9003"
# Terminal 2
./dist_log_engine --id 2 --port 9002 --peers "localhost:9001,localhost:9003"
# Terminal 3
./dist_log_engine --id 3 --port 9003 --peers "localhost:9001,localhost:9002"

Or use the provided script:
./scripts/run_cluster.sh

Command-line options:
--id <id>               Node ID (default: 1)
--host <host>           Host address (default: 127.0.0.1)
--port <port>           Port number (default: 9001)
--peers <peers>         Comma-separated peer list (host:port,...)
--data-dir <dir>        Data directory (default: ./data)
--segment-size <size>   Segment size in bytes (default: 64MB)
--log-level <level>     Log level: debug, info, warn, error
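
As an illustration of the `--peers` format, the sketch below splits a `host:port,host:port` string into structured entries and rejects malformed input. `Peer` and `parse_peers` are hypothetical names chosen for this example; the actual option parsing in `src/` may differ.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type: one entry of the --peers list.
struct Peer {
    std::string host;
    std::uint16_t port;
};

// Parses "host:port,host:port,..." into Peer entries.
// Returns std::nullopt if any entry is malformed; an empty string
// yields an empty list (a node with no peers).
std::optional<std::vector<Peer>> parse_peers(const std::string& arg) {
    std::vector<Peer> peers;
    std::stringstream stream(arg);
    std::string item;
    while (std::getline(stream, item, ',')) {
        const auto colon = item.rfind(':');
        if (colon == std::string::npos || colon == 0 || colon + 1 == item.size()) {
            return std::nullopt;  // missing host, missing port, or no separator
        }
        const std::string port_text = item.substr(colon + 1);
        if (port_text.size() > 5 ||
            port_text.find_first_not_of("0123456789") != std::string::npos) {
            return std::nullopt;  // port must be a short run of digits
        }
        const unsigned long port = std::stoul(port_text);
        if (port == 0 || port > 65535) {
            return std::nullopt;  // outside the valid TCP port range
        }
        peers.push_back(Peer{item.substr(0, colon), static_cast<std::uint16_t>(port)});
    }
    return peers;
}
```

With the cluster example above, `parse_peers("localhost:9002,localhost:9003")` yields two entries, while an entry such as `localhost` without a port is rejected.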
# Unit and integration tests
./build/test_segment
./build/test_log_store
./build/test_raft
./build/test_integration
# Benchmark
./build/benchmark --ops 100000

Benchmark results (on Apple M-series, single node):
| Workload | Throughput | Median Latency |
|---|---|---|
| Reads | >1M ops/sec | <0.001ms |
| Writes (batched) | ~30K ops/sec | <0.1ms |
| Mixed (20% writes) | ~50K ops/sec | <0.02ms |
dist_log_engine/
├── src/
│ ├── common/ # Utilities (config, CRC32, thread pool)
│ ├── storage/ # Log-structured storage engine
│ ├── raft/ # Raft consensus implementation
│ ├── network/ # TCP server/client, RPC
│ ├── server/ # Server integration
│ └── main.cpp # Entry point
├── include/ # Public client library header
├── tests/ # Unit and integration tests
├── benchmark/ # Performance benchmarks
└── scripts/ # Helper scripts
- Segment: Fixed-size append-only log files with CRC32 checksums
- Index: In-memory hash map for O(1) key lookups
- LogStore: Manages segments, handles recovery, batches writes
- Compactor: Background thread that merges segments, dropping deleted keys and old versions
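
The interplay between checksummed segment records and the in-memory index can be sketched as follows. The record layout used here (`[crc32][key_len][value_len][key][value]`, little-endian lengths) and the `ToySegment` class are assumptions made for illustration, not the engine's actual on-disk format; a byte vector stands in for the segment file.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Bitwise CRC-32 (IEEE polynomial, reflected form 0xEDB88320).
std::uint32_t crc32_ieee(const std::uint8_t* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Hypothetical in-memory segment: append-only bytes plus a hash index
// mapping each key to the offset of its most recent record.
class ToySegment {
public:
    // Appends one record and returns the offset at which it starts.
    std::size_t append(const std::string& key, const std::string& value) {
        const std::size_t offset = buf_.size();
        std::vector<std::uint8_t> body;
        put_u32(body, static_cast<std::uint32_t>(key.size()));
        put_u32(body, static_cast<std::uint32_t>(value.size()));
        body.insert(body.end(), key.begin(), key.end());
        body.insert(body.end(), value.begin(), value.end());
        put_u32(buf_, crc32_ieee(body.data(), body.size()));  // checksum covers the body
        buf_.insert(buf_.end(), body.begin(), body.end());
        index_[key] = offset;  // the latest write wins
        return offset;
    }

    // O(1) index lookup, then a checksum check before returning the value.
    std::optional<std::string> get(const std::string& key) const {
        const auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        const std::size_t off = it->second;
        const std::uint32_t stored_crc = get_u32(off);
        const std::size_t key_len = get_u32(off + 4);
        const std::size_t value_len = get_u32(off + 8);
        const std::size_t body_len = 8 + key_len + value_len;
        if (off + 4 + body_len > buf_.size()) return std::nullopt;  // truncated record
        if (crc32_ieee(buf_.data() + off + 4, body_len) != stored_crc) return std::nullopt;
        const char* value = reinterpret_cast<const char*>(buf_.data() + off + 12 + key_len);
        return std::string(value, value_len);
    }

    std::vector<std::uint8_t>& raw() { return buf_; }  // exposed to simulate corruption

private:
    static void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
        for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
    }
    std::uint32_t get_u32(std::size_t off) const {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(buf_[off + i]) << (8 * i);
        return v;
    }

    std::vector<std::uint8_t> buf_;
    std::unordered_map<std::string, std::size_t> index_;
};
```

Because the index always points at the newest record for a key, older versions remain in the segment as dead bytes until compaction reclaims them.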
- RaftNode: State machine (Follower/Candidate/Leader)
- RaftLog: Persistent log of commands
- State: Persistent term and vote tracking
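
The role of the persistent term and vote can be illustrated with the vote-granting rule from the Raft protocol. This is a simplified sketch with hypothetical type and member names (`ToyRaftNode`, `on_request_vote`, and so on), not the project's `RaftNode`; in particular, a real node would write the term and vote to disk before replying.

```cpp
#include <cstdint>
#include <optional>

enum class Role { Follower, Candidate, Leader };

// State that has to survive restarts: the current term and the vote cast in it.
struct PersistentState {
    std::uint64_t current_term = 0;
    std::optional<std::uint32_t> voted_for;
};

struct VoteRequest {
    std::uint64_t term;
    std::uint32_t candidate_id;
    std::uint64_t last_log_index;
    std::uint64_t last_log_term;
};

struct VoteReply {
    std::uint64_t term;
    bool granted;
};

// Hypothetical, heavily simplified node: only the RequestVote handler.
struct ToyRaftNode {
    Role role = Role::Follower;
    PersistentState state;
    std::uint64_t last_log_index = 0;
    std::uint64_t last_log_term = 0;

    VoteReply on_request_vote(const VoteRequest& req) {
        // Requests from an older term are always rejected.
        if (req.term < state.current_term) return {state.current_term, false};

        // A newer term resets the vote and forces the node back to follower.
        if (req.term > state.current_term) {
            state.current_term = req.term;
            state.voted_for.reset();
            role = Role::Follower;
        }

        // The candidate's log must be at least as up to date as ours.
        const bool log_ok =
            req.last_log_term > last_log_term ||
            (req.last_log_term == last_log_term && req.last_log_index >= last_log_index);
        // At most one vote per term, but repeating the same vote is allowed.
        const bool can_vote = !state.voted_for || *state.voted_for == req.candidate_id;

        if (log_ok && can_vote) {
            state.voted_for = req.candidate_id;
            return {state.current_term, true};
        }
        return {state.current_term, false};
    }
};
```

Persisting `voted_for` together with `current_term` is what prevents a restarted node from voting twice in the same term.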
- TCPServer: Async server using epoll (Linux) / kqueue (macOS)
- TCPClient: Peer connections with automatic reconnection
- RPC: Binary protocol for Raft messages
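
A binary protocol over TCP needs message framing, since a single read may return part of a message or several messages at once. The sketch below assumes a layout of a 4-byte big-endian length followed by a 1-byte message type and the payload; this layout and the names `MsgType`, `Frame`, `encode_frame`, and `FrameDecoder` are illustrative assumptions, not the project's actual wire format.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical message tags for the two core Raft RPCs.
enum class MsgType : std::uint8_t { RequestVote = 1, AppendEntries = 2 };

struct Frame {
    MsgType type;
    std::string payload;
};

// Encodes a frame as [u32 length, big-endian][u8 type][payload],
// where length counts the type byte plus the payload.
std::vector<std::uint8_t> encode_frame(const Frame& frame) {
    const auto length = static_cast<std::uint32_t>(1 + frame.payload.size());
    std::vector<std::uint8_t> out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>(length >> shift));
    }
    out.push_back(static_cast<std::uint8_t>(frame.type));
    out.insert(out.end(), frame.payload.begin(), frame.payload.end());
    return out;
}

// Accumulates bytes from the socket and yields complete frames.
class FrameDecoder {
public:
    void feed(const std::uint8_t* data, std::size_t size) {
        buf_.insert(buf_.end(), data, data + size);
    }

    // Returns the next complete frame, or std::nullopt if more bytes are needed.
    std::optional<Frame> next() {
        if (buf_.size() < 4) return std::nullopt;
        const std::uint32_t length = (static_cast<std::uint32_t>(buf_[0]) << 24) |
                                     (static_cast<std::uint32_t>(buf_[1]) << 16) |
                                     (static_cast<std::uint32_t>(buf_[2]) << 8) |
                                     static_cast<std::uint32_t>(buf_[3]);
        if (length == 0) throw std::runtime_error("malformed frame: empty body");
        const std::size_t total = 4 + static_cast<std::size_t>(length);
        if (buf_.size() < total) return std::nullopt;

        const auto body_end = buf_.begin() + static_cast<std::ptrdiff_t>(total);
        Frame frame{static_cast<MsgType>(buf_[4]), std::string(buf_.begin() + 5, body_end)};
        buf_.erase(buf_.begin(), body_end);  // keep any bytes of the following frame
        return frame;
    }

private:
    std::vector<std::uint8_t> buf_;
};
```

A production decoder would also cap the accepted length so that a corrupt or hostile peer cannot force an arbitrarily large buffer.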
License: MIT