Add ring benchmarks #13
base: develop
Conversation
Compares against buffered channels and a slice protected by a sync.Mutex, two common primitives for queue behavior. The comparison covers SPSC, MPSC, SPMC, and MPMC (single/multiple producer × single/multiple consumer), varies batch sizes and payload sizes, and measures both plain enqueue/dequeue and batched enqueue/dequeue. As tested on my M3 Pro MacBook, the ring buffer implementation is substantially better under any form of contention. The only place where it does not shine is SPSC, which is no surprise: the lock-free implementation is designed for concurrent access.
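To reproduce the comparison locally, something like the following should work; the `./ring` package path follows the file list in this review, and the benchmark name regex is only a sketch based on the sub-benchmark labels (`batch%d_%dB`) shown below:

```sh
# Run every ring/channel/mutex-slice benchmark with allocation stats.
go test -bench . -benchmem ./ring/

# Narrow to one contention pattern, e.g. only the MPMC variants.
go test -bench MPMC -benchmem ./ring/
```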
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35 minutes
Actionable comments posted: 3
🧹 Nitpick comments (3)
ring/mutex_slice_bench_test.go (3)
29-42: Document the capacity limitation.

The queue only fills to `len-1` capacity (line 32), leaving one slot unused. This is a valid design to avoid full/empty ambiguity when `head == tail`, but it should be documented in the type definition or constructor so users understand the effective capacity is `size-1`.
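A minimal way to address this, sketched against the constructor shown later in this review (the wording of the comment is only a suggestion):

```go
// newMutexSliceQueue returns a queue backed by a slice of length size.
// One slot is intentionally left unused to distinguish a full queue from
// an empty one when head == tail, so the effective capacity is size-1.
// size must be a power of two for the mask-based indexing to be correct.
func newMutexSliceQueue[T any](size int) *mutexSliceQueue[T] {
	return &mutexSliceQueue[T]{
		buf:  make([]T, size),
		mask: size - 1,
	}
}
```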
61-78: Remove unnecessary type cast.

Line 65: `q.mask` is already declared as `int` (line 17), so the cast `int(q.mask)` is unnecessary.

```diff
-	if q.used+n <= int(q.mask)+1 {
+	if q.used+n <= q.mask+1 {
```
80-103: Optimize slice operations for benchmark accuracy.

The method pre-allocates a slice with capacity `n` (line 82) but then uses `append` (line 92), which is less efficient than direct indexing. For benchmark code, this overhead could affect timing results. Apply this diff to use direct indexing:

```diff
 func (q *mutexSliceQueue[T]) DequeueBatch(n int) []T {
 	var zero T
-	items := make([]T, 0, n)
 	for {
 		q.mu.Lock()
 		if q.used > 0 {
 			if n > q.used {
 				n = q.used
 			}
+			items := make([]T, n)
 			for i := 0; i < n; i++ {
 				baseIdx := q.head & q.mask
-				items = append(items, q.buf[baseIdx])
+				items[i] = q.buf[baseIdx]
 				q.buf[baseIdx] = zero
 				q.head++
 			}
 			q.used -= n
 			q.mu.Unlock()
 			return items
 		}
 		q.mu.Unlock()
 		runtime.Gosched()
 	}
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- buffer/BUILD.bazel (1 hunks)
- maps/BUILD.bazel (1 hunks)
- ring/BUILD.bazel (1 hunks)
- ring/channel_bench_test.go (1 hunks)
- ring/common_bench_test.go (1 hunks)
- ring/mutex_slice_bench_test.go (1 hunks)
- ring/ring_bench_test.go (1 hunks)
- ring/ring_test.go (0 hunks)
- set/BUILD.bazel (1 hunks)
- slices/BUILD.bazel (1 hunks)
💤 Files with no reviewable changes (1)
- ring/ring_test.go
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: freshdresch
Repo: freshdresch/go-data-structures PR: 7
File: maps/BUILD.bazel:15-17
Timestamp: 2025-09-30T05:42:22.282Z
Learning: In the freshdresch/go-data-structures repository with Bazel and Bzlmod enabled, the testify library exports targets with package names (e.g., `com_github_stretchr_testify//assert:assert`) rather than the traditional `:go_default_library` naming convention.
📚 Learning: 2025-09-30T05:42:22.282Z
Learnt from: freshdresch
Repo: freshdresch/go-data-structures PR: 7
File: maps/BUILD.bazel:15-17
Timestamp: 2025-09-30T05:42:22.282Z
Learning: In the freshdresch/go-data-structures repository with Bazel and Bzlmod enabled, the testify library exports targets with package names (e.g., `com_github_stretchr_testify//assert:assert`) rather than the traditional `:go_default_library` naming convention.
Applied to files:
- slices/BUILD.bazel
- buffer/BUILD.bazel
- ring/BUILD.bazel
- maps/BUILD.bazel
- set/BUILD.bazel
📚 Learning: 2025-09-30T05:43:34.030Z
Learnt from: freshdresch
Repo: freshdresch/go-data-structures PR: 7
File: BUILD.bazel:5-16
Timestamp: 2025-09-30T05:43:34.030Z
Learning: In rules_go, a go_library target with srcs = [] but with deps is valid and can serve as a meta-package or aggregator that re-exports other packages. The build does not fail when used this way.
Applied to files:
- slices/BUILD.bazel
- ring/BUILD.bazel
- maps/BUILD.bazel
- set/BUILD.bazel
📚 Learning: 2025-09-30T01:59:33.315Z
Learnt from: freshdresch
Repo: freshdresch/go-data-structures PR: 3
File: set/set.go:182-184
Timestamp: 2025-09-30T01:59:33.315Z
Learning: In the go-data-structures repository, the user freshdresch is the sole contributor working on early development in a private repo with no external consumers, so breaking changes to exported APIs are acceptable and backward compatibility concerns should not be raised.
Applied to files:
- buffer/BUILD.bazel
- maps/BUILD.bazel
🧬 Code graph analysis (1)
ring/ring_bench_test.go (1)
ring/ring.go (3)
- NewRing (61-88)
- WithMultiConsDequeue (55-59)
- WithMultiProdEnqueue (49-53)
🔇 Additional comments (17)
set/BUILD.bazel (1)
6-6: LGTM! Explicit importpath improves build clarity.

Adding the `importpath` attribute aligns with Go module best practices and makes the module path explicit for Bazel.

slices/BUILD.bazel (1)

6-6: LGTM! Consistent with other package configurations.

The `importpath` addition follows the same pattern as other packages in this PR.

maps/BUILD.bazel (1)

6-6: LGTM! Consistent importpath configuration.

The `importpath` attribute is correctly configured for the maps package.

buffer/BUILD.bazel (1)

10-10: LGTM! Explicit importpath added.

Consistent with other packages in this PR.
ring/common_bench_test.go (1)
3-8: LGTM! Well-organized benchmark parameters.

Centralizing benchmark configuration (ring size, batch sizes, payload sizes) in a common file makes it easy to adjust parameters consistently across all benchmark suites.
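For illustration, the shared parameter file plausibly looks something like the sketch below; the identifiers (`ringSize`, `batchSizes`, `payloadSizes`) are referenced elsewhere in this review, but the concrete values here are assumptions:

```go
package ring_test

// Illustrative values only; the real common_bench_test.go defines these
// identifiers but may use different sizes.
const ringSize = 1 << 12 // power of two, as the mutex-slice queue requires

var (
	batchSizes   = []int{1, 8, 64, 512}
	payloadSizes = []int{16, 256, 4096}
)
```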
ring/BUILD.bazel (2)
6-6: LGTM! Explicit importpath added.

Consistent with other packages in this PR.

10-37: Excellent test organization! Clear separation of concerns.

The restructuring separates internal white-box tests from external black-box benchmarks, with clear comments explaining each target's purpose. Using `deps` instead of `embed` for the benchmark target enforces the black-box testing approach.

ring/ring_bench_test.go (5)
11-26: LGTM! Well-structured SPSC sequential benchmark.

The benchmark correctly measures sequential single-producer single-consumer throughput with proper setup, timing, and byte counting.
28-59: LGTM! SPMC parallel benchmark correctly coordinated.

The single-producer multiple-consumer benchmark properly:
- Enables multi-consumer dequeue mode
- Coordinates 4 consumer goroutines with WaitGroup
- Balances operations (4 consumers × b.N each, matched by the producer's b.N×4); a sketch of this pattern follows the list
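A minimal sketch of that balancing pattern, using a buffered channel as a stand-in for the ring (the real benchmark uses the ring's multi-consumer mode via `WithMultiConsDequeue`; everything else here is illustrative):

```go
package ring_test

import (
	"sync"
	"testing"
)

// Each of the 4 consumers receives exactly b.N items, so the single
// producer must send 4*b.N items for every goroutine to finish.
func BenchmarkSPMCPattern(b *testing.B) {
	q := make(chan int, 1024) // stand-in for the ring buffer
	const consumers = 4

	var wg sync.WaitGroup
	wg.Add(consumers)
	for c := 0; c < consumers; c++ {
		go func() {
			defer wg.Done()
			for i := 0; i < b.N; i++ {
				<-q
			}
		}()
	}

	b.ResetTimer()
	for i := 0; i < b.N*consumers; i++ {
		q <- i
	}
	wg.Wait()
}
```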
61-92: LGTM! MPSC parallel benchmark correctly coordinated.

The multiple-producer single-consumer benchmark properly enables multi-producer mode and balances operations correctly.

94-141: LGTM! MPMC parallel benchmark correctly coordinated.

The multiple-producer multiple-consumer benchmark uses separate wait groups for producers and consumers, ensuring proper synchronization.

143-305: Batch benchmarks correctly mirror single-item patterns.

The batch variants (SPSC, SPMC, MPSC, MPMC) properly use `EnqueueBurst` and `DequeueBurst` operations with multi-dimensional payload arrays and maintain the same coordination patterns as their single-item counterparts.

ring/channel_bench_test.go (3)
36-51: LGTM! Sequential channel benchmark baseline.

Provides a good baseline for comparing ring buffer performance against Go channels in SPSC scenarios.

53-152: LGTM! Parallel channel benchmarks correctly coordinated.

The MPSC, SPMC, and MPMC benchmarks properly coordinate multiple goroutines and balance producer/consumer operations.

154-302: LGTM! Batch channel benchmarks mirror single-item patterns.

The batch variants use channels with `[][]byte` payloads and maintain proper coordination across all concurrency patterns.

ring/mutex_slice_bench_test.go (2)
44-59: LGTM!

The dequeue logic is correct. Zeroing out dequeued elements (line 50) is good practice to prevent memory leaks in long-lived queues.

105-221: LGTM!

The non-batch benchmark patterns (SPSC, MPSC, SPMC, MPMC) are correctly implemented with balanced enqueue/dequeue operations and appropriate synchronization.
```go
type ChanWrapper[T any] struct {
	ch chan T
}

func NewChanWrapper[T any](capacity int) *ChanWrapper[T] {
	return &ChanWrapper[T]{ch: make(chan T, capacity)}
}

func (c *ChanWrapper[T]) EnqueueBurst(batch []T) {
	for _, v := range batch {
		c.ch <- v
	}
}

func (c *ChanWrapper[T]) DequeueBurst(n int) []T {
	res := make([]T, 0, n)
	for i := 0; i < n; i++ {
		select {
		case v := <-c.ch:
			res = append(res, v)
		default:
			return res
		}
	}
	return res
}
```
Remove unused ChanWrapper type.
The ChanWrapper type and its methods are defined but never used in any of the benchmarks below. All benchmarks use raw channels directly.
Apply this diff to remove the unused code:
```diff
-type ChanWrapper[T any] struct {
-	ch chan T
-}
-
-func NewChanWrapper[T any](capacity int) *ChanWrapper[T] {
-	return &ChanWrapper[T]{ch: make(chan T, capacity)}
-}
-
-func (c *ChanWrapper[T]) EnqueueBurst(batch []T) {
-	for _, v := range batch {
-		c.ch <- v
-	}
-}
-
-func (c *ChanWrapper[T]) DequeueBurst(n int) []T {
-	res := make([]T, 0, n)
-	for i := 0; i < n; i++ {
-		select {
-		case v := <-c.ch:
-			res = append(res, v)
-		default:
-			return res
-		}
-	}
-	return res
-}
```

Alternatively, if you intended to use ChanWrapper for testing, you would need to update the benchmarks to actually use it.
🤖 Prompt for AI Agents
In ring/channel_bench_test.go around lines 9 to 34, the ChanWrapper type and its
methods (NewChanWrapper, EnqueueBurst, DequeueBurst) are declared but never used
by the benchmarks; remove the entire ChanWrapper type and its associated methods
from that section to eliminate dead code, or if you intended to test the wrapper
instead of raw channels, update the benchmark functions to instantiate
ChanWrapper via NewChanWrapper and replace direct channel operations with
EnqueueBurst and DequeueBurst calls.
```go
func newMutexSliceQueue[T any](size int) *mutexSliceQueue[T] {
	// size must be power of two for mask
	return &mutexSliceQueue[T]{
		buf:  make([]T, size),
		mask: size - 1,
	}
}
```
Add power-of-two validation in constructor.
The comment states size must be a power of two, but there's no runtime validation. If a non-power-of-two size is passed, the mask-based indexing throughout the code will produce incorrect results.
Apply this diff to add validation:

```diff
 func newMutexSliceQueue[T any](size int) *mutexSliceQueue[T] {
 	// size must be power of two for mask
+	if size <= 0 || (size&(size-1)) != 0 {
+		panic("size must be a positive power of two")
+	}
 	return &mutexSliceQueue[T]{
 		buf:  make([]T, size),
 		mask: size - 1,
 	}
 }
```
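As background on why the power-of-two requirement exists: `head & (size-1)` equals `head % size` only when `size` is a power of two, which is what makes the cheap mask-based wraparound valid. A standalone illustration (not from the PR):

```go
package main

import "fmt"

func main() {
	head := 10
	for _, size := range []int{8, 6} {
		// size=8 (power of two): 10%8 = 2 and 10&7 = 2 agree.
		// size=6: 10%6 = 4 but 10&5 = 0, so the mask form is wrong.
		fmt.Printf("size=%d  head%%size=%d  head&(size-1)=%d\n",
			size, head%size, head&(size-1))
	}
}
```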
🤖 Prompt for AI Agents
In ring/mutex_slice_bench_test.go around lines 21 to 27, the constructor
newMutexSliceQueue allows non-power-of-two sizes which breaks mask-based
indexing; add a runtime validation that size is > 0 and a power of two (e.g.
check size & (size-1) == 0), and if not, panic with a clear message (or return
an error if you change the signature) so callers immediately get a clear failure
instead of subtle indexing bugs.
```go
func BenchmarkMutexSlice_SPSC_Seq_Batch(b *testing.B) {
	for _, batchSize := range batchSizes {
		for _, payloadSize := range payloadSizes {
			b.Run(
				fmt.Sprintf("batch%d_%dB", batchSize, payloadSize),
				func(b *testing.B) {
					q := newMutexSliceQueue[[]byte](ringSize)
					payload := make([][]byte, batchSize)
					for i := range payload {
						payload[i] = make([]byte, payloadSize)
					}

					b.SetBytes(int64(payloadSize))
					b.ResetTimer()

					for i := 0; i < b.N; i++ {
						q.EnqueueBatch(payload)
						_ = q.DequeueBatch(batchSize)
					}
				},
			)
		}
	}
}
```
Correct SetBytes calculation for batch benchmarks.
The `SetBytes` value should account for the total bytes processed per iteration. Each iteration enqueues and dequeues `batchSize` items of `payloadSize` bytes, so the total is `batchSize * payloadSize`, not just `payloadSize` (line 235).

This issue affects throughput reporting in all batch benchmarks (lines 235, 261, 299, 340).

Apply this diff to fix the calculation:

```diff
-				b.SetBytes(int64(payloadSize))
+				b.SetBytes(int64(batchSize * payloadSize))
```

Apply the same fix to lines 261, 299, and 340. Since `b.SetBytes` sets the bytes-per-iteration figure that the benchmark framework uses to report MB/s, undercounting by a factor of `batchSize` makes the batch benchmarks appear proportionally slower than they are.
🤖 Prompt for AI Agents
In ring/mutex_slice_bench_test.go around lines 223-246 (and similarly at lines
~261, ~299, and ~340), the benchmark calls use b.SetBytes(int64(payloadSize))
which only accounts for one item; change SetBytes to account for total bytes
processed per iteration by using b.SetBytes(int64(batchSize * payloadSize))
(apply the same fix at the other reported line ranges so batch benchmarks report
throughput as batchSize * payloadSize).
Summary by CodeRabbit

Chores
- BUILD.bazel files gain explicit importpath attributes, and the ring package's test targets are reorganized.

Tests
- New benchmarks compare the ring buffer against buffered channels and a mutex-protected slice across SPSC, MPSC, SPMC, and MPMC workloads, with single-item and batched variants.