[RFC 0195] init: Binary Cache Index Protocol #195
Conversation
roberth left a comment:
A bit rough around the edges (I know it's draft!), but seems like a good starting point. Do you plan to validate this?
> ## 4. Layer 1: Journal (Hot Layer)
>
> The journal captures recent mutations with minimal latency.
roberth:
Not sure about minimal.
This seems to be determined by segment_duration_seconds, and it requires many individual requests to catch up with the log.
Perhaps with HTTP range requests the number of requests could be reduced, turning this into a small number of bulk downloads.
Long polling could be an implementation strategy to make this even more realtime, without the added complexity of a push protocol.
When doing range requests instead of relying on split files, you'd still want a time interval parameter, but instead of journal.segment_duration_seconds it would be journal.segment_query_interval. Set to 0 for long polling.
If it's a dumb bucket, set a high value to reduce unnecessary / inefficient traffic.
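The range-request catch-up idea above can be sketched client-side. Everything here is an assumption for illustration — the `/index/journal` path, the single-file layout, and the server behavior are hypothetical, not taken from the RFC:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchJournalTail requests only the bytes appended since `offset`,
// using an HTTP Range request against an append-only journal file.
// Catch-up becomes one bulk download instead of one request per segment.
func fetchJournalTail(baseURL string, offset int64) ([]byte, int64, error) {
	req, err := http.NewRequest("GET", baseURL+"/index/journal", nil)
	if err != nil {
		return nil, offset, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, offset, err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusRequestedRangeNotSatisfiable {
		// Nothing has been appended past our offset yet.
		return nil, offset, nil
	}
	if resp.StatusCode != http.StatusPartialContent {
		return nil, offset, fmt.Errorf("unexpected status %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, offset, err
	}
	return data, offset + int64(len(data)), nil
}

// demo exercises fetchJournalTail against an in-process test server
// whose journal currently holds "abcdef", resuming from byte offset 2.
func demo() (string, int64, error) {
	journal := []byte("abcdef")
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var off int64
		fmt.Sscanf(r.Header.Get("Range"), "bytes=%d-", &off)
		if off >= int64(len(journal)) {
			w.WriteHeader(http.StatusRequestedRangeNotSatisfiable)
			return
		}
		w.WriteHeader(http.StatusPartialContent)
		w.Write(journal[off:])
	}))
	defer srv.Close()
	data, next, err := fetchJournalTail(srv.URL, 2)
	return string(data), next, err
}
```

Long polling would be the same request shape with the client simply holding the connection open (or retrying) until new bytes appear.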
kalbasit:
You're right that 'minimal' oversells it. I'll soften the language. Regarding optimizations: range requests for catch-up and long polling for real-time are good implementation strategies, but I'm inclined to keep them out of the spec itself since they're optimizations that servers/clients can adopt independently without protocol changes. The protocol just needs to not preclude them. Would a note in the implementation considerations section acknowledging these optimization opportunities be sufficient?
roberth:
I meant to incorporate those HTTP features in order to simplify the protocol.
By appending to a large journal file and relying on this feature, you may both reduce the spec complexity and improve performance, in terms of latency and number of requests.
Unless we have an overriding reason to provide this inefficient multi-file scheme, I think we'd be better off treating an append only log as an append only log at the HTTP level.
kalbasit:
I've been thinking about how to implement a single-file journal over dumb storage (S3 + CDN), and I don't see a clean path. S3 objects are immutable, so "appending" requires re-uploading the entire file. The multi-segment approach lets writers upload small files without touching existing data.
For smart servers (with actual append support or a proxy layer), a single-file journal with range requests would indeed be simpler and more efficient. But I want the baseline protocol to work with just static file hosting.
I see two options:
Option A: Define both modes in the spec
- Add a field to indicate journal mode (`segments` vs `single`)
- Clients implement both code paths
- More flexible, but adds complexity to every client implementation

Option B: Segments as the only mode
- Keep segments as the baseline (works with dumb storage)
- Smart servers like ncps could still optimize internally but serve the segment format for compatibility
- Simpler client implementations
Given that cache.nixos.org (the largest cache) runs on dumb S3, I'm leaning toward Option B. But I'm open to Option A if you think the range-request efficiency is worth the added client complexity.
What's your preference?
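Option B's write path can be sketched to show why it fits immutable object storage. The `Store` interface and the key naming are illustrative assumptions, not part of the RFC:

```go
package main

import "fmt"

// Store models a dumb object store (e.g. S3): objects are immutable,
// so the only way to "append" to the journal is to create new objects.
type Store interface {
	Put(key string, data []byte) error
}

// MemStore is an in-memory stand-in for testing.
type MemStore struct{ Objects map[string][]byte }

func (m *MemStore) Put(key string, data []byte) error {
	if m.Objects == nil {
		m.Objects = map[string][]byte{}
	}
	m.Objects[key] = data
	return nil
}

// publishSegment writes one journal segment as a fresh object keyed by a
// monotonically increasing ID. Existing segments are never touched,
// which is exactly what immutable object storage supports; a single-file
// journal would instead require re-uploading the whole file per append.
func publishSegment(s Store, id uint64, entries []byte) (string, error) {
	key := fmt.Sprintf("index/journal/%016d", id)
	return key, s.Put(key, entries)
}
```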
kalbasit:
@roberth circling back on this, which option should I put in the RFC?
kalbasit:
@roberth what do you think about the options here? I think I'll keep it as-is for now (I will update the RFC with other comments from last round). Let me know if you prefer to alter this section.
> Total: 64 bytes
>
> **Implementation Note**: The header is designed to avoid struct padding issues. All multi-byte integers are little-endian. Implementations in C/Rust should use explicit byte-level serialization or `#pragma pack(1)` / `#[repr(packed)]` to ensure correct layout.
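The byte-level serialization recommended in the implementation note can be sketched briefly. The field offsets are taken from the header layout quoted in this thread; the helper names are illustrative:

```go
package main

import "encoding/binary"

// Field offsets from the discussed header layout (fields before
// offset 18 are omitted here and simply left zeroed in this sketch).
const (
	offSparseIndex   = 18 // sparse index offset, uint64
	offSparseEntries = 26 // sparse index entry count, uint64
	offChecksum      = 34 // XXH64 of encoded data section, uint64
	headerSize       = 64
)

// putHeaderU64 writes a uint64 field at an explicit byte offset in
// little-endian order. Writing into a flat byte slice sidesteps struct
// padding entirely — no #pragma pack / #[repr(packed)] needed.
func putHeaderU64(hdr []byte, off int, v uint64) {
	binary.LittleEndian.PutUint64(hdr[off:off+8], v)
}

func getHeaderU64(hdr []byte, off int) uint64 {
	return binary.LittleEndian.Uint64(hdr[off : off+8])
}
```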
roberth:
Did you not specify big-endian above?
roberth:
Unless there's an overriding reason, pick little endian. Big endian is just not how computers work anymore.
The big-endian choice above had an overriding reason, namely the correspondence between lexicographic sorting and numeric ordering. But that's part of the domain, whereas this here is just a trivial implementation-level detail.
For comparison, we wouldn't pick big/little endian just because, e.g., accounting applications use Arabic numerals, which are big-endian.
kalbasit:
To clarify what might be confusing: the RFC uses both, intentionally:
- Section 2.1 (hash interpretation): Big-endian so that lexicographic string ordering equals numeric ordering - required for prefix-based sharding to work correctly.
- Section 5.1 (header integers): Little-endian for uint64 fields like item count and offsets - just binary serialization matching modern CPUs.
These aren't contradictory; they serve different purposes. I'll add a note to Section 5.1 clarifying why header integers use little-endian while hash interpretation uses big-endian.
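The distinction is easy to demonstrate: big-endian is exactly the byte layout for which lexicographic comparison agrees with numeric comparison. A minimal check, using uint64 as a stand-in for the 160-bit hashes:

```go
package main

import (
	"bytes"
	"encoding/binary"
)

// lexicographicMatchesNumeric reports whether comparing the big-endian
// encodings of a and b as byte strings (what sorted listings and string
// prefixes give you) yields the same result as comparing the numbers.
// For big-endian this is always true; for little-endian it is not
// (e.g. 256 encodes as 00 01 ... which sorts before 1's 01 00 ...).
func lexicographicMatchesNumeric(a, b uint64) bool {
	var ab, bb [8]byte
	binary.BigEndian.PutUint64(ab[:], a)
	binary.BigEndian.PutUint64(bb[:], b)
	lex := bytes.Compare(ab[:], bb[:])
	num := 0
	if a < b {
		num = -1
	} else if a > b {
		num = 1
	}
	return lex == num
}
```

Header fields, by contrast, are read as whole integers and never compared as byte strings, so little-endian costs nothing there.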
roberth:
Interestingly, Nix32 encoding follows the lexicographic sort of the reversed byte string.
So prefix-based sharding for Nix32 paths is - behind the scenes - suffix-based sharding as it relates to native hash bytes and the base-16 encoding.
See the docs PR NixOS/nix#15004 referenced earlier.
I feel uneasy to perpetuate this syntactic quirk.
If I understand correctly, it causes the byte sequences in this spec to be reverse of the native hash bytes. That is very very ugly.
kalbasit:
You're right that this is an unfortunate consequence of Nix's non-standard base32 encoding. The big-endian interpretation in Section 2.1 is required for the protocol to work correctly—it ensures lexicographic string ordering equals numeric ordering, which is essential for prefix-based sharding and delta encoding.
The alternative would be to have the protocol convert Nix32 → native bytes → use native byte order, but this would:
- Add a conversion step on every operation
- Break the correspondence between string prefixes and shard assignment
- Add complexity without functional benefit
I can add a note acknowledging that the 160-bit integers in the index are byte-reversed relative to the native hash representation, but I don't see a way to avoid this without significantly complicating the protocol. Is there a specific problem you foresee this causing?
I considered reversing the string so that shard prefixes correspond to native hash byte prefixes, but this would break the intuitive correspondence between a hash's visible prefix (b6gv...) and its shard location (b6/). Operators debugging cache issues would need to mentally reverse hashes to find the right shard.
The current design's 'ugliness' is confined to the internal byte representation, which most implementers won't encounter directly (they'll use libraries like go-nix). The reverse approach would surface the ugliness to every user interaction.
I'm open to other suggestions, but I think preserving hash_prefix == shard_name is worth the internal byte-order quirk.
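The `hash_prefix == shard_name` correspondence being defended here can be sketched in a few lines. The two-character prefix length is an assumption for illustration; the RFC defines the actual sharding parameters:

```go
package main

// shardFor maps a store path hash (in Nix32 form, as it appears in
// store paths) to its shard name by taking the visible prefix, so an
// operator seeing hash "b6gv..." looks in shard "b6/" directly.
// Note the quirk discussed above: because Nix32 is emitted in reverse
// byte order, these leading characters encode the *last* bytes of the
// native hash value.
func shardFor(nix32Hash string) string {
	if len(nix32Hash) < 2 {
		return nix32Hash
	}
	return nix32Hash[:2]
}
```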
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Refine binary cache index protocol with manifest URL discovery, structured base URLs, zstd compression for shards, and clarified format details.
kalbasit left a comment:
Thanks again for the review. I have addressed all your comments. I wasn't sure about the etiquette regarding resolving threads—should I leave them open for you to resolve if you are satisfied, or should I resolve them? I did go ahead and resolve the obvious code changes since I adopted those directly.
> 18   8   Sparse index offset from start of file (uint64, little-endian)
> 26   8   Sparse index entry count (uint64, little-endian)
> 34   8   XXH64 checksum of encoded data section (uint64, little-endian)
> 42   22  Reserved for future use (must be zeros)
kalbasit:
Good question. The intent is lenient: clients SHOULD ignore the reserved bytes to allow minor, backward-compatible additions without breaking old clients. Breaking changes would bump the magic number (e.g., NIXIDX02) or the manifest version field. I'll clarify this in the spec - something like: 'Clients MUST ignore non-zero values in reserved bytes to allow backward-compatible extensions. Incompatible format changes will use a new magic number.'
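The lenient policy described here can be sketched as a header check. The magic number's position at the start of the header is an assumption for illustration; the reserved-region offsets follow the layout quoted in this thread:

```go
package main

import (
	"bytes"
	"errors"
)

var errUnknownFormat = errors.New("unknown index format")

// checkHeader validates only what gates compatibility: the magic number.
// The reserved region (bytes 42..63 in the quoted layout) is deliberately
// NOT validated, so backward-compatible additive fields written there by
// newer servers don't break older clients. An incompatible revision would
// instead ship a new magic (e.g. NIXIDX02) and fail this check cleanly.
func checkHeader(hdr []byte) error {
	if len(hdr) < 64 {
		return errors.New("header too short")
	}
	if !bytes.HasPrefix(hdr, []byte("NIXIDX01")) {
		return errUnknownFormat
	}
	return nil
}
```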
roberth: I think resolving them is fine. We can always re-open if we feel like this is missing the point.
…dback

Major changes based on RFC review comments:
- Inline manifest into nix-cache-info: Remove separate manifest.json file and embed all index configuration directly in nix-cache-info using Index-prefixed fields. This eliminates an HTTP request and avoids adding another file format.
- Document Nix32 byte order quirk: Add note in Section 2.1 explaining that Nix's base32 encoding processes bytes in reverse order compared to RFC 4648, and recommend using established libraries like go-nix.
- Change journal segment ID to opaque identifier: IndexJournalCurrentSegment is now specified as "opaque monotonically increasing" rather than explicitly a Unix timestamp.
- Remove "Client Implementation Effort" from Drawbacks: This isn't a drawback—it's just how new features work.
- Remove speculative Future Work items: Drop SIMD decoding, GPU acceleration, and flake discovery (already solved via nix-cache-info).
- Update all examples to use nix-cache-info format
- Update algorithm pseudocode to reference cache_info.Index* fields
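The inlined configuration can be read with a small parser for nix-cache-info's `Key: Value` line format. `IndexJournalCurrentSegment` comes from the change above; the other keys in the example are standard nix-cache-info fields, and the exact set of Index* fields is defined by the RFC:

```go
package main

import (
	"bufio"
	"strings"
)

// parseCacheInfo parses the "Key: Value" line format of nix-cache-info
// and returns only the Index-prefixed fields added by this RFC, leaving
// existing fields (StoreDir, WantMassQuery, Priority, ...) untouched.
func parseCacheInfo(body string) map[string]string {
	fields := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		key, value, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue // skip malformed lines
		}
		key = strings.TrimSpace(key)
		if strings.HasPrefix(key, "Index") {
			fields[key] = strings.TrimSpace(value)
		}
	}
	return fields
}
```

Reusing the existing file means clients discover the index with the same single request they already make for cache metadata.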
Most comments are now addressed. Can you give it another review? Let me know if this is ready for next steps.
Rendered
Previous discussions and references:
cc @Mic92 @edef1c @brianmcgee @roberth @zimbatm @Kernald