Conversation

@roberth roberth left a comment

A bit rough around the edges (I know it's a draft!), but it seems like a good starting point. Do you plan to validate this?


## 4. Layer 1: Journal (Hot Layer)

The journal captures recent mutations with minimal latency.

Member

Not sure about minimal.

This seems to be determined by segment_duration_seconds, and it requires many individual requests to catch up with the log.

Perhaps with HTTP range requests the number of requests could be reduced, turning this into a small number of bulk downloads.

Long polling could be an implementation strategy to make this even more realtime, without the added complexity of a push protocol.
When doing range requests instead of relying on split files, you'd still want a time interval parameter, but instead of journal.segment_duration_seconds it would be journal.segment_query_interval. Set to 0 for long polling.
If it's a dumb bucket, set a high value to reduce unnecessary / inefficient traffic.
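
To make the suggestion concrete, here is a minimal Go sketch (not part of the RFC) of range-request catch-up against a hypothetical single append-only journal file; the client only has to remember its last byte offset:

```go
package indexclient

import (
	"fmt"
	"io"
	"net/http"
)

// fetchJournalTail asks the server for everything appended to the journal
// since the client's last known byte offset, using an HTTP Range request.
// It returns the new bytes and the updated offset.
func fetchJournalTail(journalURL string, offset int64) ([]byte, int64, error) {
	req, err := http.NewRequest(http.MethodGet, journalURL, nil)
	if err != nil {
		return nil, offset, err
	}
	// Open-ended range: "give me everything from offset to the current end".
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, offset, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusPartialContent: // 206: new data has been appended
		data, err := io.ReadAll(resp.Body)
		if err != nil {
			return nil, offset, err
		}
		return data, offset + int64(len(data)), nil
	case http.StatusRequestedRangeNotSatisfiable: // 416: nothing new yet
		return nil, offset, nil
	default:
		return nil, offset, fmt.Errorf("unexpected response: %s", resp.Status)
	}
}
```

A server or CDN that ignores Range would answer 200 with the full body, which a client could detect and treat as a full re-download.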

Member Author

You're right that 'minimal' oversells it. I'll soften the language. Regarding optimizations: range requests for catch-up and long polling for real-time are good implementation strategies, but I'm inclined to keep them out of the spec itself since they're optimizations that servers/clients can adopt independently without protocol changes. The protocol just needs to not preclude them. Would a note in the implementation considerations section acknowledging these optimization opportunities be sufficient?

Member

I meant to incorporate those HTTP features in order to simplify the protocol.
By appending to a large journal file and relying on this feature, you may both reduce the spec complexity and improve performance, in terms of latency and number of requests.

Unless we have an overriding reason to provide this inefficient multi-file scheme, I think we'd be better off treating an append-only log as an append-only log at the HTTP level.

Member Author

I've been thinking about how to implement a single-file journal over dumb storage (S3 + CDN), and I don't see a clean path. S3 objects are immutable, so "appending" requires re-uploading the entire file. The multi-segment approach lets writers upload small files without touching existing data.

For smart servers (with actual append support or a proxy layer), a single-file journal with range requests would indeed be simpler and more efficient. But I want the baseline protocol to work with just static file hosting.

I see two options:

Option A: Define both modes in the spec

  • Add a field to indicate journal mode (segments vs single)
  • Clients implement both code paths
  • More flexible, but adds complexity to every client implementation

Option B: Segments as the only mode

  • Keep segments as the baseline (works with dumb storage)
  • Smart servers like ncps could still optimize internally but serve the segment format for compatibility
  • Simpler client implementations (a catch-up loop along these lines is sketched after this comment)

Given that cache.nixos.org (the largest cache) runs on dumb S3, I'm leaning toward Option B. But I'm open to Option A if you think the range-request efficiency is worth the added client complexity.

What's your preference?
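
For concreteness, a rough Go sketch of an Option B client catch-up loop; the `journal/<id>` URL layout and consecutive integer segment IDs are assumptions made for illustration, not something the RFC prescribes:

```go
package indexclient

import (
	"fmt"
	"io"
	"net/http"
)

// catchUpSegments fetches every journal segment newer than lastSeen, one small
// immutable object per segment -- the access pattern a dumb bucket (S3 + CDN)
// can serve without any append support.
func catchUpSegments(baseURL string, lastSeen, current uint64) ([][]byte, error) {
	var segments [][]byte
	for id := lastSeen + 1; id <= current; id++ {
		url := fmt.Sprintf("%s/journal/%d", baseURL, id)
		resp, err := http.Get(url)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusOK {
			resp.Body.Close()
			return nil, fmt.Errorf("segment %d: %s", id, resp.Status)
		}
		data, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		segments = append(segments, data)
	}
	return segments, nil
}
```

A smarter server could still layer range requests or long polling on top of this baseline, as discussed above, without changing what dumb-storage clients do.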

Member Author

@roberth circling back on this, which option should I put in the RFC?

Member Author

@roberth what do you think about the options here? I think I'll keep it as-is for now (I will update the RFC with other comments from last round). Let me know if you prefer to alter this section.

```
Total: 64 bytes
```

**Implementation Note**: The header is designed to avoid struct padding issues. All multi-byte integers are little-endian. Implementations in C/Rust should use explicit byte-level serialization or `#pragma pack(1)` / `#[repr(packed)]` to ensure correct layout.
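
As a non-normative illustration of that note, a Go sketch of explicit byte-level little-endian serialization; the field and its offset here are placeholders, not taken from the spec:

```go
package indexwriter

import "encoding/binary"

// putHeaderField writes one little-endian uint64 into a fixed 64-byte header
// buffer at the given byte offset -- explicit serialization instead of casting
// a packed struct, so the layout never depends on compiler padding rules.
func putHeaderField(header []byte, offset int, value uint64) {
	binary.LittleEndian.PutUint64(header[offset:offset+8], value)
}

func buildHeader(itemCount uint64) []byte {
	header := make([]byte, 64)            // fixed 64-byte header, zero-initialized
	putHeaderField(header, 10, itemCount) // hypothetical offset, for illustration only
	return header
}
```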

Member

Did you not specify big-endian above?

Member

Unless there's an overriding reason, pick little endian. Big endian is just not how computers work anymore.

The big-endian choice above had an overriding reason, namely the correspondence between lexicographic sorting and numeric ordering. But that's part of the domain, whereas this here is just a trivial implementation-level detail.
For comparison, we wouldn't pick big or little endian just because, e.g., accounting applications use Arabic numerals, which are big-endian.

Member Author

To clarify what might be confusing: the RFC uses both, intentionally:

  • Section 2.1 (hash interpretation): Big-endian so that lexicographic string ordering equals numeric ordering - required for prefix-based sharding to work correctly.
  • Section 5.1 (header integers): Little-endian for uint64 fields like item count and offsets - just binary serialization matching modern CPUs.

These aren't contradictory; they serve different purposes. I'll add a note to Section 5.1 clarifying why header integers use little-endian while hash interpretation uses big-endian.
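
A small, self-contained Go demonstration (not from the RFC) of the property that motivates the big-endian choice in Section 2.1: for equal-length big-endian byte strings, lexicographic order and numeric order agree.

```go
package main

import (
	"bytes"
	"fmt"
	"math/big"
)

func main() {
	a := []byte{0x01, 0xff} // 0x01ff = 511 when read big-endian
	b := []byte{0x02, 0x00} // 0x0200 = 512 when read big-endian

	lex := bytes.Compare(a, b)                                    // -1: a sorts first lexicographically
	num := new(big.Int).SetBytes(a).Cmp(new(big.Int).SetBytes(b)) // -1: 511 < 512

	// Both comparisons agree, which is what prefix-based sharding relies on.
	// Read little-endian, the same bytes would be 0xff01 = 65281 and 0x0002 = 2,
	// and the two orders would disagree.
	fmt.Println(lex, num)
}
```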

Member

Interestingly, Nix32 encoding follows the lexicographic sort of the reversed byte string.
So prefix-based sharding for Nix32 paths is - behind the scenes - suffix-based sharding as it relates to native hash bytes and the base-16 encoding.
See the docs PR NixOS/nix#15004 referenced earlier.

I feel uneasy about perpetuating this syntactic quirk.
If I understand correctly, it causes the byte sequences in this spec to be the reverse of the native hash bytes. That is very, very ugly.

Member Author

You're right that this is an unfortunate consequence of Nix's non-standard base32 encoding. The big-endian interpretation in Section 2.1 is required for the protocol to work correctly—it ensures lexicographic string ordering equals numeric ordering, which is essential for prefix-based sharding and delta encoding.
The alternative would be to have the protocol convert Nix32 → native bytes → use native byte order, but this would:

  1. Add a conversion step on every operation
  2. Break the correspondence between string prefixes and shard assignment
  3. Add complexity without functional benefit

I can add a note acknowledging that the 160-bit integers in the index are byte-reversed relative to the native hash representation, but I don't see a way to avoid this without significantly complicating the protocol. Is there a specific problem you foresee this causing?

I considered reversing the string so that shard prefixes correspond to native hash byte prefixes, but this would break the intuitive correspondence between a hash's visible prefix (b6gv...) and its shard location (b6/). Operators debugging cache issues would need to mentally reverse hashes to find the right shard.

The current design's 'ugliness' is confined to the internal byte representation, which most implementers won't encounter directly (they'll use libraries like go-nix). The reverse approach would surface the ugliness to every user interaction.

I'm open to other suggestions, but I think preserving hash_prefix == shard_name is worth the internal byte-order quirk.
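
For illustration only (the two-character prefix length is an assumption), the user-facing property being preserved is simply:

```go
package main

import "fmt"

// shardFor maps a store path hash to its shard directory using the hash's
// visible Nix32 prefix, so the shard an operator sees in a URL matches what
// they see at the start of the store path itself.
func shardFor(nix32Hash string) string {
	return nix32Hash[:2] + "/"
}

func main() {
	fmt.Println(shardFor("b6gv...")) // prints "b6/"
}
```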

kalbasit and others added 2 commits January 15, 2026 15:56
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
@kalbasit
Member Author

Thank you @kevincox, @Mic92, @roberth for your quick review and suggestions. I have resolved some of your comments and I will attend to the rest in a few hours after work. Thank you again 🙏🏼

Refine binary cache index protocol with manifest URL discovery, structured base URLs, zstd compression for shards, and clarified format details.

@kalbasit kalbasit left a comment

Thanks again for the review. I have addressed all your comments. I wasn't sure about the etiquette regarding resolving threads—should I leave them open for you to resolve if you are satisfied, or should I resolve them? I did go ahead and resolve the obvious code changes since I adopted those directly.


```
Offset  Size  Description
18      8     Sparse index offset from start of file (uint64, little-endian)
26      8     Sparse index entry count (uint64, little-endian)
34      8     XXH64 checksum of encoded data section (uint64, little-endian)
42      22    Reserved for future use (must be zeros)
```

Member Author

Good question. The intent is lenient: clients SHOULD ignore the reserved bytes to allow minor, backward-compatible additions without breaking old clients. Breaking changes would bump the magic number (e.g., NIXIDX02) or the manifest version field. I'll clarify this in the spec - something like: 'Clients MUST ignore non-zero values in reserved bytes to allow backward-compatible extensions. Incompatible format changes will use a new magic number.'
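
A non-normative Go sketch of that lenient policy, assuming the magic occupies the first 8 header bytes with `NIXIDX01` as the current value, and using the little-endian fields quoted above:

```go
package indexclient

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// parseHeaderTail validates the magic and decodes the little-endian fields
// quoted above; bytes 42-63 are reserved and deliberately not inspected, so a
// future backward-compatible extension cannot break this client.
func parseHeaderTail(h []byte) (sparseOffset, sparseCount, checksum uint64, err error) {
	if len(h) < 64 {
		return 0, 0, 0, fmt.Errorf("short header: %d bytes", len(h))
	}
	if !bytes.HasPrefix(h, []byte("NIXIDX01")) { // assumed current magic
		// An unknown magic (e.g. a future "NIXIDX02") signals an incompatible
		// format; reject rather than guess.
		return 0, 0, 0, fmt.Errorf("unsupported index format: %q", h[:8])
	}
	sparseOffset = binary.LittleEndian.Uint64(h[18:26]) // sparse index offset
	sparseCount = binary.LittleEndian.Uint64(h[26:34])  // sparse index entry count
	checksum = binary.LittleEndian.Uint64(h[34:42])     // XXH64 of encoded data
	return sparseOffset, sparseCount, checksum, nil
}
```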

Mic92 commented Jan 19, 2026

> Thanks again for the review. I have addressed all your comments. I wasn't sure about the etiquette regarding resolving threads—should I leave them open for you to resolve if you are satisfied, or should I resolve them? I did go ahead and resolve the obvious code changes since I adopted those directly.

I think resolving them is fine. We can always re-open if we feel like this is missing the point.

…dback

Major changes based on RFC review comments:

- Inline manifest into nix-cache-info: Remove separate manifest.json file
  and embed all index configuration directly in nix-cache-info using
  Index-prefixed fields. This eliminates an HTTP request and avoids
  adding another file format (see the illustrative example after this commit message).

- Document Nix32 byte order quirk: Add note in Section 2.1 explaining
  that Nix's base32 encoding processes bytes in reverse order compared
  to RFC4648, and recommend using established libraries like go-nix.

- Change journal segment ID to opaque identifier: IndexJournalCurrentSegment
  is now specified as "opaque monotonically increasing" rather than
  explicitly a Unix timestamp.

- Remove "Client Implementation Effort" from Drawbacks: This isn't a
  drawback—it's just how new features work.

- Remove speculative Future Work items: Drop SIMD decoding, GPU
  acceleration, and flake discovery (already solved via nix-cache-info).

- Update all examples to use nix-cache-info format

- Update algorithm pseudocode to reference cache_info.Index* fields
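
As a purely illustrative example of the inlined configuration, an extended nix-cache-info might look like the following; apart from IndexJournalCurrentSegment, which the commit message names, the Index-prefixed fields and all values shown are hypothetical:

```
StoreDir: /nix/store
WantMassQuery: 1
Priority: 40
IndexJournalCurrentSegment: 01J9ZK3WQ6
IndexShardCompression: zstd
IndexBaseURL: https://cache.example.org/index/
```
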
@kalbasit
Member Author

Most comments are now addressed; can you give it another review? Let me know if this is ready for next steps.
