Merged
Conversation
ea8f353 to
d4064ca
Compare
hoffbrinkle
reviewed
Nov 26, 2025
sluongng
reviewed
Nov 26, 2025
74544c7 to
6d71426
Compare
Split and Splice docs
fmeum
reviewed
Dec 5, 2025
fmeum
approved these changes
Dec 5, 2025
6149e45 to
0d8e725
Compare
0d8e725 to
6810bc3
Compare
sluongng
reviewed
Dec 19, 2025
sluongng
reviewed
Dec 19, 2025
tyler-french
added a commit
to buildbuddy-io/buildbuddy
that referenced
this pull request
Dec 23, 2025
An implementation for CAS/SplitBlob and SpliceBlob is described here: bazelbuild/remote-apis#353

In order to do chunking on a layer above the abstraction of the cache, and store individual chunks in the CAS separately, we need to create a chunked metadata storage of some sort. To keep things simple, this implementation uses the Action Cache for storage of the chunked manifests, and stores them **under the original blob's digest**. This is simpler than using a **derived digest** (i.e. a hash of the digest + metadata) but has the same security. The AC entries are stored under a versioned prefix in the instance name, which means we can change the version to invalidate all cached manifests across all instances.

- `Split` is used to retrieve a chunked manifest for a blob. If the manifest is not found, or any of its chunks are not found, it returns a `NotFound` error.
- `Splice` is used to upsert a chunked manifest. All chunks should be available in the CAS. If any are missing, it returns an `InvalidArgument` error. `Splice` will also return an `InvalidArgument` error if the concatenated chunks do not hash to the original blob digest.

The implementation uses the experiment config so that we can enable this by group or user gradually to evaluate performance. Since AC entries are stored by group, this is safe.

This PR is part of a series for an MVP of CDC. The next step is to implement the `Read`/`Write` ByteStream APIs to read and write using CDC if a blob matches conditions. An example of how this can be used for `ByteStream/Read`: #10997

Follow-ups included in buildbuddy-io/buildbuddy-internal#6426
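The `Split` and `Splice` semantics described in this commit message can be sketched roughly as follows. This is a minimal in-memory illustration, not the BuildBuddy implementation: `ChunkedCache`, `put`, and the two exception classes are hypothetical names, plain dictionaries stand in for the CAS and the Action Cache, and digests are simplified to a SHA-256 hex string (REv2 digests also carry a size).

```python
import hashlib


class NotFound(Exception):
    """Hypothetical stand-in for the gRPC NOT_FOUND status."""


class InvalidArgument(Exception):
    """Hypothetical stand-in for the gRPC INVALID_ARGUMENT status."""


def digest(data: bytes) -> str:
    # Simplification: a real REv2 digest is a (hash, size_bytes) pair.
    return hashlib.sha256(data).hexdigest()


class ChunkedCache:
    """In-memory sketch: a CAS plus a manifest store keyed by blob digest."""

    def __init__(self, manifest_version: str = "v1"):
        self.cas = {}        # chunk digest -> chunk bytes
        self.manifests = {}  # (version, blob digest) -> ordered chunk digests
        # Changing the version makes every previously stored manifest unreachable.
        self.manifest_version = manifest_version

    def put(self, data: bytes) -> str:
        d = digest(data)
        self.cas[d] = data
        return d

    def _key(self, blob_digest: str):
        return (self.manifest_version, blob_digest)

    def splice(self, blob_digest: str, chunk_digests: list) -> None:
        # Every chunk must already be in the CAS.
        missing = [c for c in chunk_digests if c not in self.cas]
        if missing:
            raise InvalidArgument(f"missing chunks: {missing}")
        # The concatenated chunks must hash to the original blob digest.
        h = hashlib.sha256()
        for c in chunk_digests:
            h.update(self.cas[c])
        if h.hexdigest() != blob_digest:
            raise InvalidArgument("chunks do not concatenate to the blob")
        # Upsert the manifest under the original blob's digest.
        self.manifests[self._key(blob_digest)] = list(chunk_digests)

    def split(self, blob_digest: str) -> list:
        manifest = self.manifests.get(self._key(blob_digest))
        if manifest is None:
            raise NotFound("no manifest stored for this blob")
        if any(c not in self.cas for c in manifest):
            raise NotFound("manifest references a chunk missing from the CAS")
        return list(manifest)
```

In this sketch, bumping `manifest_version` plays the role of the versioned instance-name prefix: old manifests stay in storage but are no longer looked up.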
Pull request overview
This PR updates the documentation for the SplitBlob and SpliceBlob APIs in the Remote Execution v2 protocol to clarify their purpose and usage patterns. The changes emphasize that these APIs are primarily for storing and retrieving chunk composition metadata rather than performing the actual splitting and splicing operations on the server.
Key Changes:
- Clarified that `SplitBlob` retrieves stored information about how a blob is chunked rather than performing the split operation
- Updated `SpliceBlob` documentation to emphasize that clients tell the server how chunks compose a blob
- Expanded error conditions for `SplitBlob` to include cases where split information or chunks are missing
215a11c to
b13101c
Compare
b13101c to
42a2081
Compare
tjgq
requested changes
Jan 7, 2026
42a2081 to
0973d8b
Compare
Contributor
Author
@tjgq Do you have time to take another look here also? Thanks!
tjgq
approved these changes
Feb 4, 2026
tjgq
approved these changes
Feb 4, 2026
This PR better aligns the language of the REv2 API to describe how a client should expect to interact with the `Split` and `Splice` APIs.

Generally speaking, when designing a Remote Cache service, the server is not always primarily responsible for splitting and splicing blobs. In fact, the `Split` and `Splice` APIs are extremely helpful from a client's perspective for storing and retrieving the manifest that describes how content-defined chunks compose a blob.

For example, if a client calls `Splice` with a mapping from blob digest `A` to chunks `A1` and `A2`, this instructs the server to store this information. Later, if a client that is not chunking-aware calls `Read` on `A`, the server can use this stored state to compose `A` from `A1` and `A2` stored in the CAS, and serve it to the client.

Similarly, if a user calls `Split` on blob `B` (which could be some Action Result), the server would respond with its stored manifest: `B1` and `B2`. A chunking-aware client can then skip downloading `B1` if it's available locally from some other file's chunks, and download only `B2` without ever needing to download the entirety of `B`.
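The client-side flow in the last example can be sketched as below. This is only an illustration under assumptions: `fetch_blob` is a hypothetical helper, `manifest` stands for the ordered chunk digests a `Split` call would return, `local_chunks` is a local chunk store keyed by digest, and `download` is any callable that fetches a single chunk from the CAS.

```python
import hashlib


def digest(data: bytes) -> str:
    # Simplification: a real REv2 digest is a (hash, size_bytes) pair.
    return hashlib.sha256(data).hexdigest()


def fetch_blob(manifest, local_chunks, download):
    """Reassemble a blob from its manifest, downloading only absent chunks.

    Returns the blob bytes and the list of chunk digests that were downloaded.
    """
    parts = []
    downloaded = []
    for chunk_digest in manifest:
        chunk = local_chunks.get(chunk_digest)
        if chunk is None:
            # Not available locally: fetch just this chunk from the remote CAS.
            chunk = download(chunk_digest)
            local_chunks[chunk_digest] = chunk
            downloaded.append(chunk_digest)
        parts.append(chunk)
    return b"".join(parts), downloaded


# Usage: B1 is already on disk from some other file, B2 is only remote.
b1, b2 = b"shared prefix ", b"new suffix"
remote_cas = {digest(b1): b1, digest(b2): b2}
local_chunks = {digest(b1): b1}
manifest = [digest(b1), digest(b2)]  # what Split on B would return

blob, downloaded = fetch_blob(manifest, local_chunks, remote_cas.__getitem__)
```

Here only `B2` is transferred, and the reassembled bytes equal the full blob `B`.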