Make SpliceBlobRequest.blob_digest mandatory#356
Conversation
sluongng
left a comment
There was a problem hiding this comment.
cc: @roloffs @aehlig for the JustBuild side since this is technically a breaking change, but we have yet to do a tag release so it should be ok-ish for consideration.
As explained in the other PR, I think SHOULD is still ok here as we do want to give the client the benefit of not having to hash the whole large blob when they don't want to.
Realistically, though, the client should almost always want to hash the large blob to use that hash in FindMissing before chunking that blob for uploads. That could potentially save it a lot of network traffic. So I don't mind too much if we go for a stricter route either.
- If chunking information already exists for the blob, it allows
the server to keep or replace with new chunking information.
how is this OR decided?
@sluongng I think the server should be at liberty to decide whether to keep the previous entry, or replace it. There is a risk of a faulty client poisoning or maliciously populating these. |
Agree. a. Do you think we should add this to the spec to specifically call out that the server MAY decide to keep the previous entry. b. Are the clients interested in the result of this decision? I.e., if the client tried to splice SetA into BigBlob, but the server decided to keep SetB, should we give the client a signal so that they can choose to download and use SetB locally instead? |
What if the server responds with |
Thanks for letting us know. Currently JustBuild always sets the digest, so is not broken by that breaking change. |
There was a problem hiding this comment.
Pull request overview
This PR makes the blob_digest field required in the SpliceBlobRequest message by changing the client requirement from "SHOULD" to "MUST" and documenting additional reasons for this requirement.
Key Changes:
- Updated documentation to require (MUST vs SHOULD) that clients set the
blob_digestfield - Enhanced reason #1 to clarify that early existence checks apply to both the blob and existing chunks
- Added reason #3 explaining how the field enables servers to manage chunking information
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
53a9ea5 to
743040b
Compare
|
cc @EdSchouten @roloffs @mostynb I see you were involved in #282 #337, if you have time to review: hoping to get consensus on this. |
For future reference: we reached consensus to submit this at the working group meeting, and will reevaluate potential use cases that need an optional digest once we have more experience with client/server implementations (it's easier to start out stricter and loosen it later than the other way around). |
For reasons listed in the API, this field should be required