Conversation
alanshaw
left a comment
There was a problem hiding this comment.
LGTM - I left some suggestions.
src/hashes/interface.ts
Outdated
| digest(): Digest | ||
|
|
||
| /** | ||
| * Computes the digest of the given input and writes it into the provided |
There was a problem hiding this comment.
of the given input
Do you mean "of the bytes written so far"?
src/hashes/interface.ts
Outdated
| * can be use to control whether multihash prefix is written, if `false` | ||
| * only the raw digest writtend omitting the prefix. | ||
| */ | ||
| digestInto(output: Uint8Array, offset?: number, asMultihash?: boolean): this |
There was a problem hiding this comment.
Return output? Not sure how useful it is to chain this method.
| digestInto(output: Uint8Array, offset?: number, asMultihash?: boolean): this | |
| digestInto(output: Uint8Array, offset?: number, asMultihash?: boolean): Uint8Array |
There was a problem hiding this comment.
Not sure if there's a better convention for method name when receiving a parameter to mutate. digestInto is a bit awkwardly named (IMO!).
digestBYOB? 🙄
There was a problem hiding this comment.
asMultihash - do we need? The point of this library is multiformats.
There was a problem hiding this comment.
I went into this in the linked issue. In practice I have encountered many instances where I do need to leave out the prefix, I could go and use another library in those instances, but seems like we could just expose this. That sayid I agree that this method is kind of meh and we could do better.
@vasco-santos suggested having a whole another method, personally I'm wondering if perhaps we should have methods just to get digest without a prefix as this is low level API anyway ?
| digestInto(output: Uint8Array, offset?: number, asMultihash?: boolean): this | |
| // multihash prefix for it | |
| header: Uint8Array | |
| // only writes the digest without a prefix | |
| digestInto(output: Uint8Array, offset?: number): this |
Or alternatively something like this
| digestInto(output: Uint8Array, offset?: number, asMultihash?: boolean): this | |
| encodeMultihashInto(target: Uint8Array, offset?: number): this | |
| encodeMultihashHeaderInto(target: Uint8Array, offset?: number): this | |
| encodeDigestInto(target: Uint8Array, offset?: number): thsi |
There was a problem hiding this comment.
digestIntois a bit awkwardly named (IMO!).
I'm happy to call this something, else I was trying to align with varint.encodeInto, which as it turns out was called varint.encodeTo 😅
digestBYOB? 🙄
Works for me although I had to google to figure out what BYOB stand for.
Alternatively we could just have read(bytes: Uint8Array, offfset?: number): this
There was a problem hiding this comment.
Return output? Not sure how useful it is to chain this method.
I personally find mutations that return values misleading, that said I'm amendable to the idea
There was a problem hiding this comment.
I had to google to figure out what BYOB stand for.
BYOB was aiming for familiarity with https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamBYOBReader etc. but I guess that was a miss 😆
| /** | ||
| * Number of bytes that were consumed. | ||
| */ | ||
| count(): bigint |
There was a problem hiding this comment.
I mean if someone is hashing >9PiB of data in JS then 👏👏👏.
src/hashes/interface.ts
Outdated
| digest(): Digest | ||
|
|
||
| /** | ||
| * Computes the digest of the given input and writes it into the provided |
There was a problem hiding this comment.
| * Computes the digest of the given input and writes it into the provided | |
| * Computes the digest of the bytes written so far and writes it into the provided |
|
I'm realizing now that interface here is intentionally non-destructive, as in I could compute digest over and over. Unfortunately node crypto APIs are destructive though
Instead node provides Given this the case with node, proposed API seems impractical, perhaps instead we could also introduce same constraint and On the other hand copy on digest maybe negligible overhead, in which case API without copy would be nicer. |
| /** | ||
| * Writes bytes to be digested. | ||
| */ | ||
| write(bytes: Uint8Array): this |
There was a problem hiding this comment.
Typically in streaming hashers this is called update.
There was a problem hiding this comment.
I'm fine with calling it update although I do find that name confusing personally as I think of update as overwrite as opposed to append.
There was a problem hiding this comment.
Right, but streaming hashers aren't appending to a buffer they are updating their internal state with the new data you pass.
|
What about |
|
I'd really like this to land! I find myself wanting to do this more and more... |
| * | ||
| * @param [offset=0] - Byte offset in the `target`. | ||
| */ | ||
| readDigest(target: Uint8Array, offset?: number): this |
There was a problem hiding this comment.
can you describe the use-case for this? it seems like this makes it an onerous API to have to implement
There was a problem hiding this comment.
Sorry, my mistake, this is the output function!
I think maybe the naming could be better here. We have ample precedent of digest() in JS-land, so we could have digest() and multihash() (or multihashDigest() if you want to be more explicit). In Go-land Sum() is the standard for this action, which has grown on me to make sense (though it's taken time!).
There was a problem hiding this comment.
Oh, I also see I'm discussing history here - read* being the new versions? I'm not a fan. I also wonder whether we could have nicer APIs that don't require you to pass in a target? I understand that's an important part of this, for efficiency, but casual use typically just wants that done for you. So could the APIs take a target? instead and always return Uint8Array? So you can either choose to supply the bytes to write in to (with optional offset) or not supply one, but either way you get back some bytes.
There was a problem hiding this comment.
read* being the new versions? I'm not a fan.
I mean if you think of it as a transform stream, it makes sense to have write and read ops. I don't mind renaming it to something else, but please don't make me come up with a name that everyone will like.
I also wonder whether we could have nicer APIs that don't require you to pass in a target? I understand that's an important part of this, for efficiency, but casual use typically just wants that done for you. So could the APIs take a target? instead and always return Uint8Array? So you can either choose to supply the bytes to write in to (with optional offset) or not supply one, but either way you get back some bytes.
I'm not completely opposed to returning back the target, however I would caution against it as it mixes two very different modes into one and can also lead to mistakes (e.g. you may have passed undefined reference which will no through but happily give you back Uint8Array)
Idea was that if you want to compute digest you just call digest method and use this only in those rare cases when you need to work with slabs of memory.
|
For what it's worth Rust Multihash hasher has a similar interface to one proposed here /// Trait implemented by a hash function implementation.
pub trait Hasher {
/// Consume input and update internal state.
fn update(&mut self, input: &[u8]);
/// Returns the final digest.
fn finalize(&mut self) -> &[u8];
/// Reset the internal hasher state.
fn reset(&mut self);
} |
| /** | ||
| * Number of bytes that were consumed. | ||
| */ | ||
| count(): bigint |
There was a problem hiding this comment.
| /** | |
| * Number of bytes that were consumed. | |
| */ | |
| count(): bigint |
Let's just drop this method, we can revisit if we find it really necessary.
Proposal for the #260