8 changes: 8 additions & 0 deletions vindex/README.md
@@ -37,6 +37,14 @@ This provides two key guarantees:

The result is a system that extends the verifiability of the underlying log to its queries, preserving the end-to-end chain of trust while providing the efficiency modern systems require.

Using pointers as the values in this data structure is an important part of the design:

- Evolution of the value for a key is predictable: it's an append-only data structure
Contributor

I'd argue that predictable evolution isn't a consequence of using pointers; it's due to your "implicit" reduceFn being sortByIndex (really ~`v = append(v, indexOf(x))`).

You could imagine a map of, say, "who are all the CAs which have issued a cert for ?" also having an append-only structure in the leaf, where the values are just the set of Issuers seen in source-log order. That is, there are many reduceFn implementations that are constrained so as to also provide append-only ordering.

Contributor Author

Yeah the same outcome can be achieved in other ways, but I was trying to be concise.

- Values stay small: pointers to the values mean that the index doesn't need to duplicate values

Compare the above against the more powerful, but less efficient, general map (e.g. the [batchmap](https://github.com/google/trillian/tree/master/experimental/batchmap)).
Contributor

Efficiency is probably arguable in some senses:

  • The vindex size is reduced, but clients are forced to make multiple roundtrips: first to the vindex and then to the source log or its mirror for each resource pointer (plus, if the source log is a tlog-tiles log, and the pointers are non-consecutive, then we have a 256x multiplier on bytes retrieved due to entry bundles).
  • A map using an "append-only" constrained reduceFn as in the comment above could potentially result in a more efficient map if you're looking at it from a systemic PoV - potentially it'd eliminate the need for clients to go to the source log at all.

Contributor Author

Efficiency is always arguable; case in point - I tried to keep this short in the hopes of being efficient, but now we're having this conversation 😆

I was trying to keep this short as this is very early on in the doc. Even if the index served the full leaf contents back and we were only interested in wire efficiency, that approach would still be inefficient over time, because every lookup returns stuff you've probably seen before. Of course, that can be mitigated by having some `since` parameter, but... well, I didn't want to get into that before I've introduced MapFn at all.

The vision is that the Verifiable Index will meet 80% of use-cases, and the general map (with general `ReduceFn`s) will be required for more advanced needs.

## Applications

This verifiable map can be applied to any log where users have a need to enumerate all values matching a specific query. For example: