Allow swapping collection implementation to persistent collections #455

Open
andrewparmet wants to merge 6 commits into open-toast:main from andrewparmet:enable-persistent-collections

@andrewparmet (Collaborator) commented Feb 12, 2026

Adds opt-in persistent collections backed by kotlinx-collections-immutable. When enabled (via system property protokt.collections.persistent=true on JVM, or env var PROTOKT_COLLECTIONS_PERSISTENT=true on JVM/JS), deserialized repeated and map fields use PersistentList/PersistentMap instead of UnmodifiableList/UnmodifiableMap.
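
As a rough illustration of the opt-in switch on the JVM, the lookup could resemble the sketch below. The function name persistentCollectionsEnabled and the injectable lookup parameters are hypothetical, and exact matching against "true" is an assumption; the PR's actual lookup code is not shown in this conversation.

```kotlin
// Hypothetical helper, not the PR's code: reports whether persistent
// collections were requested via the system property or the env var.
// The lookups are parameters so the logic can be exercised without
// touching real process state.
fun persistentCollectionsEnabled(
    property: (String) -> String? = { System.getProperty(it) },
    env: (String) -> String? = { System.getenv(it) }
): Boolean =
    // Assumption: only the exact string "true" enables the feature.
    property("protokt.collections.persistent") == "true" ||
        env("PROTOKT_COLLECTIONS_PERSISTENT") == "true"
```

With the defaults, running the JVM with -Dprotokt.collections.persistent=true or exporting PROTOKT_COLLECTIONS_PERSISTENT=true would make this return true.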

The key benefit is structural sharing: within builder DSL copy {} blocks, the + operator runs in O(log n) instead of O(n), which makes incremental mutation of pre-populated collections 32x to 150x faster in the copy-append benchmarks below.

This is achieved through a BuilderScope interface with member extension plus operators that shadow stdlib's plus within generated builder classes. When the receiver is a persistent collection, PersistentList.add() / PersistentMap.put() are used for O(log n) structural sharing; otherwise, the stdlib copy behavior is preserved.
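
The shadowing mechanism can be sketched with the standard library alone. In the sketch below, CheapAppendList, CountingList, and MessageBuilder are hypothetical stand-ins (the PR dispatches on PersistentList/PersistentMap from kotlinx-collections-immutable, and its generated builders are not shown here). CountingList copies internally, so it demonstrates only the dispatch, not real structural sharing.

```kotlin
// Hypothetical stand-in for PersistentList: a list with its own append operation.
interface CheapAppendList<T> : List<T> {
    fun added(element: T): CheapAppendList<T>
}

// Toy implementation that counts fast-path appends so the dispatch is observable.
// It copies its backing list, so it has none of the real O(log n) behavior.
class CountingList<T>(
    private val items: List<T>,
    val fastAppends: Int = 0
) : AbstractList<T>(), CheapAppendList<T> {
    override val size: Int get() = items.size
    override fun get(index: Int): T = items[index]
    override fun added(element: T): CountingList<T> =
        CountingList(items.plusElement(element), fastAppends + 1)
}

interface BuilderScope {
    // Member extension: inside implementing classes this is resolved before
    // the top-level stdlib plus, so `list + element` lands here.
    operator fun <T> List<T>.plus(element: T): List<T> =
        when (this) {
            is CheapAppendList<T> -> this.added(element) // fast path
            else -> this.plusElement(element)            // stdlib copy behavior
        }
}

// Hypothetical builder; generated builders would implement BuilderScope similarly.
class MessageBuilder(var tags: List<String>) : BuilderScope {
    fun addTag(tag: String) {
        tags = tags + tag // resolves to BuilderScope's plus, not stdlib's
    }
}
```

The fallback calls plusElement rather than + because + inside BuilderScope would resolve back to the member extension and recurse.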

Benchmarks:

Copy-Append (1000 iterations of msg.copy { field = field + element }, ms/op)

Lists:

  ┌────────────────────────┬───────────────┬──────┬─────────┬──────────────────────┬─────────┐
  │ Dataset                │ protobuf-java │ wire │ protokt │ protokt (persistent) │ speedup │
  ├────────────────────────┼───────────────┼──────┼─────────┼──────────────────────┼─────────┤
  │ Large (pre-populated)  │ 2.83          │ 3.02 │ 1.61    │ 0.034                │ 47x     │
  ├────────────────────────┼───────────────┼──────┼─────────┼──────────────────────┼─────────┤
  │ Medium (pre-populated) │ 0.28          │ 0.92 │ 1.33    │ 0.042                │ 32x     │
  ├────────────────────────┼───────────────┼──────┼─────────┼──────────────────────┼─────────┤
  │ Small (from empty)     │ 0.28          │ 0.92 │ 1.39    │ 1.39                 │ ~1x     │
  └────────────────────────┴───────────────┴──────┴─────────┴──────────────────────┴─────────┘

Maps:

  ┌────────────────────────┬───────────────┬──────┬─────────┬──────────────────────┬─────────┐
  │ Dataset                │ protobuf-java │ wire │ protokt │ protokt (persistent) │ speedup │
  ├────────────────────────┼───────────────┼──────┼─────────┼──────────────────────┼─────────┤
  │ Large (pre-populated)  │ 22.8          │ 26.8 │ 26.7    │ 0.178                │ 150x    │
  ├────────────────────────┼───────────────┼──────┼─────────┼──────────────────────┼─────────┤
  │ Medium (pre-populated) │ 17.1          │ 20.9 │ 21.8    │ 0.173                │ 126x    │
  ├────────────────────────┼───────────────┼──────┼─────────┼──────────────────────┼─────────┤
  │ Small (from empty)     │ 17.0          │ 21.7 │ 21.6    │ 19.2                 │ ~1x     │
  └────────────────────────┴───────────────┴──────┴─────────┴──────────────────────┴─────────┘

Serialize/Deserialize (ms/op):

  ┌────────────────────┬───────────────┬────────┬─────────┬──────────────────────┬───────┐
  │ Benchmark          │ protobuf-java │ wire   │ protokt │ protokt (persistent) │ delta │
  ├────────────────────┼───────────────┼────────┼─────────┼──────────────────────┼───────┤
  │ deserialize large  │ 1438          │ 898    │ 771     │ 825                  │ +7%   │
  ├────────────────────┼───────────────┼────────┼─────────┼──────────────────────┼───────┤
  │ deserialize medium │ 3.36          │ 3.15   │ 2.28    │ 2.40                 │ +5%   │
  ├────────────────────┼───────────────┼────────┼─────────┼──────────────────────┼───────┤
  │ deserialize small  │ 0.0053        │ 0.0079 │ 0.0040  │ 0.0039               │ ~0%   │
  ├────────────────────┼───────────────┼────────┼─────────┼──────────────────────┼───────┤
  │ serialize large    │ 1221          │ 1466   │ 1263    │ 1359                 │ +8%   │
  ├────────────────────┼───────────────┼────────┼─────────┼──────────────────────┼───────┤
  │ serialize medium   │ 0.92          │ 1.16   │ 1.00    │ 1.02                 │ ~0%   │
  ├────────────────────┼───────────────┼────────┼─────────┼──────────────────────┼───────┤
  │ serialize small    │ 0.0019        │ 0.0067 │ 0.0027  │ 0.0024               │ ~0%   │
  └────────────────────┴───────────────┴────────┴─────────┴──────────────────────┴───────┘

Deserialization of medium and large messages costs ~5-7% more with persistent collections due to PersistentList.builder() overhead. Serialization is ~8% slower for large messages and roughly unchanged for small and medium ones. "Small" copy-append shows no improvement because the initial collection is emptyList()/emptyMap(), not a persistent type, so + falls back to the stdlib copy path.

@andrewparmet changed the title from "first cut of persistent collections" to "Allow swapping collection implementation to persistent collections" Feb 13, 2026
@andrewparmet marked this pull request as ready for review February 13, 2026 04:06
defaultProtoc()

configure<JavaApplication> {
    mainClass.set("com.toasttab.protokt.benchmarks.ProtobufBenchmarksKt")
andrewparmet (Collaborator, Author) commented:
Benchmarks were broken.

Comment on lines 54 to 57
map.isEmpty() -> emptyMap()
map is UnmodifiableMap -> map
usePersistentCollections && PersistentCollections.isFrozenMap(map) -> map
else -> UnmodifiableMap(LinkedHashMap(map))
andrewparmet (Collaborator, Author) commented:
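Lazy comparison (usePersistentCollections is checked first, so PersistentCollections.isFrozenMap is only reached when the feature is enabled) allows a compileOnly dependency on the immutable collections runtime here.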
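
The guard order in the excerpt can be illustrated with a stdlib-only sketch. OptionalHelper is a hypothetical stand-in for PersistentCollections, Collections.unmodifiableMap stands in for protokt's UnmodifiableMap, and the flag is passed as a parameter; none of this is the PR's actual code. The point is only that && short-circuits, so the helper object is never initialized unless the flag is set.

```kotlin
import java.util.Collections

// Flipped when OptionalHelper is first touched, to make initialization observable.
var helperInitialized = false

// Hypothetical stand-in for a helper that references an optional dependency.
object OptionalHelper {
    init {
        helperInitialized = true
    }

    // Placeholder check; a real helper would test for a persistent map type.
    fun isFrozenMap(map: Map<*, *>): Boolean = false
}

fun <K, V> freezeMap(map: Map<K, V>, usePersistentCollections: Boolean): Map<K, V> =
    when {
        map.isEmpty() -> emptyMap()
        // && evaluates the right side only when the flag is true, so
        // OptionalHelper stays untouched while the feature is disabled.
        usePersistentCollections && OptionalHelper.isFrozenMap(map) -> map
        else -> Collections.unmodifiableMap(LinkedHashMap(map))
    }
```

If the branches were reordered so that the helper ran before the flag check, the optional dependency would presumably be needed at runtime even with the feature off.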