Contributor

@zihaowang678 zihaowang678 commented Jul 28, 2025

Summary
This PR adds support for the ION_ELEMENT_DOM API to the ion-java-benchmark-cli, enabling benchmarking of the IonElement library from ion-element-kotlin alongside the existing STREAMING and DOM APIs.

Changes

  1. API Enum Enhancement (API.java)
  • Added ION_ELEMENT_DOM enum value with documentation
  2. Read Task Implementation (IonMeasurableReadTask.java)
  • Implemented element reading methods for buffer and file operations
  • Added proper resource management and side effect consumer integration
  3. Write Task Implementation (IonMeasurableWriteTask.java)
  • Added generateWriteInstructionsElement() method
  • Integrated element processing with limits and flush period support
  4. Task Routing (MeasurableReadTask.java & MeasurableWriteTask.java)
  • Added ION_ELEMENT_DOM routing in getTask() and setUpTrial() methods
  5. CLI Documentation (Main.java)
  • Updated --api option to include ion_element_dom with format restrictions
  6. Dependencies (pom.xml)
  • Added ion-element (1.3.0) and Kotlin dependencies
  7. Test Coverage (OptionsTest.java)
  • Added comprehensive tests for read/write operations with various configurations
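
For illustration only, the enum addition and the Task Routing change listed above could look roughly like the sketch below. Only the constant names STREAMING, DOM, and ION_ELEMENT_DOM and the CLI values dom and ion_element_dom come from this PR; the class name, the Task interface, parseApi(), and the returned strings are hypothetical stand-ins and not the code in API.java or MeasurableReadTask.java.

```java
import java.util.Locale;

public class ApiRoutingSketch {

    /** Hypothetical mirror of the API enum; only the constant names come from the PR description. */
    enum Api { STREAMING, DOM, ION_ELEMENT_DOM }

    /** Hypothetical stand-in for a measurable task; the real task classes perform actual I/O. */
    interface Task {
        String describe();
    }

    /** Parses a CLI-style value such as "ion_element_dom" into the matching enum constant. */
    static Api parseApi(String cliValue) {
        return Api.valueOf(cliValue.trim().toUpperCase(Locale.ROOT));
    }

    /** Routes each API to its own task and fails loudly if a constant is not handled. */
    static Task getTask(Api api) {
        switch (api) {
            case STREAMING:
                return () -> "streaming read";
            case DOM:
                return () -> "IonValue DOM read";
            case ION_ELEMENT_DOM:
                return () -> "IonElement DOM read";
            default:
                throw new IllegalStateException("Unhandled API: " + api);
        }
    }

    public static void main(String[] args) {
        // Resolve the two values used in the README example and show which task each maps to.
        for (String value : new String[] {"dom", "ion_element_dom"}) {
            System.out.println(value + " -> " + getTask(parseApi(value)).describe());
        }
    }
}
```

The default branch is there so that a future enum constant without a routing case surfaces as an exception rather than silently falling through.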

Performance Analysis
Across three Ion corpus datasets covering both Ion text and Ion binary data, IonElement shows an average improvement of ~37.09% in primary_score compared to IonValue.

@zihaowang678 zihaowang678 marked this pull request as ready for review July 28, 2025 23:35
Contributor

@popematt popematt left a comment

I have a few questions and suggestions, but overall this is looking pretty good.

Comment on lines +120 to +124
```
ion-java-benchmark read --api dom \
--api ion_element_dom \
example.10n
```
Contributor

Just curious: when you run this, what sort of results do you get?

Contributor Author

It runs two APIs in sequence, then gives two stacked statistical summaries at the end, one for each API.

Contributor

Sorry, I should have been more clear. Approximately how much of a performance difference are you getting between the two APIs when you run this command?

Contributor Author

Roughly a 30% improvement in primary_score. This command is only a rough comparison between the two APIs, but the result aligns well with our more extensive experiments run in SampleTime mode across three Ion datasets, where we saw a ~37.09% improvement in primary_score.

* IonElement API. The "loader" is defined as any context that is tied to a single stream.
* @throws IOException if thrown during reading.
*/
abstract void fullyReadElementFromBuffer(SideEffectConsumer consumer) throws IOException;
Contributor

Not a blocker—it would be nice if we could add new APIs to the benchmark CLI without having to add more methods for each one. (Especially since they are unlikely to be applicable for all data formats.) Have you thought about any ways this could be factored to make it more easily extensible?

Contributor Author

You're right, the current approach is a straightforward one but doesn't scale well. A capability-based system where each format declares which APIs it supports might be better. I'd be happy to refactor this in a follow-up if it's worth doing.
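
As a rough idea of what such a capability-based design might look like, here is a minimal sketch in which each format registers one read strategy per API it supports, so adding an API would not require a new abstract method on every format. Everything in it is an assumption made for illustration: the Format and ReadStrategy types, the supports() and strategyFor() methods, and the format names "ion" and "other" are hypothetical and do not exist in the benchmark CLI.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;

public class CapabilitySketch {

    /** Constant names taken from the PR; everything else in this file is hypothetical. */
    enum Api { STREAMING, DOM, ION_ELEMENT_DOM }

    /** Hypothetical per-API read strategy; a real one would consume a buffer or file. */
    interface ReadStrategy {
        String read(String input);
    }

    /** A format declares the APIs it supports by registering one strategy per API. */
    static final class Format {
        private final String name;
        private final Map<Api, ReadStrategy> strategies = new EnumMap<>(Api.class);

        Format(String name) {
            this.name = name;
        }

        /** Registers a strategy for the given API and returns this format for chaining. */
        Format supports(Api api, ReadStrategy strategy) {
            strategies.put(api, strategy);
            return this;
        }

        /** Returns a read-only view of the APIs this format has declared. */
        Set<Api> supportedApis() {
            return Collections.unmodifiableSet(strategies.keySet());
        }

        /** Looks up the strategy for an API, rejecting combinations the format never declared. */
        ReadStrategy strategyFor(Api api) {
            ReadStrategy strategy = strategies.get(api);
            if (strategy == null) {
                throw new IllegalArgumentException(name + " does not support " + api);
            }
            return strategy;
        }
    }

    public static void main(String[] args) {
        // Hypothetical format that supports all three APIs.
        Format ion = new Format("ion")
            .supports(Api.STREAMING, in -> "streamed " + in)
            .supports(Api.DOM, in -> "IonValue " + in)
            .supports(Api.ION_ELEMENT_DOM, in -> "IonElement " + in);
        // Hypothetical format that only supports streaming.
        Format other = new Format("other")
            .supports(Api.STREAMING, in -> "streamed " + in);

        System.out.println(ion.supportedApis());
        System.out.println(other.supportedApis());
        System.out.println(ion.strategyFor(Api.ION_ELEMENT_DOM).read("example.10n"));
    }
}
```

With this shape, an unsupported format and API combination is rejected at lookup time in one place, and the CLI help text for --api could in principle be generated from supportedApis() instead of being maintained by hand.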

zihaowang678 and others added 4 commits July 30, 2025 21:29
Co-authored-by: Matthew Pope <81593196+popematt@users.noreply.github.com>
Co-authored-by: Matthew Pope <81593196+popematt@users.noreply.github.com>
Co-authored-by: Matthew Pope <81593196+popematt@users.noreply.github.com>
Co-authored-by: Matthew Pope <81593196+popematt@users.noreply.github.com>
Contributor

@tgregg tgregg left a comment

It's looking good! Note the failing CI build.

Contributor

@tgregg tgregg left a comment

I'm excited to be able to use this to make the case for adoption of ion-element-kotlin over IonValue.

Once @popematt approves, we can squash/merge.

@zihaowang678 zihaowang678 merged commit cffcef1 into amazon-ion:master Aug 1, 2025
1 check passed
3 participants