ZJIT: Diff Tool #822

@jacob-shops

Description

Currently, we measure changes to ZJIT with ruby-bench and collect stats using the --zjit-stats flag.

Current Status

The workflow is roughly the same each time, but there is no defined process. It usually looks something like this:

  1. Build ruby (and chruby to this version)
  2. Run stats and save them off (with something like WARMUP_ITRS=2 MIN_BENCH_ITRS=10 MIN_BENCH_TIME=0 ruby --zjit-stats benchmarks/lobsters/benchmark.rb 2> before.txt)
  3. Make changes to ZJIT
  4. Run stats again but save as after.txt
  5. Manually diff the results and paste the diff into your PR description.
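
Even a thin wrapper could remove most of the copy-and-paste here. A rough sketch (the script name is hypothetical; the benchmark path and iteration settings are copied from step 2, and the stats report goes to stderr, hence the 2> redirect above):

```ruby
#!/usr/bin/env ruby
# capture_zjit_stats.rb (hypothetical): wraps steps 2, 4, and 5 above.
# Usage: ruby capture_zjit_stats.rb before   # then rebuild ruby, then:
#        ruby capture_zjit_stats.rb after
label = ARGV.fetch(0) { abort("usage: capture_zjit_stats.rb before|after") }
out   = "#{label}.txt"

env = { "WARMUP_ITRS" => "2", "MIN_BENCH_ITRS" => "10", "MIN_BENCH_TIME" => "0" }

# The stats report is written to stderr, so redirect stderr into the capture file.
ok = system(env, "ruby", "--zjit-stats", "benchmarks/lobsters/benchmark.rb", err: out)
abort("benchmark run failed; see #{out}") unless ok

# Once both snapshots exist, print the diff from step 5.
if File.exist?("before.txt") && File.exist?("after.txt")
  system("diff", "-u", "before.txt", "after.txt")
end
```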

Problems

  • Manual time and effort are required for each revision to ZJIT, and we could probably eliminate most of it.
  • It's an undocumented, non-obvious workflow for new team members and external contributors.
  • It fills PR descriptions with output from the author that would be clearer as a comment from a CI bot.
  • It's difficult to produce clean diffs with the current output format, both because the JIT is non-deterministic and because optimizations to one part of the JIT usually have subtle effects on other components.
  • We don't have a good way to measure clear differences between snapshots of ZJIT over time, and we probably shouldn't modify nightly builds just to satisfy curiosity about the difference between two versions. Such statistics would be useful for planning (we could easily measure which changes had the biggest impact and correct any assumptions that turned out to be wrong), and they would be great for presentations.
  • As the JIT becomes more complicated, we may want something more sophisticated to measure changes in performance and filter out noise. Currently, we don't even run a simple t-test, which could serve as the basis for a statistical diff; that is what we need here more than a code diff.
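
For concreteness, the statistical core is small. A minimal sketch of Welch's t-test over per-iteration times from a before and an after build (the sample numbers are made up, and a real version would also compute degrees of freedom and a p-value):

```ruby
# Welch's t-test sketch: compares mean iteration time between two builds.
# The sample data below is invented purely for illustration.
def mean(xs) = xs.sum(0.0) / xs.size

def variance(xs)
  m = mean(xs)
  xs.sum(0.0) { |x| (x - m)**2 } / (xs.size - 1)
end

def welch_t(a, b)
  (mean(a) - mean(b)) / Math.sqrt(variance(a) / a.size + variance(b) / b.size)
end

before_ms = [102.1, 101.8, 103.0, 102.5, 101.9, 102.7, 102.2, 103.1, 101.6, 102.4]
after_ms  = [ 99.4,  99.9,  98.7,  99.1, 100.2,  99.5,  98.9,  99.8,  99.3,  99.6]

t = welch_t(before_ms, after_ms)
puts "t = #{t.round(2)}"  # |t| well above ~2 suggests the change is real, not noise
```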

Ideal Outcome

We have a tool that can ingest two snapshots of ZJIT in time, perhaps identified by git hashes. The tool builds ZJIT at each revision, runs some configurable benchmark with sane defaults, and captures all results, not just the worst offenders. It then applies a statistical test so that only statistically significant changes are reported.
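
One possible shape, purely as a sketch: the tool name, the use of git worktrees, and the build invocation are all assumptions here, not an agreed design.

```ruby
#!/usr/bin/env ruby
# zjit-diff (hypothetical): build ZJIT at two revisions, capture stats for a
# configurable benchmark, and hand the two snapshots to a comparison step.
base  = ARGV.fetch(0, "master")
head  = ARGV.fetch(1, "HEAD")
bench = File.expand_path(ENV.fetch("BENCHMARK", "benchmarks/lobsters/benchmark.rb"))

def snapshot(rev, bench, out)
  out = File.expand_path(out)
  dir = "zjit-diff-#{rev.tr('/', '-')}"
  system("git", "worktree", "add", "--detach", dir, rev, exception: true)
  Dir.chdir(dir) do
    # Build ruby with ZJIT at this revision (build details elided/assumed).
    system("./autogen.sh && ./configure --enable-zjit && make -j", exception: true)
    # Same defaults as the manual workflow; stats are written to stderr.
    system({ "WARMUP_ITRS" => "2", "MIN_BENCH_ITRS" => "10", "MIN_BENCH_TIME" => "0" },
           "./ruby", "--zjit-stats", bench, err: out, exception: true)
  end
end

snapshot(base, bench, "before.txt")
snapshot(head, bench, "after.txt")

# Placeholder comparison: a plain diff for now, to be replaced by a statistical
# test that reports only significant changes.
system("diff", "-u", "before.txt", "after.txt")
```

Usage would be something like `zjit-diff master my-branch`, with the benchmark overridable via BENCHMARK.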

This tool could be used during development: instead of having to remember to capture stats before making a ZJIT modification, you could run a simple, easy-to-remember command afterwards as a gut check that your change worked as intended. The tool could also run in CI, diffing master against the current PR and posting the statistical changes as a comment right under the author's description.

Plan

While all of these things would be nice to have, this epic is lower priority than simply improving the JIT. For this reason, we will split it into many small, tractable tickets that give us useful intermediate results along the way to the overall goal.

Philosophy: sketch out the minimal workable solution that can be fleshed out without refactoring. Do the easiest, dumbest implementation of each component, and put in more work as needed. (i.e. version 1 stubs out each component with a print statement telling you which commands to run, and everything is manual. The only innovation is how the problem is framed and the scaffolding it creates to build on top of.)
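
To make that concrete, the version-1 stub described above could be as small as this (entirely hypothetical scaffolding; the printed commands mirror the manual workflow earlier in this issue):

```ruby
#!/usr/bin/env ruby
# Version 1 of the diff tool: every component is a stub that only prints the
# commands to run by hand. The structure, not the automation, is the point.
BENCH_CMD = "WARMUP_ITRS=2 MIN_BENCH_ITRS=10 MIN_BENCH_TIME=0 " \
            "ruby --zjit-stats benchmarks/lobsters/benchmark.rb"

STEPS = [
  ["build the baseline ruby",     "(build ruby at master and chruby to it)"],
  ["capture the before snapshot", "#{BENCH_CMD} 2> before.txt"],
  ["build ruby with your change", "(rebuild ruby with your ZJIT change)"],
  ["capture the after snapshot",  "#{BENCH_CMD} 2> after.txt"],
  ["compare the snapshots",       "diff -u before.txt after.txt"],
]

STEPS.each_with_index do |(name, cmd), i|
  puts "#{i + 1}. #{name}:"
  puts "   #{cmd}"
end
```

Each later ticket could then replace one printed instruction with real automation without changing the overall shape.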

Please see linked issues for further details.
