ZJIT: Diff Tool #822

@jacob-shops

Description

Currently, we measure changes to ZJIT with ruby-bench and collect stats using the --zjit-stats flag.

Current Status

The workflow is roughly the same each time, but there is no defined process. It usually looks something like this:

  1. Build ruby (and chruby to this version)
  2. Run stats and save them off (with something like WARMUP_ITRS=2 MIN_BENCH_ITRS=10 MIN_BENCH_TIME=0 ruby --zjit-stats benchmarks/lobsters/benchmark.rb 2> before.txt)
  3. Make changes to ZJIT
  4. Run stats again but save as after.txt
  5. Manually diff the results and paste the diff into your PR description.
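
Even a thin wrapper could remove most of the copy-and-paste here. A rough sketch (the script name is hypothetical; the benchmark path and iteration settings are copied from step 2, and the stats report goes to stderr, hence the 2> redirect above):

```ruby
#!/usr/bin/env ruby
# capture_zjit_stats.rb (hypothetical): wraps steps 2, 4, and 5 above.
# Usage: ruby capture_zjit_stats.rb before   # then rebuild ruby, then:
#        ruby capture_zjit_stats.rb after
label = ARGV.fetch(0) { abort("usage: capture_zjit_stats.rb before|after") }
out   = "#{label}.txt"

env = { "WARMUP_ITRS" => "2", "MIN_BENCH_ITRS" => "10", "MIN_BENCH_TIME" => "0" }

# The stats report is written to stderr, so redirect stderr into the capture file.
ok = system(env, "ruby", "--zjit-stats", "benchmarks/lobsters/benchmark.rb", err: out)
abort("benchmark run failed; see #{out}") unless ok

# Once both snapshots exist, print the diff from step 5.
if File.exist?("before.txt") && File.exist?("after.txt")
  system("diff", "-u", "before.txt", "after.txt")
end
```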

Problems

  • Manual time and effort are required for each revision to ZJIT, and we could probably eliminate most of it.
  • It's an undocumented, non-obvious workflow for new team members and external contributors.
  • It fills PR descriptions with output from the author that would be clearer as a comment from a CI bot.
  • It's difficult to produce clean diffs with the current output format, both because the JIT is non-deterministic and because optimizations to one part of the JIT usually have subtle effects on other components.
  • We don't have a good way to measure clear differences between snapshots of ZJIT over time, and we probably shouldn't modify nightly builds just to satisfy curiosity about the difference between two versions. Such statistics would be useful for planning (we could easily measure which changes had the biggest impact and correct any assumptions that turned out to be wrong), and they would be great for presentations.
  • As the JIT becomes more complicated, we may want something more sophisticated to measure changes in performance and filter out noise. Currently, we don't even run a simple t-test, which could serve as the basis for a statistical diff; that is what we need here more than a code diff.
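
For concreteness, the statistical core is small. A minimal sketch of Welch's t-test over per-iteration times from a before and an after build (the sample numbers are made up, and a real version would also compute degrees of freedom and a p-value):

```ruby
# Welch's t-test sketch: compares mean iteration time between two builds.
# The sample data below is invented purely for illustration.
def mean(xs) = xs.sum(0.0) / xs.size

def variance(xs)
  m = mean(xs)
  xs.sum(0.0) { |x| (x - m)**2 } / (xs.size - 1)
end

def welch_t(a, b)
  (mean(a) - mean(b)) / Math.sqrt(variance(a) / a.size + variance(b) / b.size)
end

before_ms = [102.1, 101.8, 103.0, 102.5, 101.9, 102.7, 102.2, 103.1, 101.6, 102.4]
after_ms  = [ 99.4,  99.9,  98.7,  99.1, 100.2,  99.5,  98.9,  99.8,  99.3,  99.6]

t = welch_t(before_ms, after_ms)
puts "t = #{t.round(2)}"  # |t| well above ~2 suggests the change is real, not noise
```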

Ideal Outcome

We have a tool that can ingest two snapshots of ZJIT in time, perhaps identified by git hashes. The tool builds ZJIT at each revision, runs some configurable benchmark with sane defaults, and captures all results, not just the worst offenders. It then applies a statistical test so that only statistically significant changes are reported.
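
One possible shape, purely as a sketch: the tool name, the use of git worktrees, and the build invocation are all assumptions here, not an agreed design.

```ruby
#!/usr/bin/env ruby
# zjit-diff (hypothetical): build ZJIT at two revisions, capture stats for a
# configurable benchmark, and hand the two snapshots to a comparison step.
base  = ARGV.fetch(0, "master")
head  = ARGV.fetch(1, "HEAD")
bench = File.expand_path(ENV.fetch("BENCHMARK", "benchmarks/lobsters/benchmark.rb"))

def snapshot(rev, bench, out)
  out = File.expand_path(out)
  dir = "zjit-diff-#{rev.tr('/', '-')}"
  system("git", "worktree", "add", "--detach", dir, rev, exception: true)
  Dir.chdir(dir) do
    # Build ruby with ZJIT at this revision (build details elided/assumed).
    system("./autogen.sh && ./configure --enable-zjit && make -j", exception: true)
    # Same defaults as the manual workflow; stats are written to stderr.
    system({ "WARMUP_ITRS" => "2", "MIN_BENCH_ITRS" => "10", "MIN_BENCH_TIME" => "0" },
           "./ruby", "--zjit-stats", bench, err: out, exception: true)
  end
end

snapshot(base, bench, "before.txt")
snapshot(head, bench, "after.txt")

# Placeholder comparison: a plain diff for now, to be replaced by a statistical
# test that reports only significant changes.
system("diff", "-u", "before.txt", "after.txt")
```

Usage would be something like `zjit-diff master my-branch`, with the benchmark overridable via BENCHMARK.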

This tool could be used during development: instead of having to remember to capture stats before making a ZJIT modification, you could run a simple, easy-to-remember command afterwards as a gut check that your change worked as intended. The tool could also run in CI, diffing master against the current PR and posting the statistical changes as a comment right under the author's description.

Plan

While all of these things would be nice to have, this epic is lower priority than simply improving the JIT. For this reason, we will split it into many small, tractable tickets that give us useful intermediate results along the way to the overall goal.

Philosophy: sketch out the minimal workable solution that can be fleshed out without refactoring. Do the easiest, dumbest implementation of each component, and put in more work as needed. (i.e. version 1 stubs out each component with a print statement telling you which commands to run, and everything is manual. The only innovation is how the problem is framed and the scaffolding it creates to build on top of.)
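
To make that concrete, the version-1 stub described above could be as small as this (entirely hypothetical scaffolding; the printed commands mirror the manual workflow earlier in this issue):

```ruby
#!/usr/bin/env ruby
# Version 1 of the diff tool: every component is a stub that only prints the
# commands to run by hand. The structure, not the automation, is the point.
BENCH_CMD = "WARMUP_ITRS=2 MIN_BENCH_ITRS=10 MIN_BENCH_TIME=0 " \
            "ruby --zjit-stats benchmarks/lobsters/benchmark.rb"

STEPS = [
  ["build the baseline ruby",     "(build ruby at master and chruby to it)"],
  ["capture the before snapshot", "#{BENCH_CMD} 2> before.txt"],
  ["build ruby with your change", "(rebuild ruby with your ZJIT change)"],
  ["capture the after snapshot",  "#{BENCH_CMD} 2> after.txt"],
  ["compare the snapshots",       "diff -u before.txt after.txt"],
]

STEPS.each_with_index do |(name, cmd), i|
  puts "#{i + 1}. #{name}:"
  puts "   #{cmd}"
end
```

Each later ticket could then replace one printed instruction with real automation without changing the overall shape.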

Please see linked issues for further details.
