@jbachorik
Collaborator

@jbachorik jbachorik commented Dec 5, 2025

!WARNING!
This PR depends on #690 - it will stay in draft until the other PR is merged.

Summary

Implements JMC-8477: Add opt-in memory-mapped file support to JFR Writer API for off-heap event storage.

Problem

The current JFR Writer API stores all event data in on-heap byte arrays during recording creation, causing:

  • Heap pressure from large byte array allocations
  • GC overhead from frequent allocation/deallocation
  • Unbounded memory growth with recording size
  • Scalability issues for long-running or high-throughput scenarios

Solution

This PR adds opt-in memory-mapped file support, allowing event data to be stored off-heap.

API

Users opt in via the builder pattern:

Recording recording = Recordings.newRecording(outputStream,
    settings -> settings.withMmap(4 * 1024 * 1024)  // 4MB per thread
                        .withJdkTypeInitialization());

Benefits

  • Reduced Heap Pressure: Event data stored off-heap eliminates large on-heap allocations
  • Predictable Memory Usage: Fixed memory footprint per thread regardless of recording size
  • Improved Performance: JMH benchmarks show an 8-12% throughput improvement over heap mode
  • Better Scalability: Multiple recordings can coexist without competing for heap space
  • Fully Backward Compatible: Heap mode remains the default; opt-in only

Performance Results

JMH benchmarks show consistent improvements across event types:

  • writeSimpleEvent: +8.3% (909K → 985K ops/s)
  • writeMultiFieldEvent: +11.9% (787K → 881K ops/s)
  • writeRepeatedStringsEvent: +11.9% (793K → 887K ops/s)
  • writeStringHeavyEvent: +10.4% (801K → 884K ops/s)

Backward Compatibility

  • ✅ Heap mode remains the default behavior
  • ✅ Existing API unchanged - no breaking changes
  • ✅ All existing tests pass without modification
  • ✅ New functionality accessed only through explicit opt-in

Testing

  • Unit Tests: 850+ lines covering core functionality
  • Integration Tests: Multi-threaded stress tests, large event tests
  • Benchmarks: Comprehensive JMH benchmark suite with comparison tools
  • Validation: All existing JFR Writer tests pass

Commits

  1. Add JMH benchmark infrastructure (44aa637) - JMH benchmarks for performance validation
  2. Add memory-mapped file support (5def325) - Core mmap implementation
  3. Refactor to builder pattern (aaddf98) - Move from system properties to builder API
  4. Fix JMH benchmark build (7264aa4) - Enable out-of-the-box benchmark execution
  5. Add comparison tool (449eefd) - Python script for benchmark result analysis

Documentation

  • README.md with comprehensive usage instructions
  • JMH benchmark guide with examples
  • Comparison tool for performance analysis

🤖 Generated with Claude Code


Progress

  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Integration blocker

 ⚠️ Title mismatch between PR and JBS for issue JMC-8477

Issue

  • JMC-8477: JFR Writer: Add opt-in memory-mapped file support for event storage (Enhancement - P4) ⚠️ Title mismatch between PR and JBS.

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jmc.git pull/691/head:pull/691
$ git checkout pull/691

Update a local copy of the PR:
$ git checkout pull/691
$ git pull https://git.openjdk.org/jmc.git pull/691/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 691

View PR using the GUI difftool:
$ git pr show -t 691

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jmc/pull/691.diff

Using Webrev

Link to Webrev Comment

jbachorik and others added 6 commits December 3, 2025 14:17
- Created benchmark module with JMH 1.37 support
- Implemented 4 benchmark suites:
  * EventWriteThroughputBenchmark: measures ops/sec for different event types
  * AllocationRateBenchmark: measures allocation rates during event writing
  * StringEncodingBenchmark: measures UTF-8 encoding and caching performance
  * ConstantPoolBenchmark: measures constant pool buildup and lookup

- Baseline results (Apple M1 Max, JDK 21.0.5):
  * Simple events: 986K ops/s
  * Multi-field events: 862K ops/s
  * Unique strings: 11.9x slower than repeated strings
  * Identified OutOfMemoryError with unbounded constant pool growth

- Comprehensive documentation with build/run instructions
- JSON results for programmatic comparison

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Implements opt-in memory-mapped file (mmap) support for off-heap event
storage to reduce heap pressure during JFR recording. The implementation
uses double-buffered per-thread mapped byte buffers with automatic
rotation and background flushing.
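
The double-buffering scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DoubleBufferSketch` and all of its members are hypothetical names, direct buffers stand in for the mapped buffers, and an in-memory stream stands in for the recording output.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: two fixed-size buffers, one active for writes,
// the other being flushed by a background daemon thread.
class DoubleBufferSketch {
    private final ByteBuffer[] buffers;
    private final Future<?>[] pendingFlush = new Future<?>[2];
    private final ExecutorService flusher;
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    private int active = 0;

    DoubleBufferSketch(int chunkSize) {
        // Assumption: direct buffers instead of real memory-mapped files.
        buffers = new ByteBuffer[] {
            ByteBuffer.allocateDirect(chunkSize), ByteBuffer.allocateDirect(chunkSize)
        };
        flusher = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "flush-sketch");
            t.setDaemon(true);
            return t;
        });
    }

    void write(byte[] data) throws InterruptedException, ExecutionException {
        if (data.length > buffers[active].capacity()) {
            throw new IllegalArgumentException("event larger than chunk size");
        }
        if (buffers[active].remaining() < data.length) {
            rotate(); // active buffer is full: hand it to the flusher
        }
        buffers[active].put(data);
    }

    private void rotate() throws InterruptedException, ExecutionException {
        int full = active;
        ByteBuffer toFlush = buffers[full];
        toFlush.flip();
        pendingFlush[full] = flusher.submit(() -> {
            byte[] copy = new byte[toFlush.remaining()];
            toFlush.get(copy);
            sink.write(copy, 0, copy.length);
            toFlush.clear(); // ready for reuse
        });
        active = 1 - full;
        // Before reusing the other buffer, wait for its earlier flush to finish.
        if (pendingFlush[active] != null) {
            pendingFlush[active].get();
        }
    }

    byte[] finish() throws InterruptedException, ExecutionException {
        rotate(); // flush whatever is left in the active buffer
        for (Future<?> f : pendingFlush) {
            if (f != null) {
                f.get();
            }
        }
        flusher.shutdown();
        return sink.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        DoubleBufferSketch db = new DoubleBufferSketch(8);
        for (int i = 0; i < 5; i++) {
            db.write(new byte[] {(byte) (3 * i), (byte) (3 * i + 1), (byte) (3 * i + 2)});
        }
        System.out.println(db.finish().length + " bytes flushed");
    }
}
```

A single-threaded flusher keeps the flushed chunks in submission order, which is why the sketch needs no extra sequencing logic.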

Key Components:
- LEB128MappedWriter: Fixed-size mmap-backed LEB128 writer with
  overflow detection
- ThreadMmapManager: Manages per-thread double buffers and coordinates
  rotation with background flush executor (daemon threads)
- Chunk: Enhanced with automatic rotation support when buffer fills
- RecordingImpl: Conditionally uses mmap or heap mode based on system
  properties
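
As an illustration of the first component, a fixed-size mmap-backed LEB128 writer with overflow detection might look roughly like this. It is a sketch under assumptions: `MappedLeb128Sketch` is a hypothetical name, it uses plain unsigned LEB128 (the encoding variant used by the actual writer may differ), and it reports overflow with a boolean rather than triggering rotation.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a fixed-capacity LEB128 writer over a mapped file.
class MappedLeb128Sketch {
    private final MappedByteBuffer buffer;

    MappedLeb128Sketch(Path file, int capacity) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            buffer = ch.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
        }
    }

    // Number of bytes the unsigned LEB128 encoding of value occupies.
    static int encodedLength(long value) {
        int len = 1;
        while ((value >>>= 7) != 0) {
            len++;
        }
        return len;
    }

    // Writes value as unsigned LEB128; returns false and writes nothing
    // if the encoded form does not fit in the remaining space.
    boolean writeLong(long value) {
        if (buffer.remaining() < encodedLength(value)) {
            return false; // overflow detected, caller decides what to do
        }
        while ((value & ~0x7FL) != 0) {
            buffer.put((byte) ((value & 0x7F) | 0x80)); // low 7 bits + continuation bit
            value >>>= 7;
        }
        buffer.put((byte) value);
        return true;
    }

    int position() {
        return buffer.position();
    }

    byte byteAt(int index) {
        return buffer.get(index);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("leb128-sketch", ".bin");
        tmp.toFile().deleteOnExit();
        MappedLeb128Sketch writer = new MappedLeb128Sketch(tmp, 3);
        System.out.println("first write fits:  " + writer.writeLong(300));
        System.out.println("second write fits: " + writer.writeLong(300));
    }
}
```

Checking the encoded length before writing keeps the buffer free of partially written values, so a caller could rotate to a fresh buffer and retry the same value.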

Configuration:
- org.openjdk.jmc.flightrecorder.writer.mmap.enabled=true (default: false)
- org.openjdk.jmc.flightrecorder.writer.mmap.chunkSize=<bytes> (default: 4MB)

Performance: Benchmarks show 8-12% improvement over heap mode across all
event types (writeSimpleEvent +8.3%, writeMultiFieldEvent +11.9%,
writeRepeatedStringsEvent +11.9%, writeStringHeavyEvent +10.4%).

Backward Compatibility: Heap mode remains the default. All existing tests
pass. New unit tests and integration tests verify mmap functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Replace system property-based configuration with fluent builder API for
configuring memory-mapped file settings when creating JFR recordings.

Changes:
- Add RecordingSettingsBuilder.withMmap() methods for enabling mmap with
  default (4MB) or custom chunk sizes
- Add useMmap() and getMmapChunkSize() methods to RecordingSettings
- Remove system property constants from RecordingImpl:
  - org.openjdk.jmc.flightrecorder.writer.mmap.enabled
  - org.openjdk.jmc.flightrecorder.writer.mmap.chunkSize
- Update MmapRecordingIntegrationTest to use builder pattern instead of
  system properties
- Add @since 10.0.0 javadoc tags to all new public API methods

Migration example:
  Before: System.setProperty("org.openjdk.jmc.flightrecorder.writer.mmap.enabled", "true")
  After:  Recordings.newRecording(stream, settings -> settings.withMmap())
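
To make the shape of the builder-based configuration concrete, here is a self-contained stand-in. It does not use the real JMC classes: `MmapSettingsSketch`, its nested `Builder`, and `configure` are hypothetical, and only the method names `withMmap`, `useMmap`, and `getMmapChunkSize` plus the 4MB default are taken from the commit message above.

```java
import java.util.function.Consumer;

// Hypothetical stand-in for the settings object and its builder.
class MmapSettingsSketch {
    static final int DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024; // 4MB default

    private final boolean useMmap;
    private final int mmapChunkSize;

    private MmapSettingsSketch(boolean useMmap, int mmapChunkSize) {
        this.useMmap = useMmap;
        this.mmapChunkSize = mmapChunkSize;
    }

    boolean useMmap() {
        return useMmap;
    }

    int getMmapChunkSize() {
        return mmapChunkSize;
    }

    static final class Builder {
        private boolean useMmap = false; // heap mode stays the default
        private int chunkSize = DEFAULT_CHUNK_SIZE;

        Builder withMmap() {
            return withMmap(DEFAULT_CHUNK_SIZE);
        }

        Builder withMmap(int chunkSize) {
            if (chunkSize <= 0) {
                throw new IllegalArgumentException("chunk size must be positive");
            }
            this.useMmap = true;
            this.chunkSize = chunkSize;
            return this;
        }

        MmapSettingsSketch build() {
            return new MmapSettingsSketch(useMmap, chunkSize);
        }
    }

    // Mirrors the callback style of the migration example above.
    static MmapSettingsSketch configure(Consumer<Builder> callback) {
        Builder builder = new Builder();
        callback.accept(builder);
        return builder.build();
    }

    public static void main(String[] args) {
        MmapSettingsSketch settings = configure(b -> b.withMmap(1024 * 1024));
        System.out.println(settings.useMmap() + " " + settings.getMmapChunkSize());
    }
}
```

Compared with system properties, a builder scopes the setting to a single recording, so two recordings in the same JVM can use different modes.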

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Fix maven-jar-plugin configuration to correctly locate MANIFEST.MF from
the resources directory, enabling successful benchmark JAR creation.

Changes:
- Add manifestFile configuration pointing to ${project.build.outputDirectory}/META-INF/MANIFEST.MF
- Create comprehensive README.md with:
  - Build and usage instructions
  - All 13 benchmark descriptions
  - Profiling guide with examples
  - Output format options (JSON, CSV, text)
  - Common JMH options reference
  - Troubleshooting guide
- Add src/main/resources/META-INF/MANIFEST.MF for JAR packaging

The benchmark JAR (benchmarks.jar) now builds successfully via 'mvn clean package'
and supports all JMH command-line arguments out-of-the-box.

Verified functionality:
- JAR executes and responds to -h (help)
- All 13 benchmarks listed with -l
- Benchmarks execute successfully (e.g., writeSimpleEvent: ~950K ops/s)
- JSON/CSV output formats work
- Profilers available (gc, jfr, stack, etc.)
- Regex filtering and parameterization work

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Add compare.py Python script for easy comparison of JMH benchmark results
between baseline and optimized runs. The script automatically handles different
benchmark modes (throughput vs average time) and displays improvements with
directional indicators.

Changes:
- Add compare.py utility script with:
  - Automatic detection of benchmark modes (thrpt, avgt, etc.)
  - Correct calculation of improvements (higher/lower is better)
  - Formatted output with directional arrows (↑/↓)
  - Support for parameterized benchmarks
  - Optional custom title for comparisons
- Update README.md with:
  - Detailed compare.py usage instructions
  - Example output showing improvement percentages
  - Explanation of benchmark mode handling

Usage:
  python3 compare.py baseline.json optimized.json "Optional Title"
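
The mode-dependent improvement calculation can be illustrated with a short sketch. It is written in Java to match the rest of the project and is not a port of compare.py: `BenchmarkCompareSketch` and `improvementPercent` are hypothetical names, and the rule that only `thrpt` is higher-is-better is an assumption based on the standard JMH mode names.

```java
// Hypothetical sketch of a mode-aware improvement calculation.
class BenchmarkCompareSketch {

    // Returns the improvement in percent; positive means the optimized run is better.
    static double improvementPercent(String mode, double baseline, double optimized) {
        // Throughput (ops per unit time): higher is better.
        // Time-based modes (avgt, sample, ss): lower is better.
        boolean higherIsBetter = "thrpt".equals(mode);
        double change = higherIsBetter
                ? (optimized - baseline) / baseline
                : (baseline - optimized) / baseline;
        return change * 100.0;
    }

    public static void main(String[] args) {
        // Throughput figures quoted in the PR description for writeSimpleEvent.
        System.out.printf("writeSimpleEvent: %.1f%%%n",
                improvementPercent("thrpt", 909_000, 985_000));
    }
}
```

Flipping the sign for time-based modes lets a single "positive is better" convention cover every benchmark in a report.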

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@bridgekeeper

bridgekeeper bot commented Dec 5, 2025

👋 Welcome back jbachorik! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Dec 5, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr label Dec 5, 2025
@mlbridge

mlbridge bot commented Dec 5, 2025

Webrevs

The benchmark module contains JMH benchmark classes, not JUnit tests.
Configure maven-surefire-plugin to skip test execution for this module.

Fixes CI failure: 'No tests to run' error

@jbachorik jbachorik marked this pull request as draft December 5, 2025 12:26
@openjdk openjdk bot removed the rfr label Dec 5, 2025
@jbachorik jbachorik changed the title from "JMC-8477: Add opt-in memory-mapped file support for event storage" to "8477: Add opt-in memory-mapped file support for event storage" Dec 5, 2025

Replace Float.floatToIntBits/Double.doubleToLongBits conversions with
direct buffer.putFloat/putDouble calls. This is more idiomatic and
eliminates unnecessary bit conversion overhead.

Changes:
- writeFloat(): Use buffer.putFloat() instead of writeIntRaw(floatToIntBits())
- writeDouble(): Use buffer.putDouble() instead of writeLongRaw(doubleToLongBits())

Benefits:
- More readable code (intent is clearer)
- Potentially faster (avoids bit conversion)
- Consistent with standard ByteBuffer API usage
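
The equivalence this change relies on can be checked with a small standalone sketch. `FloatWriteSketch` and its methods are hypothetical and use a heap `ByteBuffer` rather than the writer's own buffer. For ordinary values both paths produce identical bytes; as far as I know they can differ only for non-canonical NaN bit patterns, which `floatToIntBits` collapses to a single NaN.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch comparing the two ways of writing a float or double.
class FloatWriteSketch {

    // Old style: convert to bits first, then write the integer.
    static byte[] floatViaBits(float value) {
        ByteBuffer buffer = ByteBuffer.allocate(Float.BYTES);
        buffer.putInt(Float.floatToIntBits(value));
        return buffer.array();
    }

    // New style: let the buffer write the float directly.
    static byte[] floatDirect(float value) {
        ByteBuffer buffer = ByteBuffer.allocate(Float.BYTES);
        buffer.putFloat(value);
        return buffer.array();
    }

    static byte[] doubleViaBits(double value) {
        ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES);
        buffer.putLong(Double.doubleToLongBits(value));
        return buffer.array();
    }

    static byte[] doubleDirect(double value) {
        ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES);
        buffer.putDouble(value);
        return buffer.array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(floatViaBits(1.5f), floatDirect(1.5f)));
        System.out.println(Arrays.equals(doubleViaBits(-2.25), doubleDirect(-2.25)));
    }
}
```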

All 322 tests pass.
