feat: Add benchmark automation scripts and enhance SP1 cluster infrastructure #16

Open
WiseMrMusa wants to merge 60 commits into master from nm/02-scripts

@WiseMrMusa

This PR builds on nm/01-guest-program. It adds automation scripts for generating fixtures and running gas-categorized benchmarks, and it improves the SP1 cluster setup with better error handling, configurability, and health checks.


Detailed Changes

1. Benchmark Automation Scripts

scripts/generate-gas-categorized-fixtures.sh

Generates witness files organized by gas categories from EEST fixtures:

| Feature | Description |
| --- | --- |
| Gas categories | Support for 1M, 10M, 30M, 45M, 60M, 100M, 150M and custom values |
| Rational numbers | Support formats like 0.5M, 1.5M, 2.5M |
| Local EEST support | --eest-fixtures-path to use local fixtures |
| Dry run mode | --dry-run to preview commands |
| Parallel processing | Leverages Rayon with configurable thread count |

# Example usage
./scripts/generate-gas-categorized-fixtures.sh -g 1M,10M,30M
./scripts/generate-gas-categorized-fixtures.sh -e ./local-fixtures -g 0.5M,1M
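
The rational formats listed above (0.5M, 1.5M, 2.5M) imply a conversion from a category label to an integer gas value. The following is a minimal sketch of one way to do that conversion, not the script's actual code. The helper name `gas_label_to_int` is hypothetical, and the sketch assumes every label is a decimal number followed by `M`.

```shell
# Hypothetical helper (not from the PR): convert a gas label such as
# "1M" or "0.5M" into an integer gas value.
gas_label_to_int() {
    local label="$1"
    # Accept digits, an optional fractional part, and a trailing "M".
    if ! printf '%s\n' "$label" | grep -Eq '^[0-9]+(\.[0-9]+)?M$'; then
        echo "invalid gas category: $label" >&2
        return 1
    fi
    # Strip the "M" suffix and scale by one million; awk handles the decimals
    # and "%.0f" rounds to the nearest integer.
    awk -v n="${label%M}" 'BEGIN { printf "%.0f\n", n * 1000000 }'
}
```

With this sketch, `gas_label_to_int 0.5M` prints `500000`, and a malformed label such as `abc` returns a non-zero status.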

scripts/run-gas-categorized-benchmarks.sh

Main script for executing benchmarks across gas categories:

| Option | Description |
| --- | --- |
| -z, --zkvm | Select zkVM: risc0, sp1, openvm, pico, zisk, airbender |
| -e, --execution-client | Choose reth or ethrex |
| -r, --resource | Resource type: cpu, gpu, network, cluster |
| -c, --gas-categories | Gas categories to benchmark |
| -a, --action | prove or execute |
| -g, --guest | stateless-executor or stateless-validator |
| -m, --memory-tracking | Enable memory profiling |
| -f, --force-rerun | Bypass cached results |
| -n, --dry-run | Preview commands |

# Example usage
./scripts/run-gas-categorized-benchmarks.sh -z sp1 -c 1M,10M -r gpu
./scripts/run-gas-categorized-benchmarks.sh -z risc0 -a execute -g stateless-executor

2. Opcode Analysis Scripts

scripts/trace_opcodes.py

Comprehensive opcode tracing and analysis tool (~1169 lines):

| Feature | Description |
| --- | --- |
| Opcode extraction | Extract opcodes from bytecode |
| Execution tracing | Trace opcode execution sequences |
| Gas analysis | Analyze gas consumption per opcode |
| Statistics | Generate instruction breakdown statistics |
| CSV export | Export results to CSV for analysis |

scripts/analyze_opcode_traces.py

Analyzes traced opcode execution data:

| Output | Description |
| --- | --- |
| Instruction counts | Per-opcode execution frequency |
| Gas breakdown | Gas consumption by opcode type |
| Hotspot detection | Identify expensive operations |
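
To make "per-opcode execution frequency" concrete, here is a small sketch that counts opcodes with standard Unix tools. It is an illustration only: the helper name `opcode_counts` is hypothetical, and the input format (one opcode mnemonic per line on stdin) is an assumption, not the format the Python scripts actually emit.

```shell
# Hypothetical helper (not from the PR): read one opcode mnemonic per line
# from stdin and print "opcode,count" rows, most frequent first.
opcode_counts() {
    sort |                      # group identical mnemonics together
        uniq -c |               # prefix each distinct mnemonic with its count
        sort -k1,1nr -k2 |      # order by count descending, then by name
        awk '{ print $2 "," $1 }'
}
```

For example, `printf 'PUSH1\nADD\nPUSH1\n' | opcode_counts` prints `PUSH1,2` followed by `ADD,1`.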

scripts/trace-opcodes-by-gas-category.sh

Wrapper script for tracing opcodes across gas categories.

3. SP1 Cluster Enhancements

Major improvements to scripts/sp1-cluster/:

start-sp1-cluster.sh Improvements

| Feature | Description |
| --- | --- |
| Configurable resources | All resource limits via environment variables |
| NVIDIA detection | Comprehensive GPU/runtime verification with installation guides |
| Docker Compose v1/v2 | Support for both legacy and modern Docker Compose |
| Port configuration | --port and --redis-port for multiple clusters |
| Health checks | --wait flag for service health verification |
| Circuit cache | Auto-create SP1 circuits directory |
| Skip GPU check | --skip-gpu-check to bypass NVIDIA verification |

# New usage examples
./start-sp1-cluster.sh --gpu-nodes 4 --wait
./start-sp1-cluster.sh --port 50052 --redis-port 6380 -d
./start-sp1-cluster.sh --gpu-nodes 0  # CPU-only mode

docker-compose.yml Improvements

| Enhancement | Description |
| --- | --- |
| Health checks | All critical services have health checks |
| Configurable limits | CPU, memory, and ports via env vars |
| Service dependencies | Proper depends_on with health conditions |
| Restart policies | restart: unless-stopped for reliability |
| HA documentation | Comments explaining production considerations |

env.example (New File)

Environment template with all configurable options:

# Resource limits
REDIS_MEMORY_LIMIT=30G
CPU_NODE_CPUS_LIMIT=8
GPU_NODE_MEMORY_LIMIT=24G

# Network ports
API_PORT=50051
REDIS_PORT=6379

# Circuit cache
SP1_CIRCUITS_DIR=${HOME}/.sp1/circuits
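
The template above pairs with defaults for unset values. The sketch below shows one way a start script could load such a file and fall back to the documented defaults (API port 50051, Redis port 6379, circuits directory `~/.sp1/circuits`). The function name `load_cluster_env` is hypothetical and the logic is an assumption, not the PR's implementation.

```shell
# Hypothetical helper (not from the PR): source an env file if it exists,
# then apply defaults for anything still unset.
load_cluster_env() {
    local env_file="${1:-./.env}"
    if [ -f "$env_file" ]; then
        set -a              # export every variable assigned while sourcing
        . "$env_file"
        set +a
    fi
    # Defaults taken from the values quoted in this PR.
    API_PORT="${API_PORT:-50051}"
    REDIS_PORT="${REDIS_PORT:-6379}"
    SP1_CIRCUITS_DIR="${SP1_CIRCUITS_DIR:-$HOME/.sp1/circuits}"
    export API_PORT REDIS_PORT SP1_CIRCUITS_DIR
}
```

In this sketch, values in the file override variables already set in the shell, because the file is sourced after the shell environment is in place.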

4. CLI Enhancements

RISC0 Keccak Cycle Limit

Added --risc0-keccak-limit option for controlling keccak cycle limits:

./ere-hosts prove --zkvm risc0 --risc0-keccak-limit 1000000

Improved Help Text

  • Clearer resource type descriptions
  • Network/cluster validation for SP1
  • Better error messages
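
The network/cluster restriction can be sketched as a small shell check. This is an illustration under assumptions, not the CLI's Rust code: the function name `validate_resource` is hypothetical, and the accepted resource names are taken from the `-r, --resource` option list in this PR.

```shell
# Hypothetical check (not from the PR): network and cluster resources are
# accepted only when the selected zkVM is sp1.
validate_resource() {
    local zkvm="$1"
    local resource="$2"
    case "$resource" in
        cpu|gpu)
            return 0
            ;;
        network|cluster)
            if [ "$zkvm" = "sp1" ]; then
                return 0
            fi
            echo "error: resource '$resource' is only supported with sp1 (got '$zkvm')" >&2
            return 1
            ;;
        *)
            echo "error: unknown resource '$resource'" >&2
            return 1
            ;;
    esac
}
```

For example, `validate_resource sp1 cluster` succeeds, while `validate_resource risc0 network` prints an error and returns a non-zero status.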

5. Dependency Updates

| Package | Change |
| --- | --- |
| ere packages | Updated to NethermindEth/ere version 4bac87e |
| Zisk | Updated for V0.15.0 support |
| openvm-revm-crypto | Vendored for stability |
| precompiles/zisk | Removed sp1_bls12_381 |

6. Script Improvements

| Script | Change |
| --- | --- |
| generate-gas-categorized-fixtures.sh | Default Rayon thread count for stability |
| run-gas-categorized-benchmarks.sh | Enhanced option handling and validation |
| Various | Fixed permissions, updated .gitignore |

Why These Changes?

Automation Benefits

  1. Reproducibility: Standardized scripts ensure consistent benchmark runs
  2. Flexibility: Support for multiple gas categories, zkVMs, and configurations
  3. Analysis: Opcode tracing enables deep performance analysis
  4. Efficiency: Dry-run modes prevent wasted compute time

SP1 Cluster Benefits

  1. Reliability: Health checks ensure services are ready before use
  2. Flexibility: Configurable resources adapt to different hardware
  3. Multi-instance: Port configuration allows multiple clusters
  4. Debugging: Better error messages with actionable solutions

Testing

# Generate fixtures
./scripts/generate-gas-categorized-fixtures.sh -g 1M --dry-run

# Run benchmarks
./scripts/run-gas-categorized-benchmarks.sh -z sp1 -c 1M -n

# Start SP1 cluster with health wait
cd scripts/sp1-cluster
./start-sp1-cluster.sh --gpu-nodes 4 --wait

Breaking Changes

None. All changes are backward compatible.

jsign and others added 25 commits December 12, 2025 15:09
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Co-authored-by: Han <tinghan0110@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
…ion 4bac87e for ere packages in Cargo.toml and Cargo.lock
- Introduced Cluster variant in Resource enum for SP1 cluster resources.
- Updated ProverResourceType conversion to handle Cluster using default ClusterProverConfig.
- Introduced a new `docker-compose.yml` file for managing SP1 Cluster services including Redis, PostgreSQL, API, Coordinator, CPU, and GPU nodes.
- Added `start-sp1-cluster.sh` script for starting the cluster with options for GPU nodes and mixed worker modes.
- Created `stop-sp1-cluster.sh` script for stopping the cluster and managing persistent data and images.
- Enhanced user experience with logging and help messages in the scripts.
- Added validation in the CLI to ensure that the Cluster resource can only be used with SP1 zkVMs.
- Introduced `bail` from `anyhow` for error handling in case of invalid resource configurations.
Adds a new crate that provides pure EVM transaction execution without
pre-execution validation or post-execution consensus checks.

Key components:
- stateless_execution_with_trie: Core execution function returning bool
- WitnessDatabase: EVM database backed by StatelessTrie witness data
- RethStatelessExecutorGuest: Guest implementation for zkVMs
- SP1 and RISC0 entry points for zkVM compilation

This enables accurate benchmarking of raw EVM execution cycles in zkVMs
by skipping all validation overhead.
Updates both root and ere-guests workspace Cargo.toml to include:
- reth-stateless-executor as a workspace member
- Required reth dependencies (reth-evm, reth-revm)
- Dependency declarations for the new crate
Adds stateless_executor module to benchmark-runner for execution-only
benchmarking support:
- stateless_executor/reth.rs: Input preparation for Reth executor
- Updated lib.rs to export the new module
- Made BlockMetadata.block_used_gas public for reuse
…-executor

Adds zkVM-specific entry points for all supported platforms:
- Airbender: With custom allocator and runtime
- OpenVM: With crypto provider installation
- Pico: With KZG proof verification crypto provider
- ZisK: Standard entry point

These mirror the structure of stateless-validator entry points.
Introduces a new stateless_executor module with the following features:
- Added stateless_executor.rs for handling execution client variants and input preparation.
- Updated reth.rs to utilize StatelessExecutorFixture for input processing.
- Enhanced read_benchmark_fixtures_folder to read and parse benchmark fixtures.

This lays the groundwork for execution-only benchmarking of stateless programs.
…cutor and validator to allow both input_file and input_folder

This commit introduces the ability to specify either an input folder or a single input file for the stateless executor and validator. The `stateless_executor_inputs` and `stateless_validator_inputs` functions are updated to support this new functionality, along with corresponding changes in the CLI to accept an optional input file argument. This improves flexibility in benchmarking by allowing users to work with individual benchmark fixture files directly.
…r stateless executor

This commit modifies the CLI to set a default value for the execution client argument, allowing for easier configuration. Additionally, the guest relative path for the Reth execution client is simplified by removing the specific subdirectory, enhancing the path resolution for the stateless executor.
This commit introduces a new `MemoryTracker` struct to monitor memory usage during the proving process. It tracks initial, peak, and average memory usage, and integrates memory sampling into the benchmark runner. The proving metrics now include memory usage statistics, enhancing performance analysis capabilities.
…d runner.rs

This commit introduces optional memory tracking capabilities in the benchmark runner. The `Cargo.toml` is updated to include new features for memory tracking, and the `runner.rs` file is modified to conditionally compile memory tracking logic based on the feature flag. This enhances the ability to monitor memory usage during the proving process, aligning with previous enhancements to performance analysis.
…chmark CLI

This commit introduces a new `Network` resource type in the CLI, allowing users to specify network proving. It also adds validation to ensure that network proving is only used with SP1, enhancing the robustness of the command line interface for the zkVM benchmarker. The changes improve user experience by preventing misconfigurations related to resource selection.
… in reth.rs

This commit updates the input handling in the `stateless_executor_inputs_from_fixtures` function to pass a reference to the benchmark wrapper instead of moving it. This change ensures that the benchmark wrapper can be reused, improving memory efficiency and preventing potential ownership issues.
This commit cleans up the `runner.rs` file by removing the unused `PublicValues` import from the `ere_zkvm_interface` module. This enhances code clarity and maintains a cleaner codebase.
…nd version 4bac87e for ere packages in Cargo.toml and Cargo.lock
vercel bot commented Jan 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| zkgas-profiling | Ignored | Ignored | Jan 9, 2026 8:25am |

  run: ls -R artifacts

- name: Release
  uses: softprops/action-gh-release@v2

Check warning: Code scanning / CodeQL

Unpinned tag for a non-immutable Action in workflow (Medium)

Unpinned 3rd party Action: the 'Release Compiled Guests' step uses 'softprops/action-gh-release' with ref 'v2', not a pinned commit hash.

Regenerate lockfile to resolve Git merge conflict markers that were
preventing the project from building. The conflicts were between
different versions of ere dependencies.
… commit hashes

This commit modifies the Rust toolchain version in multiple workflow files to use specific commit hashes instead of the stable tag. This change ensures consistency in the toolchain used across different CI jobs, enhancing build reliability.
…e files

This commit refactors several files by reorganizing and cleaning up import statements for better readability and consistency. Additionally, it enhances code formatting in the `runner.rs`, `stateless_executor.rs`, `stateless_validator.rs`, and other related files, ensuring a more maintainable codebase. These changes do not alter functionality but improve overall code clarity.
This commit updates the versions of several dependencies in `Cargo.lock` and modifies the source revisions for the `ere` packages in `Cargo.toml` and `ere-guests/Cargo.toml`. The changes include downgrading `itertools`, `syn`, and `windows-sys` versions, as well as updating the source revision for various `ere` packages to a more recent commit. These updates aim to ensure compatibility and stability across the project.
This commit introduces several new workspace dependencies in `Cargo.toml`, including `openvm-k256`, `openvm-keccak256`, `openvm-kzg`, `openvm-p256`, `openvm-pairing`, and `openvm-sha2`, enhancing the functionality of the OpenVM platform. Additionally, a new module for `openvm_revm_crypto` is added in `main.rs`, setting the stage for further development in cryptographic functionalities.
…ture generation

This commit introduces a new script, `generate-gas-categorized-fixtures.sh`, which automates the generation of zkEVM fixture inputs categorized by gas parameters. The script includes functionality for validating gas categories, checking project structure, building the necessary project components, and generating fixtures for specified gas categories. It also supports a dry-run mode for previewing actions without execution, enhancing usability for developers working with zkEVM benchmarks.
…s benchmarks

This commit introduces a new script, `run-gas-categorized-benchmarks.sh`, designed to automate the execution of benchmarks across various gas-categorized fixtures. The script supports multiple options for customization, including dry-run mode, specific gas categories, and resource types. It validates input directories, checks project structure, and provides detailed output on benchmark results, enhancing the benchmarking workflow for developers working with zkEVM.
…nchmarks.sh

This commit refines the feature handling logic in the `run-gas-categorized-benchmarks.sh` script. It ensures that the script only attempts to build with features if they are specified, enhancing clarity and preventing unnecessary build attempts. The changes also streamline the command construction for running benchmarks, improving overall usability and robustness of the benchmarking process.
…orized-benchmarks.sh

This commit adds 'node_modules' to the .gitignore file to prevent tracking of Node.js dependencies. Additionally, it changes the default value of the --force-rerun option in the run-gas-categorized-benchmarks.sh script from true to false, enhancing the script's usability by preventing unintended reruns of benchmarks.
This commit changes the file permissions for `generate-gas-categorized-fixtures.sh` and `run-gas-categorized-benchmarks.sh` scripts from 644 to 755, allowing them to be executed directly. This enhancement improves usability for developers running gas benchmarks.
This commit introduces two new CSV files: `instruction_breakdown.csv` and `instruction_summary.csv`. The `instruction_breakdown.csv` provides detailed opcode usage statistics for various test cases, while the `instruction_summary.csv` summarizes total and unique instructions along with the top five instructions for each test case. These additions enhance the analysis capabilities for opcode performance in benchmarks.
This commit deletes the `instruction_breakdown.csv` and `instruction_summary.csv` files from the opcode_traces directory. These files were previously used for opcode usage statistics and summaries, but are no longer needed in the project.
This commit introduces several new scripts to enhance opcode analysis capabilities. The `analyze_opcode_traces.py` script analyzes blockchain test cases, extracts bytecode, traces opcode execution, and generates detailed reports. The `trace_opcodes.py` script executes transactions using the py-evm library for accurate opcode tracing, handling jumps and conditionals correctly. Additionally, a new shell script, `run_gas_categories.sh`, automates the execution of opcode tracing across various gas categories, improving the benchmarking workflow. Documentation for using py-evm for accurate traces is also included.
…-fixtures.sh

This commit updates the `generate-gas-categorized-fixtures.sh` script to support customizable gas categories through command-line options. The usage instructions have been modified to include an options flag, and a new function for parsing gas categories has been added. Default gas values are now defined, and the script validates the input format for gas values, improving usability and flexibility for generating zkEVM fixture inputs.
…tegorized-fixtures.sh

This commit enhances the `generate-gas-categorized-fixtures.sh` script by introducing a new command-line option for specifying a local EEST fixtures directory. The usage instructions have been updated to reflect this change, and the script now validates the mutually exclusive use of EEST_TAG and the new --eest-fixtures-path option. Additionally, the output messages have been improved for clarity, enhancing the overall usability of the script for generating zkEVM fixture inputs.
…enchmarks.sh

This commit updates the `run-gas-categorized-benchmarks.sh` script to support a comma-separated list of gas categories, allowing for more flexible benchmark execution. The usage instructions have been revised to reflect this change, and the script now validates the format of gas categories, including support for rational numbers. Default gas categories are defined, and the script's output messages have been improved for clarity, enhancing the overall usability for running gas benchmarks.
This commit deletes the `PY_EVM_TRACING_GUIDE.md` and `run_gas_categories.sh` files, which are no longer needed in the project. The removal of these files streamlines the codebase and eliminates outdated documentation and scripts related to opcode tracing and gas category execution.
…tegorized-benchmarks.sh

This commit updates the `run-gas-categorized-benchmarks.sh` script to streamline the command-line options by introducing short forms for various flags, improving usability. The help message and examples have been revised for clarity, and the output directory option has been added. Additionally, the script's output messages have been enhanced for better readability, making it easier for users to understand the benchmark execution process.
…k script documentation

This commit updates the documentation in `cli.rs` and `run-gas-categorized-benchmarks.sh` to clarify that the `NETWORK_PRIVATE_KEY` environment variable is optional for the network resource. The help messages have been revised to provide clearer instructions on resource types and their requirements, improving usability for users running benchmarks.
…ures script

This commit adds a default value for the Rayon thread count in the `generate-gas-categorized-fixtures.sh` script, allowing users to override it via the environment variable. This enhancement improves the script's configurability for parallel execution.
- Added validation in the CLI to ensure that the Cluster resource can only be used with SP1 zkVMs.
- Introduced `bail` from `anyhow` for error handling in case of invalid resource configurations.
This commit enhances the CLI help documentation by adding 'cluster' as a valid resource type alongside 'network', and updates the validation logic to ensure both resources are only supported with SP1 zkVM. These changes improve clarity and prevent misconfigurations in resource selection.
This commit introduces a new command-line option `--risc0-keccak-po2` to the benchmark script, allowing users to specify the RISC0 keccak accelerator cycle limit as a power of 2. The default value is set to 15, and the documentation has been updated to reflect this addition. This enhancement provides users with more control over memory usage during benchmarks.
- Create env.example with all configurable resource limits
- Update docker-compose.yml to use environment variable substitution
- Add configurable limits for Redis, CPU node, mixed node, and GPU nodes
- Make Redis password and PostgreSQL credentials configurable
- Add SP1_CIRCUITS_DIR variable for circuit cache path
- Add API_PORT and REDIS_PORT variables for port configuration

All variables have sensible defaults matching the previous hardcoded values.
- Add comprehensive check_nvidia_gpu() function that verifies:
  - nvidia-smi command availability
  - GPU accessibility
  - Docker NVIDIA runtime configuration
- Add --skip-gpu-check flag to bypass GPU verification
- Provide detailed installation instructions with links:
  - NVIDIA drivers installation guide
  - NVIDIA Container Toolkit installation commands
  - Quick install commands for Ubuntu/Debian
- Add log_hint() function for actionable suggestions
- Improve error messages with specific remediation steps
- Show alternative options (CPU-only mode, skip check)
- Check if Docker daemon is running with helpful start commands
- Add detect_docker_compose() function to both start and stop scripts
- Detect Docker Compose v2 (docker compose) first as preferred option
- Fall back to Docker Compose v1 (docker-compose) with deprecation warning
- Display version information when compose is detected
- Provide upgrade guide link for v1 users
- Use DOCKER_COMPOSE_CMD variable consistently throughout scripts
- Update help output to use dynamic compose command
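
The detection order described in this commit (v2 first, then v1 with a warning) might look roughly like the sketch below. The function and variable names come from the commit message; the body is an assumption rather than the script's actual code, and the warning text is illustrative.

```shell
# Sketch of the detection order described above; the body is assumed.
detect_docker_compose() {
    if docker compose version >/dev/null 2>&1; then
        # Docker Compose v2 ships as a docker CLI plugin.
        DOCKER_COMPOSE_CMD="docker compose"
    elif command -v docker-compose >/dev/null 2>&1; then
        # Legacy standalone v1 binary.
        echo "warning: docker-compose v1 is deprecated; consider upgrading to v2" >&2
        DOCKER_COMPOSE_CMD="docker-compose"
    else
        echo "error: Docker Compose not found" >&2
        return 1
    fi
}
```

A caller would run `detect_docker_compose` once and then invoke `$DOCKER_COMPOSE_CMD up -d` (unquoted, so that `docker compose` splits into two words).
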
- Add ensure_circuits_dir() function to automatically create the circuits cache
- Use SP1_CIRCUITS_DIR environment variable (default: ~/.sp1/circuits)
- Check directory writability and warn if not writable
- Export SP1_CIRCUITS_DIR for use in docker-compose.yml
- Prevents Docker from creating the directory as root (permission issues)
- Provide helpful error messages and hints for manual creation
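
A minimal sketch of the behaviour listed in this commit follows. The function name and the `SP1_CIRCUITS_DIR` default come from the commit message; the body and the message wording are assumptions.

```shell
# Sketch of the directory setup described above; the body is assumed.
ensure_circuits_dir() {
    SP1_CIRCUITS_DIR="${SP1_CIRCUITS_DIR:-$HOME/.sp1/circuits}"
    # Create the directory as the current user so Docker does not create it
    # as root when mounting the volume.
    if ! mkdir -p "$SP1_CIRCUITS_DIR" 2>/dev/null; then
        echo "error: cannot create $SP1_CIRCUITS_DIR" >&2
        echo "hint: create it manually with: mkdir -p \"$SP1_CIRCUITS_DIR\"" >&2
        return 1
    fi
    if [ ! -w "$SP1_CIRCUITS_DIR" ]; then
        echo "warning: $SP1_CIRCUITS_DIR is not writable" >&2
    fi
    # Make the path visible to docker-compose.yml variable substitution.
    export SP1_CIRCUITS_DIR
}
```
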
- Add --port flag to specify API gRPC port (default: 50051)
- Add --redis-port flag to specify Redis port (default: 6379)
- CLI arguments override environment variables from .env
- Add port validation (must be 1-65535)
- Update help text with port configuration examples
- Enables running multiple SP1 clusters on the same machine

Example usage:
  ./start-sp1-cluster.sh --port 50052 --redis-port 6380
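
The 1-65535 port check mentioned above can be sketched as follows. The helper name `validate_port` is hypothetical; only the accepted range comes from the commit message.

```shell
# Hypothetical helper (not from the PR): succeed only for an integer in 1-65535.
validate_port() {
    local port="$1"
    case "$port" in
        ''|*[!0-9]*)
            # Empty input or any non-digit character is rejected.
            return 1
            ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}
```

A script could use it as `validate_port "$API_PORT" || { echo "invalid port" >&2; exit 1; }`.
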
Docker Compose health checks:
- Add health checks for redis, postgresql, api, and coordinator services
- Use proper dependency conditions (service_healthy) for startup order
- Add restart: unless-stopped policy for all services
- Update depends_on to use health check conditions

Start script improvements:
- Add --wait flag to wait for all services to be healthy (implies --detach)
- Implement comprehensive wait_for_service() and wait_for_health() functions
- Add strict mode for --wait that fails if services don't become healthy
- Show visual feedback during health check with checkmarks
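
One way the waiting logic could work is to poll the container's health status until it reports `healthy` or a timeout expires. The sketch below is an assumption about the approach, not the script's code. It takes a container name and a timeout in seconds (both parameters are illustrative) and relies on `docker inspect` reporting the status defined by the Compose health check.

```shell
# Sketch only: poll a container's health status until healthy or timed out.
wait_for_health() {
    local container="$1"
    local timeout="${2:-60}"
    local elapsed=0
    local status
    while [ "$elapsed" -lt "$timeout" ]; do
        # An empty status means the container is missing or has no health check.
        status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || true)"
        if [ "$status" = "healthy" ]; then
            echo "✓ $container is healthy"
            return 0
        fi
        sleep 2
        elapsed=$((elapsed + 2))
    done
    echo "✗ $container did not become healthy within ${timeout}s" >&2
    return 1
}
```

Returning a non-zero status on timeout is what lets a strict `--wait` mode fail the whole start-up.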

HA documentation in docker-compose.yml:
- Document single-instance architecture limitations
- Provide guidance for production deployments:
  - Run multiple clusters on separate machines with load balancer
  - Use external managed services for Redis and PostgreSQL
  - Configure monitoring and alerting
  - Implement backup and recovery procedures
This commit simplifies the main function by removing the redundant validation logic for cluster proving with SP1 zkVMs. The resource type is now directly converted from the CLI input, streamlining the code and improving readability.
…able

- Updated health checks for postgresql and api services to use wget for better reliability.
- Changed hardcoded private keys to environment variable references for BIDDER_SP1_PRIVATE_KEY and FULFILLER_SP1_PRIVATE_KEY in docker-compose.yml.
- Added private key placeholders in env.example for better configuration guidance.