GitHub - kamini08/ir-analysis: A comprehensive framework to evaluate Intermediate Representation (IR) lifters (Ghidra, angr, BAP, LLVM). Measures performance, lifting accuracy, and reliability for binary analysis applications.

IR Analysis / IR Lifting Benchmark

This repository provides an analysis and benchmarking framework for evaluating Intermediate Representation (IR) lifting and disassembly tools against binary samples. It orchestrates multiple lifters/analyzers (Ghidra, angr, BAP, LLVM) and collects runtime, memory, and IR statistics into CSV summaries and human-readable reports.

The project is intended for security researchers and tool developers who want to compare lifting quality and resource usage across different toolchains.

Key features

Batch orchestration for multiple analyzers: Ghidra (P-code), angr (VEX), BAP (BIL), and LLVM-based tooling.
Per-sample timing and memory measurement using GNU time.
CSV summary output with counts for functions, basic blocks and IR statements.
Scripts to run each tool headless and helper scripts for installation and reporting.
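
As an illustration of the per-sample measurement idea, the sketch below wraps a command in GNU time and parses the result. It is not taken from the repository's scripts: the helper names (measure, parse_time_report, Usage) are hypothetical, and it assumes GNU time is installed at /usr/bin/time. The format string "%e %M" asks GNU time for elapsed wall-clock seconds and maximum resident set size in kilobytes.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import NamedTuple, Sequence


class Usage(NamedTuple):
    wall_seconds: float  # elapsed real time reported by GNU time (%e)
    max_rss_kb: int      # maximum resident set size in KB (%M)


def parse_time_report(report: str) -> Usage:
    """Parse a report written by GNU time with the format "%e %M"."""
    # If the command fails, GNU time first writes a "Command exited with
    # non-zero status" line, so the measurements are on the last non-empty line.
    last_line = [line for line in report.splitlines() if line.strip()][-1]
    elapsed, max_rss = last_line.split()
    return Usage(float(elapsed), int(max_rss))


def measure(cmd: Sequence[str], time_bin: str = "/usr/bin/time") -> Usage:
    """Run cmd under GNU time (hypothetical helper, assumes time_bin exists)."""
    with tempfile.TemporaryDirectory() as tmp:
        report_path = Path(tmp) / "time_report.txt"
        # -f sets the output format; -o writes it to a file instead of stderr,
        # which keeps it separate from the measured tool's own output.
        subprocess.run(
            [time_bin, "-f", "%e %M", "-o", str(report_path), *cmd],
            check=False,
        )
        return parse_time_report(report_path.read_text())
```

For example, measure(["python3", "-c", "pass"]) would return a Usage tuple for a trivial Python process. Writing the report to a file rather than parsing stderr is a design choice made here so that lifter diagnostics cannot be mistaken for measurements.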

Repository layout

scripts/ - orchestration and analysis scripts (see details below).
samples/ - dataset of binaries (organized into benign/ and malware/, and optionally by architecture).
results/ - output logs, CSV summaries and generated reports.
ghidra_projects/ - (optional) project data produced by Ghidra headless runs.
scripts/validation/ - helper utilities used for semantic validation and test harnesses.
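
To make the samples/ layout concrete, here is a minimal sketch that groups sample files by category. The function name list_samples is hypothetical, and the benign/ and malware/ directory names are assumed from the description above; any architecture subfolders are simply walked recursively.

```python
from pathlib import Path
from typing import Dict, List


def list_samples(samples_dir: Path) -> Dict[str, List[Path]]:
    """Group sample files by category (hypothetical helper)."""
    groups: Dict[str, List[Path]] = {}
    for category in ("benign", "malware"):  # assumed directory names
        root = samples_dir / category
        if root.is_dir():
            # rglob descends into optional per-architecture subfolders.
            groups[category] = sorted(p for p in root.rglob("*") if p.is_file())
        else:
            groups[category] = []
    return groups
```

Calling list_samples(Path("samples")) returns a dictionary with one sorted list of paths per category; a missing category yields an empty list rather than an error.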

Quick start

Install prerequisites. On Debian/Ubuntu-based systems this typically includes:

sudo apt update
sudo apt install -y python3 python3-venv python3-pip time file

Run the repository setup script (this creates a virtualenv and installs Python requirements where applicable):

./setup.sh

Configure external tools (examples):

Ghidra: set the GHIDRA_INSTALL_DIR environment variable to your Ghidra installation directory.
BAP: optional, installed via opam (enable with ENABLE_BAP=true).
LLVM: optional; enable with ENABLE_LLVM=true.

Example environment variables (set inline or export in your shell):

export GHIDRA_INSTALL_DIR=/path/to/ghidra
SAMPLES_DIR=/absolute/path/to/samples RESULTS_DIR=/absolute/path/to/results ./scripts/run_all.sh

Activate your Python virtualenv (if created by setup.sh) and run the full benchmark:

source venv/bin/activate
./scripts/run_all.sh

Notes:

The orchestrator script is scripts/run_all.sh. It detects missing dependencies and will print guidance when something is absent.
Use ENABLE_BAP and ENABLE_LLVM environment variables to toggle optional analyses.
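
The toggles above can be sketched in Python as follows. This is only an illustration of the configuration logic, not the behavior of scripts/run_all.sh: the function enabled_tools is hypothetical, and it assumes that angr always runs, that Ghidra runs whenever GHIDRA_INSTALL_DIR is set, and that only the value "true" enables an optional tool.

```python
import os
from typing import List, Mapping


def _is_true(value: str) -> bool:
    # Assumption: only "true" (any letter case) enables a tool,
    # mirroring the ENABLE_BAP=true example in this README.
    return value.strip().lower() == "true"


def enabled_tools(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the analyzers selected by the environment (hypothetical helper)."""
    tools: List[str] = []
    if env.get("GHIDRA_INSTALL_DIR"):  # assumed: Ghidra needs its install dir
        tools.append("ghidra")
    tools.append("angr")  # assumed: angr is always enabled
    if _is_true(env.get("ENABLE_BAP", "")):
        tools.append("bap")
    if _is_true(env.get("ENABLE_LLVM", "")):
        tools.append("llvm")
    return tools
```

Passing the environment as a mapping instead of reading os.environ directly keeps the selection logic easy to test with a plain dictionary.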

Producing reports

Each run produces a CSV summary (results/summary.csv) and detailed logs.
Generate a human-readable report with:

python3 scripts/report_generator.py

The generated report(s) are placed under results/ (for example results/benchmark_report.md).
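
For ad hoc inspection of the CSV summary, a small aggregation like the one below may be enough. It is a sketch under stated assumptions: the column names tool, functions, basic_blocks, and ir_statements are hypothetical (check the header of results/summary.csv for the real ones), and totals_by_tool is not part of scripts/report_generator.py.

```python
import csv
from collections import defaultdict
from typing import Dict, TextIO

# Hypothetical column names; adjust to the actual CSV header.
COUNT_COLUMNS = ("functions", "basic_blocks", "ir_statements")


def totals_by_tool(handle: TextIO) -> Dict[str, Dict[str, int]]:
    """Sum the count columns per tool from an open CSV file."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: dict.fromkeys(COUNT_COLUMNS, 0)
    )
    for row in csv.DictReader(handle):
        for column in COUNT_COLUMNS:
            # An empty cell (for example, a failed lift) counts as zero.
            totals[row["tool"]][column] += int(row[column] or 0)
    return dict(totals)
```

A typical call would open the file with open("results/summary.csv", newline="") and pass the handle to totals_by_tool, which returns one dictionary of summed counts per tool.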

Development notes

Scripts are a mix of Bash and Python. Keep shell scripts POSIX-friendly where possible and prefer bash for complex logic.
The scripts/validation/ folder contains utilities for semantic differencing and validation between IRs.

Contributing

See CONTRIBUTING.md for contribution guidelines, test instructions, and the project workflow.

License

This project is open source under the MIT License - see the bundled LICENSE file.

If something in this README is incorrect (paths, script names, or behavior), please open an issue or submit a pull request with the fix.

