`recheck-benchmark`

This is a project that attempts to benchmark the different backends used by recheck. See Regex-Radar/regex-radar#24 for more info.

Setup

Findings

The benchmark is still incomplete not taking into account of startup times, warmup times and IPC overhead (especially for the worker thread). Additionally the worker is implemented as a pool, but has a fixed size of 1. Some early conclusions:

native is pretty fast, as expected
java is the fastest for a batch, slow for individual patterns, not sure why
wasm is very fast for individual regex checks, and still has a good performance for batches
scalajs is the slowest as expected
thread-worker has similar performance as scalajs, but the messaging overhead is noticable
thread-worker-wasm has a a similar performance to wasm, with IPC overhead being noticable if a lot of checks are done in a small window of time.
increasing the workerPoolSize to cpu count or availableParallelism() does not seem to give any benefits which is odd, since especially batches should see some work being done in paralell.

Some variations to benchmark:

~~worker pools with multiple workers~~, measure the memory/process spawning overhead as additional cost
write some benchmarking suites, with setup/teardowns to measure:
- the overhead of spawning processes
- the overhead of the IPC
- the performance of the individual
do some memory profiling, to check the memory overhead

As the aim of the benchmark is to figure out the performance gains vs the portability of the backend, wasm seems like a very good choice, if you ignore the experimental status of its build target. use thread-worker-wasm to keep the async requirement, with a small overhead for the IPC messaging being worth the price of being able to use async/await.

Visualisations

Attempts at visualising the initial results.

Latency - lower is better
Throughput - higher is better

Batch - Latency - Throughput

6 regexes in the dataset

Backend	Avg Latency	(ms)	Throughput	(ops/sec)
`java`	██	`97`	██████████	`10`
`native`	███	`135`	███████	`7`
`wasm`	██████	`276`	████	`4`
`scalajs`	████████████	`613`	██	`2`
`threadWorker`	█████████████	`628`	██	`2`

Per Pattern

Throughput (ops/sec)

Pattern	`native` (ops/sec)	`java` (ops/sec)	`wasm` (ops/sec)	`scalajs` (ops/sec)	`threadWorker` (ops/sec)
`(https?:\/\/(w{3}\.)?)+[a-zA-Z0-9\-]+\.[a-z]+(\/[^\s])`	`7`	`11`	`4`	`2`	`2`
`^(a+)+$`	`1232`	`39`	`1958`	`552`	`562`
`(a\|aa)+$`	`999`	`48`	`1227`	`416`	`324`
`(x+x+)+y`	`682`	`60`	`647`	`275`	`237`
`^([a-zA-Z0-9_.+-]+)+@(([a-zA-Z0-9-])+.)+[a-zA-Z0-9]{2,4}$`	`240`	`42`	`153`	`68`	`65`
`^([A-Z]:\\)?(\\[A-Za-z_\-\s0-9\.]+)+\\?$`	`486`	`46`	`327`	`155`	`143`

Run benchmark

# install dependencies
npm i
# run benchmark
npm run bench

Dataset

The regular expressions used to benchmark are found in ./data/redos-vulnerable.json.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
data		data
scripts		scripts
src		src
.gitignore		.gitignore
.prettierrc		.prettierrc
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
results.txt		results.txt
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

`recheck-benchmark`

Setup

Findings

Visualisations

Batch - Latency - Throughput

Per Pattern

Run benchmark

Dataset

About

Uh oh!

Releases

Packages

Languages

License

Regex-Radar/recheck-benchmark

Folders and files

Latest commit

History

Repository files navigation

recheck-benchmark

Setup

Findings

Visualisations

Batch - Latency - Throughput

Per Pattern

Run benchmark

Dataset

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

`recheck-benchmark`

Packages