We should implement a CI workflow that runs the benchmarks on each PR:

- Show the performance delta between the PR's changes and the main branch.
- Post the benchmark results directly on the PR as an automated comment.
- Ensure new features do not slow down the library.
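
As a rough illustration of the delta reporting described above, here is a minimal Python sketch. It assumes the benchmarks for each branch have already been run and reduced to a mapping of benchmark name to mean time in seconds. The function names (`find_regressions`, `format_benchmark_comment`), the 5% threshold, and the table layout are all hypothetical choices, not part of any existing tooling in this project.

```python
from typing import Dict, List


def percent_delta(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (positive = slower)."""
    return (new - base) / base * 100.0


def find_regressions(
    main: Dict[str, float], pr: Dict[str, float], threshold_pct: float = 5.0
) -> List[str]:
    """Names of benchmarks whose PR timing is slower than main by more than the threshold."""
    shared = sorted(main.keys() & pr.keys())
    return [
        name
        for name in shared
        if main[name] > 0 and percent_delta(main[name], pr[name]) > threshold_pct
    ]


def format_benchmark_comment(
    main: Dict[str, float], pr: Dict[str, float], threshold_pct: float = 5.0
) -> str:
    """Render a Markdown table comparing mean timings (seconds) on main and on the PR."""
    lines = [
        "| Benchmark | main (s) | PR (s) | Delta |",
        "|---|---|---|---|",
    ]
    for name in sorted(main.keys() & pr.keys()):
        base, new = main[name], pr[name]
        if base <= 0:
            # A non-positive baseline has no meaningful relative delta; skip it.
            continue
        delta = percent_delta(base, new)
        flag = " (regression)" if delta > threshold_pct else ""
        lines.append(f"| {name} | {base:.4f} | {new:.4f} | {delta:+.1f}%{flag} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up timings, purely to show the output shape.
    main_results = {"parse": 1.0, "render": 2.0}
    pr_results = {"parse": 1.2, "render": 1.9}
    print(format_benchmark_comment(main_results, pr_results))
    print("Regressions:", find_regressions(main_results, pr_results))
```

In a CI job, the string returned by `format_benchmark_comment` could be used as the body of the automated PR comment, and a non-empty result from `find_regressions` could be used to fail the check. Because benchmark timings on shared CI runners tend to be noisy, the threshold would likely need tuning.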