
Adding micro-benchmarks to XDG #66

Draft

Waqar-ukaea wants to merge 5 commits into xdg-org:main from Waqar-ukaea:micro-benchmarking

Conversation

@Waqar-ukaea (Collaborator) commented Feb 24, 2025

I think it would be a good idea if XDG also implemented micro-benchmarks alongside its unit tests for the various ray tracing operations, so that we have a record of the performance of these operations as code changes are made.

This PR aims to add some micro-benchmarks for each of the ray tracing operations currently available in XDG. Catch2 supports benchmarking natively, so I was thinking we could just use the Catch2 macros rather than bringing in another library. Details of how to implement benchmarks with Catch2 can be found here: https://github.com/catchorg/Catch2/blob/devel/docs/benchmarks.md

If we want more advanced control over benchmarking, though, something like https://github.com/google/benchmark may be more suitable.

Some other considerations:

  • We could just add some benchmarks to the existing test cases, but it might make more sense to define some simple geometries/problems explicitly for benchmarking purposes.
  • Each additional benchmark will of course increase the time it takes for the CI to complete, so we will have to think about the length of these benchmarks and how many we actually need.
  • How these benchmarks interact with the CI will also need to be considered. Should large performance drops block merging, for example?
  • Perhaps these benchmarks could also be extended to some of the mesh operations.
  • By default, CTest won't display benchmark results unless called with --verbose, since a benchmark doesn't count as a failing test.
  • Benchmarking results could be logged to a separate file and used for analysis within the workflow, but they could also be stored as GitHub artifacts to allow some persistent history. (It probably makes sense for an artifact-based workflow to only be triggered on merges into main.)

@Waqar-ukaea (Collaborator, Author) commented Feb 25, 2025

A minor thing, but since we call the tests in parallel with 4 processes the outputs aren't kept together, so I changed the GitHub Actions workflow to call ctest serially for now. A proper fix could be to define the benchmarks as a separate test case entirely, which should (hopefully) ensure they stay together.

I've added a few simple tests which benchmark some of the setups in test_ray_fire.cpp and point_in_vol.cpp. In my initial testing, the benchmark results appear to be pretty consistent between runs of the CI workflow, as shown below.

First run in CI:

6: benchmark name                       samples       iterations    est run time
6:                                      mean          low mean      high mean
6:                                      std dev       low std dev   high std dev
6: -------------------------------------------------------------------------------
6: ray_fire_from_inside                           100           144      3.024 ms 
6:                                         209.832 ns    208.518 ns    212.355 ns 
6:                                         8.86016 ns    4.75811 ns    13.3745 ns 
6:                                                                                
6: ray_fire_from_outside_vol [Exiting]            100            78     3.0342 ms 
6:                                         390.932 ns    387.763 ns    398.342 ns 
6:                                         23.4068 ns    12.2809 ns    43.8629 ns 
6:                                                                                
6: ray_fire_from_outside_vol                                                      
6: [Entering]                                     100            78     3.0264 ms 
6:                                         389.687 ns    387.093 ns    394.681 ns 
6:                                         17.5623 ns    9.28274 ns     26.516 ns 

7: point_in_vol [Inside]                          100           145      3.016 ms 
7:                                         213.492 ns    210.505 ns    219.257 ns 
7:                                         20.4051 ns    12.3731 ns    31.5559 ns 
7:                                                                                
7: point_in_vol [Outside]                         100          1006      3.018 ms 
7:                                         30.0012 ns    29.7964 ns    30.5066 ns 
7:                                         1.53132 ns   0.717069 ns     2.8829 ns 

Second run in CI:

6: benchmark name                       samples       iterations    est run time
6:                                      mean          low mean      high mean
6:                                      std dev       low std dev   high std dev
6: -------------------------------------------------------------------------------
6: ray_fire_from_inside                           100           142     3.0246 ms 
6:                                         213.868 ns    212.447 ns    216.602 ns 
6:                                         9.48367 ns    4.99679 ns    14.2413 ns 
6:                                                                                
6: ray_fire_from_outside_vol [Exiting]            100            78      3.042 ms 
6:                                         390.466 ns    388.054 ns    395.235 ns 
6:                                           16.54 ns    9.11818 ns    25.6529 ns 
6:                                                                                
6: ray_fire_from_outside_vol                                                      
6: [Entering]                                     100            78     3.0342 ms 
6:                                         391.085 ns     388.41 ns    396.642 ns 
6:                                         18.7117 ns     10.715 ns    31.2428 ns 

7: point_in_vol [Inside]                          100           145      3.016 ms 
7:                                         207.004 ns    205.707 ns    209.469 ns 
7:                                         8.81287 ns    4.95491 ns    13.2013 ns 
7:                                                                                
7: point_in_vol [Outside]                         100          1002      3.006 ms 
7:                                         29.9244 ns    29.7573 ns    30.2778 ns 
7:                                         1.18804 ns   0.695017 ns    1.87094 ns 

Next steps:

  • Add in some more simple benchmarks for the other ray tracing operations available in XDG.
  • Look into some more advanced benchmarking techniques with Catch2.
  • Potentially define explicit benchmark problems as separate test cases. Maybe a larger benchmark problem would be better suited than the minimal cases used for unit testing.
  • Modify the GitHub Actions workflow for the CI to store benchmark results as an artifact for later use/analysis.
  • Automatically publish the results of the micro benchmarks in a similar manner to the end-to-end OpenMC benchmarks?

@pshriwise (Collaborator) commented

This is nice @Waqar-ukaea! Thanks for exploring this.

One thought: CI machines can change and I don't think we're guaranteed a specific machine for a given run, making performance comparisons tricky. I really like the idea of building these benchmarks into the test suite, but we may want to limit comparisons to runs on a consistent machine. I'm looking into how we might connect to one, but it could take some time to ensure we're doing it securely (it's behind a firewall).

@Waqar-ukaea (Collaborator, Author) commented

Sounds good @pshriwise. In that case I can carry on with what I had in mind for this PR and we can keep it unmerged until we figure out what we're doing with the CI machine.
