@jeffhammond

Some of us think C++ standard parallelism is going to be important in HPC.

Supporting it in QS is very easy. The performance looks comparable to OpenMP, although I have only tested with Clang 12 on an Ampere Altra Q80.
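As a rough illustration of the idea, here is a minimal sketch of a loop written with a C++17 parallel algorithm instead of an OpenMP pragma. The `Particle` struct and the `advance_particles` function are hypothetical and are not taken from the QS source; they only show the general shape such a loop might take. Some toolchains need an extra backend for the parallel policies (libstdc++, for example, relies on TBB).

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical stand-in for a per-particle record; not taken from QS.
struct Particle {
    double position;
    double velocity;
};

// Advance every particle by one time step of length dt.
// The std::execution::par policy permits the library to run the loop body
// on multiple threads, playing a role similar to "#pragma omp parallel for".
inline void advance_particles(std::vector<Particle>& particles, double dt) {
    assert(dt >= 0.0);  // illustrative precondition only
    std::for_each(std::execution::par, particles.begin(), particles.end(),
                  [dt](Particle& p) { p.position += p.velocity * dt; });
}
```

Each iteration touches only its own element, so no synchronization is needed here. Loops that accumulate into shared tallies would still need atomics or a reduction.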

The GPU version of this will come later. There are some nuances there that I haven't resolved.

This PR includes #43.

Jeff Hammond and others added 10 commits February 4, 2022 00:42
Signed-off-by: Jeff Hammond <jeff.r.hammond@intel.com>
New version uses functions not macros.

The use of template functions allows for enforcement of type-safety,
which is implemented using static_assert.

The old implementation is preserved for posterity.

A header guard was added.

I found the old macro names confusing, so I used new names, but I map
the old names in the source onto them so the application source does not
change.
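
A minimal sketch of the pattern described above, assuming an atomic-add operation as the example. The names `EXAMPLE_ATOMIC_HH`, `atomic_add`, and `LEGACY_ATOMIC_ADD` are hypothetical and are not the names used in this commit. The compare-exchange loop is just one way to get an atomic update and is not necessarily what the real header does.

```cpp
#ifndef EXAMPLE_ATOMIC_HH  // header guard (hypothetical name)
#define EXAMPLE_ATOMIC_HH

#include <atomic>
#include <type_traits>

// Function template replacing a macro: the compiler can now check types.
template <typename T, typename U>
inline void atomic_add(std::atomic<T>& target, U value) {
    // Reject mixed-type updates at compile time instead of converting silently.
    static_assert(std::is_arithmetic<T>::value,
                  "atomic_add requires an arithmetic type");
    static_assert(std::is_same<T, U>::value,
                  "atomic_add: target and value must have the same type");
    // A compare-exchange loop works for integers and floating point alike.
    T expected = target.load(std::memory_order_relaxed);
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_relaxed)) {
        // On failure, expected is refreshed with the current value; retry.
    }
}

// Map a legacy macro name (hypothetical) onto the new function so that
// existing call sites compile unchanged.
#define LEGACY_ATOMIC_ADD(x, v) atomic_add((x), (v))

#endif  // EXAMPLE_ATOMIC_HH
```

With this shape, a call such as `atomic_add(counter, 1.0)` on a `std::atomic<long>` fails to compile with a readable message, whereas a macro would have converted the value silently.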

Signed-off-by: Jeff Hammond <jeff.r.hammond@intel.com>
Signed-off-by: Jeff Hammond <jehammond@nvidia.com>
Signed-off-by: Jeff Hammond <jehammond@nvidia.com>
Signed-off-by: Jeff Hammond <jehammond@nvidia.com>
Signed-off-by: Jeff Hammond <jehammond@nvidia.com>
Signed-off-by: Jeff Hammond <jehammond@nvidia.com>