@devreal devreal commented Nov 6, 2023

Trial PR for feedback from @bosilca. Probably needs some more cleanup but feedback on the design is appreciated. Commits will be squashed later.

Currently only implements offloading for allreduce algorithms. It's missing rooted reduce.

related to spack/spack#40725

Signed-off-by: Howard Pritchard <howardp@lanl.gov>

github-actions bot commented Nov 6, 2023

Hello! The Git Commit Checker CI bot found a few problems with this PR:

ef8b526: ROCM: add missing FUNC_FUNC_FN macro

  • check_signed_off: does not contain a valid Signed-off-by line

6fd216f: accelerator/rocm: regular memory behaves like unif...

  • check_signed_off: does not contain a valid Signed-off-by line

955849b: Device op: pass device to lower-level op to avoid ...

  • check_signed_off: does not contain a valid Signed-off-by line

3afec6b: Draft of ompi_op_select_device

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@devreal devreal requested a review from bosilca November 6, 2023 00:39
…0_dlopen

spack:fix for dlopen missing symbol problem
Joseph Schuchart and others added 21 commits November 7, 2023 18:09
Signed-off-by: Joseph Schuchart <jschuchart@leconte.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <jschuchart@xsdk.icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
If the target process is unable to execute an RDMA operation, it
instructs the origin to change the communication protocol. When this
happens, the origin must be informed to cancel all pending RDMA operations
and release the rdma_frag.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
…or allreduce recursive doubling

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
The accelerator component may report the availability of a single accelerator
whose ID is not zero.

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
…_SUPPORT

These macros are defined to either 1 or 0

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
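
For macros that are always defined to either 1 or 0, the distinction between `#ifdef` and `#if` matters. The following is a minimal sketch of that pitfall, not code from this PR; the names `DEMO_CUDA_SUPPORT` and `DEMO_ROCM_SUPPORT` are hypothetical stand-ins, since the actual macro names are truncated in the commit title.

```c
#include <stdbool.h>

/* Hypothetical feature macros (assumption, for illustration only).
 * A configure step would define each of them to exactly 1 or 0. */
#define DEMO_CUDA_SUPPORT 0
#define DEMO_ROCM_SUPPORT 1

/* Pitfall: #ifdef only tests whether the name is defined, so this
 * returns true even though DEMO_CUDA_SUPPORT is 0. */
bool demo_cuda_enabled_ifdef(void)
{
#ifdef DEMO_CUDA_SUPPORT
    return true;
#else
    return false;
#endif
}

/* #if tests the value of the macro, which is what is wanted here. */
bool demo_cuda_enabled_if(void)
{
#if DEMO_CUDA_SUPPORT
    return true;
#else
    return false;
#endif
}

/* Same value test for a macro that is defined to 1. */
bool demo_rocm_enabled_if(void)
{
#if DEMO_ROCM_SUPPORT
    return true;
#else
    return false;
#endif
}
```

With these definitions, the `#ifdef` variant reports CUDA support as enabled although the macro is 0, while the `#if` variants follow the macro values.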
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
…evices

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
We know where source and target buffers are located, so pass the right
transfer direction to the accelerator memcpy call.

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
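
The idea of deriving the transfer direction from the known buffer locations can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the enum `demo_xfer_dir_t` and the function `demo_pick_direction` are hypothetical names and do not correspond to the accelerator framework's actual types.

```c
#include <stdbool.h>

/* Hypothetical direction type (assumption): one value per combination
 * of host and device for source and destination. */
typedef enum {
    DEMO_XFER_HTOH, /* host to host */
    DEMO_XFER_HTOD, /* host to device */
    DEMO_XFER_DTOH, /* device to host */
    DEMO_XFER_DTOD  /* device to device */
} demo_xfer_dir_t;

/* Choose the copy direction from where the source and destination
 * buffers live, so the copy routine does not have to query the
 * location of each pointer itself. */
demo_xfer_dir_t demo_pick_direction(bool src_on_device, bool dst_on_device)
{
    if (src_on_device) {
        return dst_on_device ? DEMO_XFER_DTOD : DEMO_XFER_DTOH;
    }
    return dst_on_device ? DEMO_XFER_HTOD : DEMO_XFER_HTOH;
}
```

A caller that already knows both locations would compute the direction once and hand it to the copy call, rather than passing an unspecified direction and leaving the detection to the lower layer.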
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>