Solo: a shared memory collective module #30
thananon
left a comment
I guess this is not an attempt to get this upstream, right? It seems some corner cases are still unhandled.
ompi/mca/coll/solo/coll_solo.h
Outdated
    bool enabled;

    /**
     * osc alrogithms attach memory blocks to this bynamic window and use it to perform one-sided
typos (alrogithms, bynamic)
        &(static_size[i]),
        &(static_disp[i]),
        &(solo_module->ctrl_bufs[i]));
    solo_module->data_bufs[i] = (char *) (solo_module->ctrl_bufs[i]) + 4 * opal_cache_line_size;
If the data_bufs are always a constant offset (in this instance 4 cache lines) from the ctrl_bufs, then instead of storing them you can compute them as needed.
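To illustrate the suggestion, here is a minimal sketch of computing the data pointer on demand. It assumes the offset really is fixed at 4 cache lines, as in the quoted diff. The helper `solo_data_buf`, the macro `SOLO_CTRL_CACHE_LINES`, and the local `cache_line_size` constant are hypothetical names used only for this example; in the module itself the value would come from `opal_cache_line_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for opal_cache_line_size, which Open MPI
 * determines at run time. 64 bytes is only an assumed value. */
static const size_t cache_line_size = 64;

/* Cache lines reserved for control data ahead of the payload
 * (4 in the quoted diff). Hypothetical name. */
#define SOLO_CTRL_CACHE_LINES 4

/* Derive the data buffer address from the control buffer address,
 * so no separate data_bufs array has to be stored. */
static inline char *solo_data_buf(void *ctrl_buf)
{
    assert(ctrl_buf != NULL);
    return (char *) ctrl_buf + SOLO_CTRL_CACHE_LINES * cache_line_size;
}
```

With something like this, a use of `solo_module->data_bufs[i]` could become `solo_data_buf(solo_module->ctrl_bufs[i])`, which trades one stored pointer per rank for a single addition at each access.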
…locate l_seg_count instead of count)
Known problems:
1. bcast: has a bug if a datatype is from MPI_BOTTOM
2. reduce / allreduce: fix or remove the non-contiguous support.
I declined this in favor of #38.
- Add support for fallback to previous coll module on non-commutative operations (#30)
- Replace mutexes by atomic operations.
- Use the correct nbc request type (for both ibcast and ireduce)
  * coll/base: document type casts in ompi_coll_base_retain_*
- add module-wide topology cache
- use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
- reduce number of memory allocations
- call the default request completion.
- Remove the requests from the Fortran lookup conversion tables before completing and free it.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>