Details:
- Adds a configuration flag `--harden-barriers` (disabled by default).
- When enabled, each thread records (a) the sense variable it currently observes in the barrier object (in this mode the sense variable is incremented rather than XOR'ed between 1 and 0, to prevent ABA problems), and (b) the source location of the call to `bli_thrinfo_barrier` or `bli_thrinfo_bcast`, stored as the address of a string literal. If any thread in a team records different information from its peers, a diagnostic is printed and the program aborts.
- This information requires an additional dynamically-allocated array and some extra reads/writes during the barrier process. I haven't measured it, but the performance impact should be small, and the feature is opt-in.
- This should detect errors such as conditionally-taken barriers within a thread team, use of the incorrect thread info object, threads escaping barriers early, etc.

Limitations:
- Both calls to `bli_thrcomm_barrier` within `bli_thrinfo_bcast` receive the same source line information. However, the check on the sense variable should still catch any problems.
- Certain problems (such as a missing broadcast) may still manifest as illegal memory accesses or memory corruption before the problem can be detected in a later barrier.
- Not implemented for tree barriers yet. I would prefer to refactor the tree and non-tree barriers as a unified implementation first.
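For readers unfamiliar with the technique, the following is a minimal, standalone sketch of a hardened sense-counting barrier in C11. It is not the code from this PR: every name in it (`hb_barrier_t`, `hb_record_t`, `hb_barrier_at`, `HB_BARRIER`, and so on) is hypothetical, and it assumes a simple centralized (non-tree) barrier in which the last thread to arrive performs the consistency check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-thread record; not the layout used by the PR. */
typedef struct {
    unsigned    sense; /* sense value the thread observed on entry      */
    const char *site;  /* address of a string literal naming the caller */
} hb_record_t;

typedef struct {
    atomic_uint  sense;     /* incremented every round, never toggled 0/1 */
    atomic_uint  arrived;   /* threads that have reached this round       */
    unsigned     n_threads;
    hb_record_t *records;   /* the extra dynamically allocated array      */
} hb_barrier_t;

static bool hb_init(hb_barrier_t *b, unsigned n_threads)
{
    atomic_init(&b->sense, 0);
    atomic_init(&b->arrived, 0);
    b->n_threads = n_threads;
    b->records   = calloc(n_threads, sizeof *b->records);
    return b->records != NULL;
}

static void hb_free(hb_barrier_t *b)
{
    free(b->records);
    b->records = NULL;
}

/* Index of the first record that differs from records[0], or -1 if all
   agree. Call sites are compared by address, not by string contents. */
static int hb_first_mismatch(const hb_record_t *r, unsigned n)
{
    for (unsigned i = 1; i < n; ++i)
        if (r[i].sense != r[0].sense || r[i].site != r[0].site)
            return (int)i;
    return -1;
}

static void hb_barrier_at(hb_barrier_t *b, unsigned tid, const char *site)
{
    assert(tid < b->n_threads);
    unsigned my_sense = atomic_load(&b->sense);

    /* Publish what this thread believes about the barrier it is entering. */
    b->records[tid].sense = my_sense;
    b->records[tid].site  = site;

    if (atomic_fetch_add(&b->arrived, 1) + 1 == b->n_threads) {
        /* Last arrival: every record is visible now, so compare them. */
        int bad = hb_first_mismatch(b->records, b->n_threads);
        if (bad >= 0) {
            fprintf(stderr,
                    "barrier mismatch: thread 0 at %s (sense %u), "
                    "thread %d at %s (sense %u)\n",
                    b->records[0].site, b->records[0].sense,
                    bad, b->records[bad].site, b->records[bad].sense);
            abort();
        }
        atomic_store(&b->arrived, 0);   /* reset before releasing waiters */
        atomic_fetch_add(&b->sense, 1); /* release waiters                */
    } else {
        while (atomic_load(&b->sense) == my_sense)
            ; /* spin until the last arrival advances the sense */
    }
}

/* Capture the call site as a string literal such as "file.c:123". */
#define HB_STR2(x) #x
#define HB_STR(x)  HB_STR2(x)
#define HB_BARRIER(b, tid) \
    hb_barrier_at((b), (tid), __FILE__ ":" HB_STR(__LINE__))
```

In this sketch each thread of a team would call `HB_BARRIER(&barrier, tid)` with its own `tid`. Because the sense is a counter, a thread that is two rounds behind records, say, 4 while its peers record 6. With a 0/1 toggle both values would read as 0 and the mismatch would go unnoticed.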