-
Notifications
You must be signed in to change notification settings - Fork 11
PytatoPyOpenCLArrayContext: add support for kernel profiling #311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5f2d45d to
1ce6dce
Compare
1ce6dce to
2c2e431
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
arraycontext/impl/pytato/compile.py:643
- Add tests that verify kernel profiling support works as expected when profile_kernels is enabled.
if self.actx.profile_kernels:
arraycontext/impl/pytato/init.py:299
- Add tests for the new profiling methods (e.g., _wait_and_transfer_profile_events, get_profiling_data_for_kernel, reset_profiling_data_for_kernel, tabulate_profiling_data) to ensure that profiling data is collected and summarized correctly.
self.profile_kernels = profile_kernels
alexfikl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got excited about the profiling so I skimmed a bit and left some comments, hope that's ok!
arraycontext/impl/pytato/__init__.py
Outdated
| """Reset profiling data for kernel `kernel_name`.""" | ||
| self.profile_results.pop(kernel_name, None) | ||
|
|
||
| def tabulate_profiling_data(self) -> pytools.Table: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks more like a helper function than a class method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno - this is currently the main way of "getting" the complete profiling data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the main way of getting it actx.profile_results and/or actx.get_profile_for_kernel? This is just a handy way to display it (probably one of many).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it's just a convenience function for the user. We have the same method for eager profiling: https://github.com/illinois-ceesd/mirgecom/blob/41c6e567d44258557d95ef8453c3b82c4c89ab55/mirgecom/profiling.py#L216
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @alexfikl that this should be a function, not a method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you guys think of 1123c16?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Co-authored-by: Alexandru Fikl <alexfikl@gmail.com>
arraycontext/impl/pytato/__init__.py
Outdated
| """Reset profiling data for kernel `kernel_name`.""" | ||
| self.profile_results.pop(kernel_name, None) | ||
|
|
||
| def tabulate_profiling_data(self) -> pytools.Table: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @alexfikl that this should be a function, not a method.
arraycontext/impl/pytato/__init__.py
Outdated
| """Holds a profile event that has not been collected by the profiler yet.""" | ||
|
|
||
| cl_event: cl._cl.Event | ||
| translation_unit: Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want the translation_unit in here? I'm not sold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to just the name of the translation unit in 7216ccf
arraycontext/impl/pytato/__init__.py
Outdated
| allocator=self.allocator, | ||
| **bound_arguments) | ||
|
|
||
| self.add_profiling_event(evt, pt_prg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only call the function if profiling is enabled, to avoid incurring the function call penalty. (Also below.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 7216ccf
arraycontext/impl/pytato/__init__.py
Outdated
| allocator=self.allocator, | ||
| **bound_arguments) | ||
|
|
||
| self.add_profiling_event(evt, pt_prg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is inaccurate: one loopy TranslationUnit will enqueue multiple kernels, so you'll need to either
- collect multiple events, or
- enqueue a marker before and after.
(Also below.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented the "marker" strategy in 7216ccf
99cc47a to
55537cc
Compare
7f24976 to
1123c16
Compare
1fbf1ad to
03dfa33
Compare
arraycontext/impl/pytato/__init__.py
Outdated
| def get_profiling_data_for_kernel(self, kernel_name: str) \ | ||
| -> MultiCallKernelProfile: | ||
| """Return profiling data for kernel *kernel_name*.""" | ||
| self._wait_and_transfer_profile_events() | ||
|
|
||
| time = 0 | ||
| num_calls = 0 | ||
|
|
||
| if kernel_name in self.profile_results: | ||
| knl_results = self.profile_results[kernel_name] | ||
|
|
||
| num_calls = len(knl_results) | ||
|
|
||
| for r in knl_results: | ||
| time += r | ||
|
|
||
| return MultiCallKernelProfile(num_calls, time) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'd like from the user interface here is just minimal access to the raw data and a reset primitive, without aggregation, and without exposing the before-wait/after-wait distinction. Everything else belongs in external utility routines, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to minimize the API in 6bdb85b. What do you think?
arraycontext/impl/pytato/utils.py
Outdated
| # }}} | ||
|
|
||
|
|
||
| def tabulate_profiling_data(profile_results: dict[str, MultiCallKernelProfile])\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add to docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the (so far only) interface to the doc in 6bdb85b.
arraycontext/impl/pytato/__init__.py
Outdated
| time += r | ||
|
|
||
| return MultiCallKernelProfile(num_calls, time) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to add the final interface to the class documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the (so far only) interface to the doc in 6bdb85b.
Co-authored-by: Andreas Klöckner <inform@tiker.net>
6a878e2 to
94d9d16
Compare
e986f53 to
ea691a4
Compare
ea691a4 to
fedb836
Compare
|
Thx! |
TODOs:
Please squash