@Alessandro624
Owner

Description

This PR improves the robustness of the metrics parsing pipeline and introduces a new CUDA stencil module, extending the set of GPU computation patterns covered by the project.

Key changes

  • Enhanced CSV parsing in parse_metrics.py, improving kernel name extraction and overall robustness when processing profiling outputs.

  • Updated histogram and occupancy handling in the metrics analysis workflow.

  • Added a new stencil/ module including:

    • CUDA stencil kernels implementing the core computation.
    • Makefile for standardized builds.
    • run.sh for execution.
    • profile_nvprof.sh for profiling support.
    • README documenting usage and performance analysis steps.
  • Updated .gitignore to ignore artifacts generated by the stencil module.

  • Updated all README.md files to ensure consistency and reflect the latest project structure and tooling.
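
For reviewers who want a feel for the kernel name extraction mentioned above, here is a minimal sketch of the kind of cleanup involved. It is not the code in parse_metrics.py: the function name `extract_kernel_names` is hypothetical, the sample text is made up, and it assumes nvprof-style CSV output with `==PID==` banner lines, a `Name` column, and a units row directly under the header.

```python
import csv
import io
import re

# Made-up sample in the general shape of nvprof CSV output (an assumption,
# not data from this project).
SAMPLE = '''==1234== NVPROF is profiling process 1234, command: ./stencil
==1234== Profiling result:
"Type","Time(%)","Time","Calls","Avg","Min","Max","Name"
,%,ms,,ms,ms,ms,
"GPU activities",61.5,1.20,1,1.20,1.20,1.20,"stencil_1d(float const *, float*, int)"
"GPU activities",30.1,0.59,1,0.59,0.59,0.59,"void histogram<int>(int const *, int*, int)"
"GPU activities",8.4,0.16,2,0.08,0.08,0.08,"[CUDA memcpy HtoD]"
'''


def extract_kernel_names(text):
    """Return bare kernel names from profiler CSV text (hypothetical helper)."""
    # Drop profiler banner lines ("==1234== ...") and blank lines.
    rows = [ln for ln in text.splitlines()
            if ln.strip() and not ln.startswith("==")]
    reader = csv.reader(io.StringIO("\n".join(rows)))
    header = next(reader, None)
    if header is None or "Name" not in header:
        return []
    name_idx = header.index("Name")

    names = []
    for row in reader:
        # Skip the units row and malformed rows (missing or empty name).
        if len(row) <= name_idx or not row[name_idx].strip():
            continue
        raw = row[name_idx].strip()
        # Entries such as "[CUDA memcpy HtoD]" are not kernels.
        if raw.startswith("["):
            continue
        # Cut the parameter list and template arguments, then drop a
        # leading return type such as "void".
        base = re.split(r"[(<]", raw, maxsplit=1)[0].strip()
        names.append(base.split()[-1] if base else raw)
    return names


print(extract_kernel_names(SAMPLE))
```

Using the `csv` module rather than splitting on commas matters here because kernel signatures contain commas inside the quoted `Name` field.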

Impact

The improved metrics parser increases the reliability of performance analysis across all CUDA examples. The stencil workload adds another memory-bound computation pattern, commonly used in scientific computing, enabling deeper exploration of cache behavior, occupancy, and bandwidth limitations.
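
To make the memory-bound claim concrete, here is a rough sketch in plain Python. It is a reference model, not the CUDA kernels added in stencil/: it assumes a 1D averaging stencil of radius 1 on 4-byte elements, and the names `stencil_1d` and `arithmetic_intensity` are hypothetical.

```python
def stencil_1d(src, radius=1):
    """Average each interior point with its neighbours (assumed stencil shape)."""
    n = len(src)
    dst = list(src)  # boundary points are copied through unchanged
    for i in range(radius, n - radius):
        window = src[i - radius:i + radius + 1]
        dst[i] = sum(window) / (2 * radius + 1)
    return dst


def arithmetic_intensity(radius=1, bytes_per_elem=4):
    """FLOPs per byte of memory traffic, assuming perfect neighbour reuse."""
    flops = 2 * radius + 1            # 2r additions plus one division per point
    bytes_moved = 2 * bytes_per_elem  # one read and one write per point
    return flops / bytes_moved


print(stencil_1d([0, 3, 6, 9, 12]))
print(arithmetic_intensity())
```

Under these assumptions the stencil performs well under one floating-point operation per byte moved, which is typically far below what a GPU needs to keep its arithmetic units busy, so throughput is expected to track memory bandwidth and cache reuse rather than compute.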

@Alessandro624 self-assigned this Dec 30, 2025
@Alessandro624 added the documentation and enhancement labels Dec 30, 2025
@Alessandro624 merged commit 172b921 into dev Dec 30, 2025
1 check passed
@Alessandro624 deleted the stencil branch December 30, 2025 17:20