@Alessandro624
Owner

Description

This PR introduces a new CUDA convolution module, providing a practical example of convolution kernels along with a complete build, execution, and profiling workflow.

Key changes

  • Added a new convolution/ module including:

    • CUDA convolution kernels implementing the core computation.
    • Makefile to standardize compilation.
    • run.sh for straightforward execution.
    • profile_nvprof.sh to collect GPU performance metrics.
    • README documenting usage, configuration, and profiling steps.

Impact

The convolution example expands the set of GPU workloads available in the repository, complementing matrix multiplication with a computation pattern widely used in signal processing and deep learning. It provides a solid baseline for performance analysis and kernel optimization experiments.
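
For readers unfamiliar with the pattern, a naive kernel of the kind such a baseline often starts from can be sketched as follows. This is an illustration only, not the code added in this PR: the names `conv1d_valid_kernel`, `conv1d_valid_host`, and `conv1d_valid_gpu`, the 1D "valid" (no padding, no filter flip) formulation, and the 256-thread block size are all assumptions made for the example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical naive kernel: one thread computes one output element.
// "Valid" mode without filter flip: out[i] = sum_j in[i + j] * filt[j].
__global__ void conv1d_valid_kernel(const float* in, const float* filt,
                                    float* out, int n_out, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_out) return;  // guard threads past the end of the output
    float acc = 0.0f;
    for (int j = 0; j < k; ++j) {
        acc += in[i + j] * filt[j];
    }
    out[i] = acc;
}

// CPU reference with the same definition, useful for checking GPU results.
std::vector<float> conv1d_valid_host(const std::vector<float>& in,
                                     const std::vector<float>& filt)
{
    int n = static_cast<int>(in.size());
    int k = static_cast<int>(filt.size());
    int n_out = n - k + 1;
    if (k == 0 || n_out <= 0) return {};
    std::vector<float> out(n_out, 0.0f);
    for (int i = 0; i < n_out; ++i)
        for (int j = 0; j < k; ++j)
            out[i] += in[i + j] * filt[j];
    return out;
}

// Host wrapper: copy inputs to the device, launch the kernel, copy back.
std::vector<float> conv1d_valid_gpu(const std::vector<float>& in,
                                    const std::vector<float>& filt)
{
    int n = static_cast<int>(in.size());
    int k = static_cast<int>(filt.size());
    int n_out = n - k + 1;
    if (k == 0 || n_out <= 0) return {};

    float *d_in = nullptr, *d_filt = nullptr, *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, n * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_filt, k * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_out, n_out * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_in, in.data(), n * sizeof(float),
                          cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_filt, filt.data(), k * sizeof(float),
                          cudaMemcpyHostToDevice));

    const int threads = 256;  // assumed block size, not tuned
    const int blocks = (n_out + threads - 1) / threads;
    conv1d_valid_kernel<<<blocks, threads>>>(d_in, d_filt, d_out, n_out, k);
    CUDA_CHECK(cudaGetLastError());

    std::vector<float> out(n_out);
    // A blocking device-to-host copy also waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(out.data(), d_out, n_out * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_filt));
    CUDA_CHECK(cudaFree(d_out));
    return out;
}
```

Calling `conv1d_valid_gpu` and `conv1d_valid_host` on the same input and comparing the results is a simple correctness check. Every thread in this version rereads the filter and overlapping input elements from global memory, which is the kind of inefficiency that kernel optimization experiments (shared-memory tiling, constant memory for the filter) would target.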

@Alessandro624 self-assigned this on Dec 30, 2025
@Alessandro624 added the documentation and enhancement labels on Dec 30, 2025
@Alessandro624 merged commit 29f5f5c into dev on Dec 30, 2025
1 check passed
@Alessandro624 deleted the convolution branch on December 30, 2025 15:48