@Qubitol Qubitol commented Nov 11, 2025

Apparently, ARM was working only on Mac (where uname -p gives arm) and not on most Linux distributions (where uname -p gives aarch64).
I fixed that: CUDACPP can now run on ARM, including support for NEON (the equivalent of cppsse4, selected with the same backend name).
The makefiles do not yet support higher vectorisation levels, but this is something we could consider adding in the near future.

I added tests on Linux arm as well.
I needed to regenerate the processes to update the makefiles.

Qubitol commented Nov 13, 2025

See also ARM-related PRs #421 and #425.

@valassi fixed the Googletest bug and suggested the flags that ensure a build without ARM SIMD.

I properly implemented __ARM_NEON__ so that it is treated on the same footing as __SSE4_2__, and it is now used everywhere we handle ARM.

Qubitol commented Nov 13, 2025

I needed to cancel some jobs because they hang when running runTest.exe.
It is still unclear why this happens only on Linux ARM, while it works on Mac.
Note that the code compiles; the hang occurs only at execution.

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Nov 14, 2025
…ests on aarch64 (with DanieleM)

This fixes a hang in the testMisc tests on aarch64 in sqrtNewtonRaphson (madgraph5#1064)
(testMisc -> constexpr_tan -> constexpr_tan_quad -> constexpr_cos_quad -> constexpr_sqrt -> sqrtNewtonRaphson)

It uses the same workaround previously adopted for avoiding testMisc hangs when running valgrind (madgraph5#906)
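
For readers unfamiliar with this kind of routine, here is a minimal sketch of a compile-time Newton-Raphson square root whose iteration count is capped, so that it always terminates. It is illustrative only: the name sqrtBounded, the cap of 64 iterations and the starting guess are assumptions, and this is neither the project's sqrtNewtonRaphson nor the workaround referenced above. The thread does not state the root cause of the hang; the sketch only shows one generic way to guarantee termination when the convergence test relies on exact floating-point equality.

```cpp
// Illustrative sketch (hypothetical names, not the madgraph4gpu implementation).
// Requires C++14 or later because of the loop inside a constexpr function.
constexpr double sqrtBounded( double x, int maxIter = 64 )
{
  if( x <= 0 ) return 0; // sketch only: non-positive input is mapped to 0
  double curr = x > 1 ? x : 1; // starting guess at or above sqrt(x)
  for( int i = 0; i < maxIter; ++i ) // hard cap: the loop cannot run forever
  {
    const double next = 0.5 * ( curr + x / curr ); // Newton-Raphson update
    if( next == curr ) break; // stop early once the iterate no longer changes
    curr = next;
  }
  return curr;
}

// Compile-time check with a tolerance instead of exact equality
static_assert( sqrtBounded( 4. ) - 2. < 1e-12 && 2. - sqrtBounded( 4. ) < 1e-12,
               "sqrtBounded(4) should be close to 2" );
```

The cap trades a guaranteed exit for a possibly less precise result if the limit is reached; with the quadratic convergence of Newton-Raphson, 64 steps are far more than double precision needs for typical inputs.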
valassi and others added 4 commits November 14, 2025 13:27
…ts on aarch64 (with DanieleM)

This fixes a hang in the testMisc tests on aarch64 in sqrtNewtonRaphson (madgraph5#1064)
(testMisc -> constexpr_tan -> constexpr_tan_quad -> constexpr_cos_quad -> constexpr_sqrt -> sqrtNewtonRaphson)

It uses the same workaround previously adopted for avoiding testMisc hangs when running valgrind (madgraph5#906)
…ith DanieleM)

Remove the custom __ARM_NEON__ with two extra underscores
Use 'g++ -march=armv8.2-a+simd -E -dM - < /dev/null | grep ARM' to check
Qubitol commented Nov 14, 2025

The latest changes use __ARM_NEON, which the compiler defines automatically when ARM SIMD is enabled.
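
As a minimal sketch of what relying on the compiler-defined macro looks like, the snippet below picks a label at compile time from __ARM_NEON and __SSE4_2__. The function name simdFlavour and the returned labels are hypothetical and are not taken from the CUDACPP sources; the actual code may organise this selection differently.

```cpp
// Hypothetical helper (not from the CUDACPP sources): reports which 128-bit
// SIMD instruction set the compiler enabled for this translation unit.
constexpr const char* simdFlavour()
{
#if defined( __ARM_NEON )
  return "neon"; // predefined by the compiler when ARM Advanced SIMD is available
#elif defined( __SSE4_2__ )
  return "sse4.2"; // predefined by GCC and Clang when SSE4.2 is enabled on x86
#else
  return "none"; // neither macro is defined: scalar fallback
#endif
}
```

Because both macros are predefined by the compiler, no custom define has to be passed on the command line; the build flags alone decide which branch is taken.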

@valassi valassi marked this pull request as ready for review November 14, 2025 18:47
@valassi valassi requested a review from a team as a code owner November 14, 2025 18:47
@valassi valassi left a comment

Hi @Qubitol thanks for all the work on this! The tests are now finally passing in the CI. I suggest that we merge this.

@Qubitol Qubitol marked this pull request as draft November 28, 2025 17:13
Qubitol commented Nov 28, 2025

We should consider replacing the instances of uname -p with uname -m, since uname -p sometimes returns unknown, while uname -m should be more robust.
I ran into this problem today while testing on an ARM machine.
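
To illustrate the idea outside the makefiles, here is a rough sketch that reads the same field that uname -m prints, through the POSIX uname() call, and classifies it. The helper names isArmMachine and machineName are hypothetical, and the accepted strings are an assumption: arm and aarch64 come from this thread, while arm64 is what uname -m reports on Apple silicon Macs.

```cpp
#include <string>
#include <sys/utsname.h> // POSIX uname()

// Hypothetical helper: true for the ARM machine strings discussed in this
// thread (arm, aarch64) plus arm64 (an assumption covering uname -m on Mac).
bool isArmMachine( const std::string& machine )
{
  return machine == "aarch64" || machine == "arm64" || machine == "arm";
}

// Returns utsname.machine, the field printed by 'uname -m';
// returns an empty string if the uname() call fails.
std::string machineName()
{
  struct utsname info {};
  return uname( &info ) == 0 ? std::string( info.machine ) : std::string();
}
```

A caller would combine the two as isArmMachine( machineName() ); a value such as unknown is simply classified as not ARM rather than being matched by accident.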

@Qubitol Qubitol marked this pull request as ready for review December 4, 2025 15:37