merge main into amd-staging #1038

ronlieb · 2026-01-08T14:18:59Z

No description provided.

…er. [NFCI] (llvm#174890) This function is invoked only for ELF targets, therefore it has been moved to the ELF-specific streamer. An assertion has been added to catch its invocations outside of an invocation that targets ELF.

…vm#174695) Remove unneeded load instructions and keep only one comparison instruction.

…antCopyElimination (llvm#174706) This patch is like what Xqcibi did in llvm#174358.

…t() (llvm#174438) Some machines are able to make use of AUIPC + ADDI or LUI + ADDI fusion, make sure to consider that in the cost model for `RISCVTTIImpl::getConstantPoolLoadCost()`.

…lvm#174811) The keep-registers mode isn't super useful without disabling explicit-locals, as the local gets/sets are irrelevant noise in most cases. Switching this test makes the output much more concise and will make upcoming changes easier to review.

Not sure if this will fix the problem because I don't have a 32-bit arm machine to test with.

…vm#173268) This commit adds family-specific support for the following intrinsics:
- ldmatrix
- stmatrix
- mma.block_scale, mma.sp.block_scale
- redux.sync
- cvt.rs
- clusterlaunchcontrol
- setmaxnreg
- tcgen05.mma

Removed the `hasTcgen05Instructions` function in favour of `hasTcgen05InstSupport`. Updated the wmma.py script with family-specific support and added new tests.

This allows speculating recursively speculatable operations containing `fir.result`. Note that making it Pure does not allow speculating `fir.result` itself from its containing operation, since it is a terminator.

Fixes a failure on llc for ubsan builds:

../lib/Target/RISCV/RISCVAsmPrinter.cpp:552:7: runtime error: downcast of null pointer of type 'RISCVTargetStreamer'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Target/RISCV/RISCVAsmPrinter.cpp:552:7

llvm#173659) This PR fixes a crash by validating the type of the `kind` attribute. For `vector.contract` and `vector.outerproduct`, the verifier now emits an error when `kind` is not a CombiningKindAttr. Fixes llvm#173555.

…e` to `shufflevector` (llvm#169110) Resolves llvm#169058.

This adds ~~an InstCombine pass~~ a TTI hook to the WebAssembly backend that folds `i8x16.swizzle` and `i8x16.relaxed.swizzle` operations to `shufflevector` operations if their mask operands are constant. This is mainly useful for abstractions over the raw intrinsics--for instance, in architecture-generic SIMD code that may not be able to expose the constant shuffles due to type system limitations.

I took most of this from the x86 backend (in particular, `simplifyX86vpermilvar` in `X86InstCombineIntrinsic`), and adapted it for the WebAssembly backend. There wasn't any previous `instCombineIntrinsic` method on the WebAssembly `TargetTransformInfo`, so I added it. Right now, this swizzle optimization is the only one it performs.

As I noted in the transform itself, the "relaxed" swizzle actually has stricter preconditions than the non-relaxed one. If a non-negative but still out-of-bounds index is provided, the "relaxed" swizzle can choose between returning 0 and the lane at the index modulo 16. However, it must make the same choice every time, and we don't know which choice the runtime will make, so we can't constant-fold it.

The regression tests were mostly generated by Claude and adapted a bit by me (I tried to follow the [InstCombine contributor guide](https://llvm.org/docs/InstCombineContributorGuide.html#tests)). There was previously no WebAssembly subdirectory within the InstCombine tests, so I created that too; as of now, the swizzle fold test is the only file in it. Everything else was written by myself (well, partly copy-pasted from the x86 backend).

I'm not sure how to write an Alive2 test for this; I can't find any examples where the input is an arbitrary constant.
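The swizzle fold described above can be illustrated with a small Python model. This is a rough sketch, not the code from the patch: the function names are hypothetical, the mask bytes are treated as unsigned values, and it assumes (as the description implies) that a relaxed-swizzle index with the top bit set deterministically yields 0. Out-of-range lanes are modeled by pointing the shuffle at lane 16, the first lane of an all-zero second operand.

```python
from typing import List, Optional

LANES = 16
ZERO_LANE = LANES  # first lane of an all-zero second shuffle operand


def swizzle_to_shuffle_mask(indices: List[int], relaxed: bool = False) -> Optional[List[int]]:
    """Return a shufflevector-style mask, or None when folding is not safe.

    `indices` holds the 16 constant mask bytes as unsigned values (0..255).
    Hypothetical helper for illustration only.
    """
    if len(indices) != LANES:
        raise ValueError("expected 16 mask bytes")
    mask = []
    for idx in indices:
        if not 0 <= idx <= 255:
            raise ValueError("mask bytes must fit in one unsigned byte")
        if idx < LANES:
            # In-bounds index: select that lane of the source vector.
            mask.append(idx)
        elif relaxed and idx < 128:
            # Non-negative (as a signed byte) but out of bounds: the runtime
            # may return 0 or lane idx % 16, so the result is not known.
            return None
        else:
            # Out of bounds with a defined result of 0.
            mask.append(ZERO_LANE)
    return mask


def apply_shuffle(src: List[int], mask: List[int]) -> List[int]:
    """Model a shuffle of `src` with an all-zero second operand."""
    both = list(src) + [0] * LANES
    return [both[m] for m in mask]
```

For example, a non-relaxed mask starting with `[0, 15, 16, 255]` selects lanes 0 and 15 and then two zero lanes, while the same out-of-bounds index 16 makes the relaxed variant decline to fold.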

This patch handles the special case where an extract value yields an aggregate result, which then is used as an argument to a store. The SPIRV BE uses special intrinsics (`spv_extractv` and `spv_store`) to represent these through IRTranslator, however this creates a problem: `spv_store` is called as a function, and IRTranslator cannot handle arguments that take more than a vreg. For other functions, the aggregate argument replacement pass would have solved things, but it does not apply here. Hence, we apply the same mutate-into-Int32 solution here when dealing with stores, and restore the extract value's type (which we have available as a ValueAttr) during instruction selection.

…, Z) (llvm#173808) We use fli+fneg to generate negative float, eliminate the fneg for fma. Fold fma to vfnmsac.vf, vfnmsub.vf, vfnmacc.vf, vfnmadd.vf.

Co-authored-by: Craig Topper <craig.topper@sifive.com>

CAPIIR includes this in some of its source files, so we need to ensure the header is around.

The IR should be a splat of 7, since this compares each vector element with 7 (`vec[i]!=7`). Using `zeroinitializer` contradicts this comparison.

Co-authored-by: himadhith <himadhith.v@ibm.com>

This PR replaces `enum`s with `enum class`es in Python bindings. No functional change.

This avoids repeatedly reparsing the URL to extract the number.
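As a rough illustration of the idea, the sketch below parses the PR number from the URL once and then passes the integer around. All names here (`parse_pr_number`, `PullRequestRef`) are hypothetical and are not taken from the actual GitHubAPI utility.

```python
import re

# Matches URLs of the form https://github.com/<owner>/<repo>/pull/<number>
_PR_URL = re.compile(r"^https://github\.com/[^/]+/[^/]+/pull/(?P<number>\d+)/?$")


def parse_pr_number(url: str) -> int:
    """Extract the PR number from a pull request URL (hypothetical helper)."""
    match = _PR_URL.match(url)
    if match is None:
        raise ValueError(f"not a pull request URL: {url}")
    return int(match.group("number"))


class PullRequestRef:
    """Hypothetical wrapper that stores the number instead of the URL."""

    def __init__(self, pr_number: int) -> None:
        self.pr_number = pr_number

    @classmethod
    def from_url(cls, url: str) -> "PullRequestRef":
        # The URL is parsed exactly once, at construction time.
        return cls(parse_pr_number(url))

    def label(self) -> str:
        # Later methods use the stored integer and never touch the URL.
        return f"PR #{self.pr_number}"
```

Callers that start from a URL use `PullRequestRef.from_url(...)`; everything downstream works with the integer.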

…0794) We match GCC's coverage of the extension, that is, everything except `setn` and `setnhi`. See also: https://docs.oracle.com/cd/E53394_01/html/E54833/gmael.html

) Un-breaks the build after llvm#169110.

…#174745) When activating a union member, none of the unions in that path can have a non-trivial constructor. Unfortunately, this is something we have to do when evaluating the bytecode, not while compiling it.

Remove unneeded load instructions.

…ns (llvm#173658) Because
- no one should use them in explicit function call forms, and
- value-discarding casts to non-`void` are already diagnosed by Clang.

Also add explanation for this to `CodingGuidelines.rst`.

There were already tags for protected members in the Mustache template, but the template didn't use the proper tags for the newer JSON scheme.

…on RV32. (llvm#174703) On RV32, i64 elements have type 'long long'. Fixes llvm#174613.

…173873) The current implementation does not drop a unit-extent dimension if that dimension is indexed by a non-trivial affine expression (i.e., not a single dimension or constant 0) on the first application of the transformation. However, it is possible to drop such dimensions if all dimensions involved in the affine expression are going to be dropped. So far, this required repeated application of the transformation. With the changes in this PR, the dimensions are dropped with a single application of the transformation.

Signed-off-by: Lukas Sommer <lukas.sommer@amd.com>

ZVFH and ZVFHMIN have the same code generation here.

For better readability.

…es.td. NFC.

… or more words (llvm#174789) This patch adds support for generating the `Xqcilsm` multi-word load/store instructions for three or more words. We add a new function in the `RISCVLoadStoreOptimizer` pass for doing this, separate from the one that does load/store pairing. The reason for this is that the implementation currently only looks for consecutive loads and stores to merge, whereas the pairing logic has no such restriction. We also only traverse the basic block top down for now while looking for instructions to merge.
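The top-down search for mergeable runs can be sketched as follows. This is a simplified model under stated assumptions, not the pass itself: it reads "consecutive" as back-to-back instructions that use the same base register with offsets increasing by one 4-byte word, and it ignores every other legality check a real pass would need. `WordAccess` and `find_mergeable_runs` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

WORD_BYTES = 4  # assumption: 32-bit word accesses


@dataclass(frozen=True)
class WordAccess:
    base: str    # base register name, e.g. "a0"
    offset: int  # byte offset from the base register


def find_mergeable_runs(accesses: List[WordAccess], min_words: int = 3) -> List[Tuple[int, int]]:
    """Scan top down and return (start_index, length) for each run of
    back-to-back accesses with the same base and word-consecutive offsets."""
    runs = []
    start = 0
    for i in range(1, len(accesses) + 1):
        continues = (
            i < len(accesses)
            and accesses[i].base == accesses[i - 1].base
            and accesses[i].offset == accesses[i - 1].offset + WORD_BYTES
        )
        if not continues:
            # The current run ends just before index i.
            length = i - start
            if length >= min_words:
                runs.append((start, length))
            start = i
    return runs
```

Runs shorter than three words are skipped here because, per the description, two-word cases are left to the existing pairing logic.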

…174598) This doesn't require p extension since it's just normal scalar instructions, but they're normally used with other p extension instructions so I just put them together.

…74585) Previously, the SPIRV BE only handled equality and inequality comparisons (ICMP_EQ, ICMP_NE) for i1 types using logical operations (OpLogicalEqual, OpLogicalNotEqual). Other comparison predicates (signed/unsigned less than, greater than, etc.) triggered an unreachable assertion. This patch extends the support for the missing predicates. The BE considers i1 values as booleans and in SPIR-V only logical operations can work on them. This patch lowers the missing predicates into supported logical operations. The lowering has been validated with instcombine to avoid introducing an unsound transformation (using the new test case).
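One way to see that ordered comparisons on i1 reduce to logical operations is to check the boolean identities exhaustively. The sketch below is not the lowering from the patch; it uses the standard identities under the usual i1 interpretation (true is 1 when unsigned and -1 when signed), and the table and function names are hypothetical.

```python
from itertools import product


def as_unsigned(b: bool) -> int:
    return 1 if b else 0


def as_signed(b: bool) -> int:
    # A one-bit signed integer holds 0 or -1.
    return -1 if b else 0


# Candidate logical forms for each predicate on booleans a and b.
LOGICAL = {
    "ult": lambda a, b: (not a) and b,
    "ugt": lambda a, b: a and (not b),
    "ule": lambda a, b: (not a) or b,
    "uge": lambda a, b: a or (not b),
    "slt": lambda a, b: a and (not b),
    "sgt": lambda a, b: (not a) and b,
    "sle": lambda a, b: a or (not b),
    "sge": lambda a, b: (not a) or b,
}

# Reference semantics: compare the integer interpretations directly.
REFERENCE = {
    "ult": lambda a, b: as_unsigned(a) < as_unsigned(b),
    "ugt": lambda a, b: as_unsigned(a) > as_unsigned(b),
    "ule": lambda a, b: as_unsigned(a) <= as_unsigned(b),
    "uge": lambda a, b: as_unsigned(a) >= as_unsigned(b),
    "slt": lambda a, b: as_signed(a) < as_signed(b),
    "sgt": lambda a, b: as_signed(a) > as_signed(b),
    "sle": lambda a, b: as_signed(a) <= as_signed(b),
    "sge": lambda a, b: as_signed(a) >= as_signed(b),
}


def identities_hold() -> bool:
    """Check every predicate on all four (a, b) input combinations."""
    for name, logical in LOGICAL.items():
        for a, b in product([False, True], repeat=2):
            if bool(logical(a, b)) != REFERENCE[name](a, b):
                return False
    return True
```

Note that the signed and unsigned forms are mirrored: because true is -1 when signed, `slt` has the same logical form as `ugt`.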

…ility check The LLVM release version and Apple LLDB version follow slightly different numbering schemes. Make sure we set the minimum required LLDB version appropriately. Also refactors the `apple-lldb-pre-1000` feature check to use the same `get_lldb_version_string` method. Previously, this was causing the LLDB LLVM formatters to be skipped on our public macOS CI.

The build mode was deprecated in llvm#136314. According to the deprecation message, it was supposed to be removed in the LLVM 21 release. Each build mode increases the maintenance overhead when failing, such as in llvm#151117. Let's remove it in LLVM 22.

…en check for repeated load. NFC. (llvm#174950) This will make it easier to handle other splat values that are free to concat. There should be no need to do repeated peekThroughBitcasts for every (canonicalised) bitcasted operand.

…lvm#172652) This patch adds static instruction table tests for the old Cyclone scheduling model bound to `-mcpu=apple-m1`, for the sake of a reference. It creates a new `llvm/test/tools/llvm-mca/AArch64/Apple` directory, moves a Cyclone test there, and adds 2 tests, `basic-instructions` and `neon-instructions`, from Neoverse with reusable inputs (in addition to Neoverse, we also match stderr output of llvm-mca for instruction warnings).

…152189)" This reverts commit 20d0ec8. The publish-sphinx-docs buildbot still uses LLVM_ENABLE_PROJECTS=openmp.

According to this RFC (https://discourse.llvm.org/t/rfc-adopt-regularly-scheduled-python-minimum-version-bumps/88841), we no longer support Python versions that have reached EOL. This PR mainly clarifies the somewhat vague wording of “a relatively recent Python 3 installation.”

…vm#174386)" needs downstream merge work This reverts commit 1af1cc2.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/futures

Towards llvm#172124

…t being hidden (llvm#174969) By the time the maxps nodes have been lowered, the v4f32 constant splats have been hidden behind an extract_subvector from the v8f32 constant splat.

This patch adds the /linkreprofullpathrsp flag with the same behaviour as link.exe. This flag emits a file containing the full paths to each object passed to the link line. This is used in particular when linking Arm64X binaries, as you need the full path to all the Arm64 objects that were used in a standard Arm64 build. See: https://learn.microsoft.com/en-us/cpp/build/reference/link-repro-full-path-rsp for the Microsoft documentation of the flag. Relands llvm#165449

z1-cciauto · 2026-01-08T14:20:06Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3499

fpetrogalli and others added 30 commits January 8, 2026 00:35

[RISCV] Simplify testcases in xqcibi-redundant-copy-elim.ll. NFC. (ll…

8ae3fc2

…vm#174695) Remove unneeded load instructions and keep only one comparison instruction.

[RISCV] Add support for XAndesPerf branch on immediate in RISCVRedund…

5185f07

…antCopyElimination (llvm#174706) This patch is like what Xqcibi did in llvm#174358.

[RISCV] Improve cost modeling of RISCVTTIImpl::getConstantPoolLoadCos…

2789464

…t() (llvm#174438) Some machines are able to make use of AUIPC + ADDI or LUI + ADDI fusion, make sure to consider that in the cost model for `RISCVTTIImpl::getConstantPoolLoadCost()`.

[LLDB] Tentative fix for lldb-arm-ubuntu buildbot. (llvm#174893)

ff73cca

Not sure if this will fix the problem because I don't have a 32-bit arm machine to test with.

[flang] Make fir.result Pure operation. (llvm#173508)

84cc153

This allows speculating recursively speculatable operations containing `fir.result`. Note that making it Pure does not allow speculating `fir.result` itself from its containing operation, since it is a terminator.

[bazel] Make CAPIIR depend on TransformsPassIncGen (llvm#174902)

013c04a

CAPIIR includes this in some of its source files, so we need to ensure the header is around.

[NFC][PowerPC] fix IR to be splat and not zeroinitializer (llvm#174699)

b3564b2

The IR should be a splat of 7, since this compares each vector element with 7 (`vec[i]!=7`). Using `zeroinitializer` contradicts this comparison.

Co-authored-by: himadhith <himadhith.v@ibm.com>

[MLIR][Python][NFC] Use enum class instead of enum (llvm#174792)

7873abb

This PR replaces `enum`s with `enum class`es in Python bindings. No functional change.

[llvm][utils] Make GitHubAPI methods pass around PR number (llvm#174860)

df2c0a7

This avoids repeatedly reparsing the URL to extract the number.

[SPARC][IAS] Implement Solaris Natural Instruction extension (llvm#17…

f813633

…0794) We match GCC's coverage of the extension, that is, everything except `setn` and `setnhi`. See also: https://docs.oracle.com/cd/E53394_01/html/E54833/gmael.html

[WebAssembly] Skip WASM-specific InstCombine tests properly (llvm#174900

7565bbe

) Un-breaks the build after llvm#169110.

[clang][bytecode] Check for non-trivial default ctors in unions (llvm…

4da37d3

…#174745) When activating a union member, none of the unions in that path can have a non-trivial constructor. Unfortunately, this is something we have to do when evaluating the bytecode, not while compiling it.

[RISCV] Simplify testcases in zibi.ll. NFC.

6ae4c0c

Remove unneeded load instructions.

[clang-doc] Add protected members to class template (llvm#174883)

fb7e805

There were already tags for protected members in the Mustache template, but the template didn't use the proper tags for the newer JSON scheme.

[RISCV] Fix name mangling for i64 vectors with riscv_rvv_vector_bits …

140e1de

…on RV32. (llvm#174703) On RV32, i64 elements have type 'long long'. Fixes llvm#174613.

[RISCV] Simplify the RUN lines for ZVFH/ZVFHMIN. NFC.

0b819a6

ZVFH and ZVFHMIN have the same code generation here.

[RISCV] Indent body of let scopes in RISCVInstrInfoXAndes.td. NFC.

39e9d57

For better readability.

[RISCV] Add missing comment at end of let scope in RISCVInstrInfoXAnd…

bf0c216

…es.td. NFC.

[RISCV][llvm] Support bitwise operation for XLEN fixed vectors (llvm#…

4d2563a

…174598) This doesn't require p extension since it's just normal scalar instructions, but they're normally used with other p extension instructions so I just put them together.

mgcarrasco and others added 17 commits January 8, 2026 11:53

merge main into amd-staging

587c5ae

Revert "[OpenMP] Remove LLVM_ENABLE_PROJECTS=openmp build mode (llvm#…

9ac2d0a

…152189)" This reverts commit 20d0ec8. The publish-sphinx-docs buildbot still uses LLVM_ENABLE_PROJECTS=openmp.

[libc++] Update our release notes for the upcoming release (llvm#174625)

9a8421f

unstable: llvm/test/Transforms/InstCombine/WebAssembly/fold-swizzle.ll

1770a8f

merge main into amd-staging

8ac7d61

Revert "[mlir][OpenMP] Translation support for taskloop construct (ll…

97a9e44

…vm#174386)" needs downstream merge work This reverts commit 1af1cc2.

[libc++][future] Applied [[nodiscard]] (llvm#174924)

56e8aa6

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/futures

Towards llvm#172124

[X86] Add test showing failure to concat fma nodes due to the constan…

476ad9f

…t being hidden (llvm#174969) By the time the maxps nodes have been lowered, the v4f32 constant splats have been hidden behind an extract_subvector from the v8f32 constant splat.

update revert_patches.txt

a62abc4

merge main into amd-staging

2e78ea9

ronlieb requested review from a team and dpalermo January 8, 2026 14:18

ronlieb requested review from Groverkss, nicolasvasilache and stellaraccident as code owners January 8, 2026 14:19

ronlieb removed request for Groverkss, nicolasvasilache and stellaraccident January 8, 2026 14:19

ronlieb requested a review from ergawy January 8, 2026 14:21

dpalermo approved these changes Jan 8, 2026


z1-cciauto merged commit acbadea into amd-staging Jan 8, 2026
28 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20260108070058 branch January 8, 2026 17:07

20 participants
