forked from llvm/llvm-project
merge main into amd-staging #1038
Merged: z1-cciauto merged 81 commits into amd-staging from amd/merge/upstream_merge_20260108070058 on Jan 8, 2026
…er. [NFCI] (llvm#174890) This function is invoked only for ELF targets, therefore it has been moved to the ELF-specific streamer. An assertion has been added to catch invocations that do not target ELF.
…vm#174695) Remove unneeded load instructions and keep only one comparison instruction.
…antCopyElimination (llvm#174706) This patch is like what Xqcibi did in llvm#174358.
…t() (llvm#174438) Some machines are able to make use of AUIPC + ADDI or LUI + ADDI fusion, make sure to consider that in the cost model for `RISCVTTIImpl::getConstantPoolLoadCost()`.
…lvm#174811) The keep-registers mode isn't super useful without disabling explicit-locals, as the local gets/sets are irrelevant noise in most cases. Switching this test makes the output much more concise and will make upcoming changes easier to review.
Not sure if this will fix the problem because I don't have a 32-bit arm machine to test with.
…vm#173268) This commit adds family-specific support for the following intrinsics:
- ldmatrix
- stmatrix
- mma.block_scale, mma.sp.block_scale
- redux.sync
- cvt.rs
- clusterlaunchcontrol
- setmaxnreg
- tcgen05.mma

Removed the `hasTcgen05Instructions` function in favour of `hasTcgen05InstSupport`. Updated the wmma.py script with family-specific support and added new tests.
This allows speculating recursively speculatable operations containing `fir.result`. Note that making it Pure does not allow speculating `fir.result` itself from its containing operation, since it is a terminator.
Fixes a failure on llc for ubsan builds:
../lib/Target/RISCV/RISCVAsmPrinter.cpp:552:7: runtime error: downcast of null pointer of type 'RISCVTargetStreamer'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Target/RISCV/RISCVAsmPrinter.cpp:552:7
llvm#173659) This PR fixes a crash by validating the type of the `kind` attribute. For `vector.contract` and `vector.outerproduct`, the verifier now emits an error when `kind` is not a CombiningKindAttr. Fixes llvm#173555.
…e` to `shufflevector` (llvm#169110) Resolves llvm#169058.

This adds ~~an InstCombine pass~~ a TTI hook to the WebAssembly backend that folds `i8x16.swizzle` and `i8x16.relaxed.swizzle` operations to `shufflevector` operations if their mask operands are constant. This is mainly useful for abstractions over the raw intrinsics--for instance, in architecture-generic SIMD code that may not be able to expose the constant shuffles due to type system limitations.

I took most of this from the x86 backend (in particular, `simplifyX86vpermilvar` in `X86InstCombineIntrinsic`), and adapted it for the WebAssembly backend. There wasn't any previous `instCombineIntrinsic` method on the WebAssembly `TargetTransformInfo`, so I added it. Right now, this swizzle optimization is the only one it performs.

As I noted in the transform itself, the "relaxed" swizzle actually has stricter preconditions than the non-relaxed one. If a non-negative but still out-of-bounds index is provided, the "relaxed" swizzle can choose between returning 0 and the lane at the index modulo 16. However, it must make the same choice every time, and we don't know which choice the runtime will make, so we can't constant-fold it.

The regression tests were mostly generated by Claude and adapted a bit by me (I tried to follow the [InstCombine contributor guide](https://llvm.org/docs/InstCombineContributorGuide.html#tests)). There was previously no WebAssembly subdirectory within the InstCombine tests, so I created that too; as of now, the swizzle fold test is the only file in it. Everything else was written by myself (well, partly copy-pasted from the x86 backend).

I'm not sure how to write an Alive2 test for this; I can't find any examples where the input is an arbitrary constant.
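The folding rule described above can be sketched in a few lines of Python. This is only an illustration of the semantics stated in the commit message, not the code from the patch: the function name `swizzle_to_shuffle_mask`, the constant `ZERO_LANE`, and the choice to model "returns 0" as selecting a lane from an all-zero second shuffle operand are assumptions made for this example.

```python
from typing import List, Optional

# Hypothetical convention for this sketch: the shuffle takes the swizzle input
# as its first operand (lanes 0..15) and an all-zero vector as its second
# operand (lanes 16..31), so lane 16 always reads as 0.
ZERO_LANE = 16


def swizzle_to_shuffle_mask(indices: List[int], relaxed: bool) -> Optional[List[int]]:
    """Map a constant swizzle mask to a shuffle mask, or None if it cannot be folded."""
    if len(indices) != 16:
        raise ValueError("i8x16 swizzle masks have exactly 16 lanes")
    mask = []
    for idx in indices:
        idx &= 0xFF  # treat each mask byte as unsigned
        if idx < 16:
            # In-bounds index: select that lane of the input.
            mask.append(idx)
        elif relaxed and idx < 128:
            # Non-negative (as a signed byte) but out of bounds: the relaxed
            # swizzle may return 0 or lane idx % 16, so the result is unknown.
            return None
        else:
            # Out of bounds with a defined result of 0.
            mask.append(ZERO_LANE)
    return mask
```

For example, a mask of sixteen `20`s folds to an all-zero selection for the non-relaxed swizzle, while the same mask is rejected for the relaxed variant because the runtime's choice is not known at compile time.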
This patch handles the special case where an extract value yields an aggregate result, which then is used as an argument to a store. The SPIRV BE uses special intrinsics (`spv_extractv` and `spv_store`) to represent these through IRTranslator, however this creates a problem: `spv_store` is called as a function, and IRTranslator cannot handle arguments that take more than a vreg. For other functions, the aggregate argument replacement pass would have solved things, but it does not apply here. Hence, we apply the same mutate-into-Int32 solution here when dealing with stores, and restore the extract value's type (which we have available as a ValueAttr) during instruction selection.
…, Z) (llvm#173808) We use fli+fneg to generate a negative float; eliminate the fneg for fma. Fold fma to vfnmsac.vf, vfnmsub.vf, vfnmacc.vf, vfnmadd.vf.
---------
Co-authored-by: Craig Topper <craig.topper@sifive.com>
CAPIIR includes this in some of its source files, so we need to ensure the header is around.
The IR should be a splat of 7, as this compares a vector of elements with 7 (`vec[i]!=7`). Having `zeroinitializer` goes against this comparison.
Co-authored-by: himadhith <himadhith.v@ibm.com>
This PR replaces `enum`s with `enum class`es in Python bindings. No functional change.
This avoids repeatedly reparsing the URL to extract the number.
…0794) We match GCC's coverage of the extension, that is, everything except `setn` and `setnhi`. See also: https://docs.oracle.com/cd/E53394_01/html/E54833/gmael.html
) Un-breaks the build after llvm#169110.
…#174745) When activating a union member, none of the unions in that path can have a non-trivial constructor. Unfortunately, this is something we have to do when evaluating the bytecode, not while compiling it.
Remove unneeded load instructions.
…ns (llvm#173658) Because
- no one should use them in explicit function call forms, and
- value-discarding casts to non-`void` are already diagnosed by Clang.

Also add an explanation for this to `CodingGuidelines.rst`.
There were already tags for protected members in the Mustache template, but they did not use the proper tags for the newer JSON scheme.
…on RV32. (llvm#174703) On RV32, i64 elements have type 'long long'. Fixes llvm#174613.
…173873) The current implementation does not drop a unit-extent dimension if that dimension is indexed by a non-trivial affine expression (i.e., not a single dimension or constant 0) on the first application of the transformation. However, it is possible to drop such dimensions if all dimensions involved in the affine expression are going to be dropped. So far, this required repeated application of the transformation. With the changes in this PR, the dimensions are dropped with a single application of the transformation.
Signed-off-by: Lukas Sommer <lukas.sommer@amd.com>
ZVFH and ZVFHMIN have the same code generation here.
For better readability.
… or more words (llvm#174789) This patch adds support for generating the `Xqcilsm` multi-word load/store instructions for three or more words. We add a new function in the `RISCVLoadStoreOptimizer` pass for doing this, separate from the one that does load/store pairing. The reason for this is that the implementation currently only looks for consecutive loads and stores to merge, whereas the pairing logic has no such restriction. We also only traverse the basic block top down for now while looking for instructions to merge.
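As a rough illustration of the "consecutive, top-down" restriction described above, the following Python sketch scans a basic block once from the top and reports runs of three or more adjacent word accesses that could be merged. It is not the pass's implementation: the `MemOp` record, the `find_mergeable_runs` helper, and the assumption that "consecutive" means adjacent instructions of the same kind with the same base register and offsets increasing by 4 are all hypothetical. Any register or encoding constraints that the real `Xqcilsm` instructions impose are ignored here.

```python
from typing import List, NamedTuple, Tuple


class MemOp(NamedTuple):
    """Hypothetical model of a word-sized memory access in a basic block."""
    kind: str    # "lw" or "sw"
    base: str    # base register name
    offset: int  # byte offset from the base register


def find_mergeable_runs(block: List[MemOp], min_words: int = 3) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run of adjacent, contiguous accesses."""
    runs = []
    start = 0
    # Walk top down; index len(block) acts as a sentinel that closes the last run.
    for i in range(1, len(block) + 1):
        continues = (
            i < len(block)
            and block[i].kind == block[i - 1].kind
            and block[i].base == block[i - 1].base
            and block[i].offset == block[i - 1].offset + 4
        )
        if not continues:
            length = i - start
            if length >= min_words:
                runs.append((start, length))
            start = i
    return runs
```

In this model, three adjacent loads from `a0` at offsets 0, 4, and 8 form one run, while a pair of stores is left to the separate pairing logic because it is shorter than three words.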
…174598) This doesn't require p extension since it's just normal scalar instructions, but they're normally used with other p extension instructions so I just put them together.
…74585) Previously, the SPIRV BE only handled equality and inequality comparisons (ICMP_EQ, ICMP_NE) for i1 types using logical operations (OpLogicalEqual, OpLogicalNotEqual). Other comparison predicates (signed/unsigned less than, greater than, etc.) triggered an unreachable assertion. This patch extends the support for the missing predicates. The BE considers i1 values as booleans and in SPIR-V only logical operations can work on them. This patch lowers the missing predicates into supported logical operations. The lowering has been validated with instcombine to avoid introducing an unsound transformation (using the new test case).
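One way to see that such predicates can be expressed purely with logical operations is to check candidate Boolean formulas against the integer semantics of `i1`. The commit message does not list the exact operations the patch emits, so the table below is an assumption: one sound lowering, written in Python, with the helper names `LOWERING`, `reference`, and `lowering_is_sound` invented for this sketch. An `i1` value of true is 1 when read as unsigned and -1 when read as signed.

```python
from itertools import product

# Candidate lowering of each icmp predicate on i1 to Boolean logic only.
LOWERING = {
    "eq":  lambda a, b: a == b,
    "ne":  lambda a, b: a != b,
    "ult": lambda a, b: (not a) and b,
    "ugt": lambda a, b: a and (not b),
    "ule": lambda a, b: (not a) or b,
    "uge": lambda a, b: a or (not b),
    "slt": lambda a, b: a and (not b),
    "sgt": lambda a, b: (not a) and b,
    "sle": lambda a, b: a or (not b),
    "sge": lambda a, b: (not a) or b,
}


def reference(pred: str, a: bool, b: bool) -> bool:
    """Integer semantics of icmp on i1: true is 1 unsigned and -1 signed."""
    if pred in ("eq", "ne"):
        return (a == b) if pred == "eq" else (a != b)
    signed = pred.startswith("s")
    x, y = (-int(a), -int(b)) if signed else (int(a), int(b))
    op = pred[1:]
    return {"lt": x < y, "gt": x > y, "le": x <= y, "ge": x >= y}[op]


def lowering_is_sound() -> bool:
    """Exhaustively compare the logical lowering with the integer reference."""
    return all(
        bool(LOWERING[pred](a, b)) == reference(pred, a, b)
        for pred in LOWERING
        for a, b in product([False, True], repeat=2)
    )
```

Because true is -1 in the signed reading, the signed predicates mirror the unsigned ones with the operands' roles swapped: `slt` uses the same formula as `ugt`, and so on.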
…ility check The LLVM release version and Apple LLDB version follow slightly different numbering schemes. Make sure we set the minimum required LLDB version appropriately. Also refactors the `apple-lldb-pre-1000` feature check to use the same `get_lldb_version_string` method. This was causing the LLDB LLVM formatters to be skipped on our public macOS CI.
The build mode has been deprecated in llvm#136314. According to the deprecation message, it was supposed to be removed in the LLVM 21 release. Each build mode increased the maintenance overhead when failing, such as in llvm#151117. Let's remove it in LLVM 22.
…en check for repeated load. NFC. (llvm#174950) This will make it easier to handle other splat values that are free to concat. There should be no need to do repeated peekThroughBitcasts for every (canonicalised) bitcasted operand.
…lvm#172652) This patch adds static instruction table tests for the old Cyclone scheduling model bound to `-mcpu=apple-m1`, for the sake of a reference. It creates a new `llvm/test/tools/llvm-mca/AArch64/Apple` directory, moves a Cyclone test there, and adds 2 tests, `basic-instructions` and `neon-instructions`, from Neoverse with reusable inputs (in addition to Neoverse, we also match stderr output of llvm-mca for instruction warnings).
According to this RFC (https://discourse.llvm.org/t/rfc-adopt-regularly-scheduled-python-minimum-version-bumps/88841), we no longer support Python versions that have reached EOL. This PR mainly clarifies the somewhat vague wording of “a relatively recent Python 3 installation.”
…vm#174386)" needs downstream merge work This reverts commit 1af1cc2.
…t being hidden (llvm#174969) By the time the maxps nodes have been lowered, the v4f32 constant splats have been hidden behind an extract_subvector from the v8f32 constant splat.
This patch adds the /linkreprofullpathrsp flag with the same behaviour as link.exe. This flag emits a file containing the full paths to each object passed to the link line. This is used in particular when linking Arm64X binaries, as you need the full path to all the Arm64 objects that were used in a standard Arm64 build. See: https://learn.microsoft.com/en-us/cpp/build/reference/link-repro-full-path-rsp for the Microsoft documentation of the flag. Relands llvm#165449
dpalermo (Collaborator) approved these changes on Jan 8, 2026
No description provided.