merge main into amd-staging #1026

ronlieb · 2026-01-07T15:39:22Z

No description provided.

Conservatively predicate sdiv/srem:
- RHS may carry poison in masked‑off lanes.
- RHS could be −1 while LHS has masked‑off lanes (risking INT_MIN/−1 overflow).

We’ll relax this once we can prove non‑wrap/non‑poison conditions.

Fixes llvm#170775.
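The overflow risk can be illustrated with a small Python model of lane-wise 32-bit signed division. This is a sketch, not LLVM code: `sdiv_i32`, `masked_sdiv`, and the choice of 1 as the safe divisor for masked-off lanes are assumptions made for illustration.

```python
INT32_MIN = -(2**31)

def sdiv_i32(lhs, rhs):
    """Model a 32-bit signed division; return None where the result is undefined."""
    if rhs == 0:
        return None  # division by zero
    if lhs == INT32_MIN and rhs == -1:
        return None  # the quotient 2**31 does not fit in 32 bits
    # Truncate toward zero, as sdiv does (Python's // floors instead).
    q = abs(lhs) // abs(rhs)
    return q if (lhs < 0) == (rhs < 0) else -q

def masked_sdiv(lhs, rhs, mask):
    """Divide lane-wise, substituting a divisor of 1 in masked-off lanes (assumed strategy)."""
    safe_rhs = [r if m else 1 for r, m in zip(rhs, mask)]
    return [sdiv_i32(l, r) for l, r in zip(lhs, safe_rhs)]

lhs = [7, INT32_MIN, -9]
rhs = [2, -1, 4]
mask = [True, False, True]  # lane 1 is masked off

# Dividing every lane blindly hits INT_MIN / -1 in the masked-off lane.
unpredicated = [sdiv_i32(l, r) for l, r in zip(lhs, rhs)]
# Predicating on the mask keeps the inactive lane well defined.
predicated = masked_sdiv(lhs, rhs, mask)
```

In this model the masked-off lane is the only one that misbehaves, which is why the division cannot simply be executed on all lanes and have the inactive results discarded afterwards.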

Comments weren't very visually distinctive in HTML. They sat immediately next to the declaration header with no spacing between them. To visually organize them, they now have a thin border around them. Different comment types are also now separated by a small gap. This also allows them to be easily changed in the future. Some extraneous `<div>` tags are also removed or merged.

…lvm#173965) To avoid false dependency.

Originally some `[[nodiscard]]` tests were implemented in `*/test/libcxx/diagnostics`. The Standard has a library `Diagnostics` and this folder should be reserved for it by convention. Most newer tests were added to their respective sub-folders. This patch moves the already implemented `[[nodiscard]]` tests to the folders where they belong and standardizes the name to `nodiscard.verify.cpp` wherever possible. N.B. This refactors only tests that were already merged. The remaining (in-progress) ones will be moved in a future patch to reduce merge conflicts.

…me for RISCV vendor relocations (llvm#172811) Use getRISCVVendorRelocationTypeName to resolve RISCV vendor-specific relocation names (R_RISCV_CUSTOM192-255) when preceded by R_RISCV_VENDOR. This improves the output of llvm-readobj and llvm-objdump to show vendor-specific names like R_RISCV_QC_ABS20_U, R_RISCV_QC_E_BRANCH (QUALCOMM) and R_RISCV_NDS_BRANCH_10 (ANDES) instead of generic R_RISCV_CUSTOM* names. Per RISC-V psABI, R_RISCV_VENDOR must be placed immediately before its associated vendor-specific relocation, so the vendor symbol is consumed after one use. Unknown vendors fall back to R_RISCV_CUSTOM*.
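The lookup rule described above can be sketched in Python. This is a hedged model of the behavior, not the code in llvm-readobj or llvm-objdump: `relocation_names` and `VENDOR_RELOCS` are hypothetical names, and the numeric relocation values in the table are assumptions chosen for illustration.

```python
R_RISCV_VENDOR = 191
CUSTOM_FIRST, CUSTOM_LAST = 192, 255

# Illustrative table; the numeric assignments are assumptions for this sketch.
VENDOR_RELOCS = {
    "QUALCOMM": {192: "R_RISCV_QC_ABS20_U", 193: "R_RISCV_QC_E_BRANCH"},
    "ANDES": {241: "R_RISCV_NDS_BRANCH_10"},
}

def relocation_names(relocs):
    """Map (type, symbol_name) pairs, in section order, to display names."""
    names = []
    vendor = None  # symbol of an immediately preceding R_RISCV_VENDOR, if any
    for rtype, symbol in relocs:
        if rtype == R_RISCV_VENDOR:
            names.append("R_RISCV_VENDOR")
            vendor = symbol
            continue
        if CUSTOM_FIRST <= rtype <= CUSTOM_LAST:
            generic = f"R_RISCV_CUSTOM{rtype}"
            # Unknown vendors, or no vendor at all, fall back to the generic name.
            names.append(VENDOR_RELOCS.get(vendor, {}).get(rtype, generic))
        else:
            names.append(f"type {rtype}")  # placeholder for standard relocations
        vendor = None  # the vendor symbol is consumed after one use
    return names

example = relocation_names([
    (191, "QUALCOMM"), (192, "sym"),  # resolved through the vendor table
    (192, "sym"),                     # vendor already consumed: generic name
    (191, "ACME"), (193, "sym"),      # unknown vendor: generic name
    (191, "ANDES"), (241, "sym"),
])
```

Resetting `vendor` after every non-vendor relocation is what models "consumed after one use": a second custom relocation without its own R_RISCV_VENDOR prefix keeps the generic name.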

…llvm#174677) The RISCVTargetMachine was still selecting RISCVELFTargetObjectFile, which was making llc crash when running the test at llvm/test/CodeGen/RISCV/riscv-macho.ll

…vm#174674) (llvm#174697) This reverts commit 0b2f3cf.

... so that `local:*;` will be lexed as three tokens instead of a single one in a version node. This is used by both version scripts and dynamic lists. Fixes llvm#174363. In addition, clean up special code for space-separated `local :` and `global :`. This patch brings our lexer behavior closer to GNU ld. While GNU ld additionally rejects more characters like `~/+,=`, we don't implement this additional validation. Pull Request: llvm#174530

…lvm#172759) dropRedundantArguments was incorrectly indexing into forwardedOperands using the block argument index directly. This crashes when the block has produced operands (generated by the terminator, not forwarded from predecessors) because forwardedOperands doesn't include them. The fix checks isOperandProduced() to skip produced arguments and uses SuccessorOperands::operator[] which handles the offset correctly.

…named modules (llvm#174687) Declarations from named modules are used naturally. They are declarations in other TUs. We don't need to record the information for updating them.

…ity analysis (llvm#174117)" This reverts commit 371fad2. The change only fixes the superficial assertion. The real problem is that bb.3 and bb.4 should not have been identified as joins of bb.5

Modify the C++ emitter to detect when an AddressOf op traces back to a const global. If it does, emit a C-style cast `(T*)(&...)` to strip the const qualification.

llvm#174704) Close llvm#174543. The root cause of the problem is that the recursion in the code pattern triggers an infinite loop in the checking process for TU local exposure.

…m#172465)

…/C++ (llvm#174707) For cvt and atomic `__builtin_amdgcn_cvt` builtins, use 'x' in the def to take _Float16 for HIP/C++ and half for OpenCL.

…4700) In this PR, I added a C API for each (upstream) MLIR type to retrieve its type name (for example, `IntegerType` -> `mlirIntegerTypeGetName()` -> `"builtin.integer"`), and exposed a corresponding `type_name` class attribute in the Python bindings (e.g., `IntegerType.type_name` -> `"builtin.integer"`). This can be used in various places to avoid hard-coded strings, such as eliminating the manual string in `irdl.base("!builtin.integer")`. Note that parts of this PR (mainly mechanical changes) were produced via GitHub Copilot and GPT-5.2. I have manually reviewed the changes and verified them with tests to ensure correctness.

…ns into main repository (llvm#166809) Allow the main llvm-project repository to contain the buildbot builder instructions, instead of storing them in llvm-zorg. The corresponding llvm-zorg PR is llvm/llvm-zorg#648.

Using polly-x86_64-linux-test-suite as a proof-of-concept because that builder is currently offline, I am its maintainer, and it is easier to build than a configuration supporting offload. Once the design has been decided, more builders can follow.

Advantages are:
* It is easier to make changes in the llvm-project repository. There are more reviewers than for the llvm-zorg repository.
* Buildbot changes can be made in the same PR as changes that require updating the buildbot, e.g. changing the name of a CMake option.
* Configuration changes take effect immediately when landing; no buildbot master restart needed.
* Some builders store a CMake cache file in the llvm-project repository for the reasons above. However, the number of changes that can be made with a CMake cache file alone is limited.

Compared to AnnotatedBuilder, advantages are:
* Reproducing a buildbot configuration locally is easy: just execute the script in-place. No llvm-zorg, local buildbot worker, or buildbot master needed.
* The same goes for testing a change to a builder before landing it in llvm-zorg. Doing so with an AnnotatedBuilder requires two llvm-zorg checkouts: one for making the change to the builder script itself, which is then pushed to a private llvm-zorg branch on GitHub, and a second that is modified to fetch that branch instead of https://github.com/llvm/llvm-zorg/tree/main.
* The AnnotatedBuilder scripts are located in the llvm-zorg repository, and the buildbot-worker's checkout is always the top-of-trunk. This means that a buildbot configuration is split over three checkouts:
  * The checkout of llvm-project to be tested.
  * The checkout of llvm-zorg that the buildbot-worker fetches; always the top-of-trunk, i.e. it may not match the revision of llvm-project that is executed (such as the CMake cache files located there), especially when using the "Force build" feature.
  * The checkout of llvm-zorg that the buildbot-master is running, which is updated only when the master is manually restarted.
* The "Force Build" feature also allows for test-building any llvm-project PR. This is correctly handled by zorg's `addGetSourcecodeSteps`, but does not work with AnnotatedBuilders that check out the llvm-project source on their own.

The goal is to move as much as possible into the llvm-project repository such that there cannot be a mismatch between checkouts of different repositories. Ideally, the buildbot-master only needs to be updated+restarted for adding/removing workers, not for build configuration changes.

---------
Co-authored-by: Jan Patrick Lehr <jp.lehr@gmail.com>

…() (llvm#171456) Reapply after additional fixes. ----- Disable implicit truncation in the ConstantInt constructor by default. This means that it needs to be passed a signed/unsigned (depending on the IsSigned flag) value matching the bit width. The intention is to prevent the recurring bug where people write something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty` is larger than 64-bit and then the value is incorrect due to missing type extension. This is the continuation of llvm#112670, which originally allowed implicit truncation in this constructor to reduce initial scope of the change.
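The bug class this targets can be shown with a toy Python model. It is a sketch of the arithmetic only, under the assumption that the value travels as a 64-bit payload; `constant_int_get` and its `implicit_trunc` parameter are hypothetical names for illustration, not the LLVM signature.

```python
MASK64 = (1 << 64) - 1

def constant_int_get(bits, value, is_signed=False, implicit_trunc=False):
    """Toy model: `value` arrives as a 64-bit payload and is widened to `bits`."""
    raw = value & MASK64  # what a 64-bit unsigned parameter would hold
    if is_signed and raw >> 63:
        wide = raw - (1 << 64)  # sign-extend the 64-bit payload
    else:
        wide = raw  # zero-extend
    if is_signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if not implicit_trunc and not (lo <= wide <= hi):
        raise ValueError(f"{wide} does not fit in i{bits}")
    return wide & ((1 << bits) - 1)  # stored bit pattern

# With implicit truncation, -1 "works" for a 32-bit type ...
ok_i32 = constant_int_get(32, -1, implicit_trunc=True)
# ... but for a 128-bit type the upper 64 bits stay zero.
bad_i128 = constant_int_get(128, -1, implicit_trunc=True)
# Marking the value as signed gives the intended all-ones pattern.
good_i128 = constant_int_get(128, -1, is_signed=True)
```

With truncation disabled by default, the unsigned call for the 32-bit type raises in this model instead of silently truncating, which surfaces the mistake at the narrow width rather than only once the type grows past 64 bits.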

…s in constraint-related nested scopes (llvm#173776) Fixes llvm#172814 --- This patch resolves an issue in which a lambda could be classified as always-dependent while traversing nested scopes, causing an assertion failure during capture handling. Changes in PR llvm#93206 expanded scope-based dependency handling in a way that could mark certain lambdas as always-dependent when no template-dependent context was present. This update refines the criteria for assigning `LambdaDependencyKind::LDK_AlwaysDependent` and applies it only after traversal reaches a distinct enclosing function scope with template-dependent parameters.

…lvm#173990)

See: llvm#171872 (comment)

Several BTI-related functions are checking that a call MCInst has one non-annotation operand. This patch changes these checks to use MCPlus::getNumPrimeOperands, instead of getNumOperands. Testing: added annotations to existing gtests to serve as regression tests. These now also explicitly check getNumOperands and getNumPrimeOperands usage on the annotated MCInsts.

Don't duplicate the EnumEntry type in llvm-objdump. Spliced off from llvm#173868, where this is required to avoid the name collision.

- Pass plugins can use LLVM options, matching llvm#173287.
- Pass plugins can run a hook before codegen, matching llvm#171872.
- Pass plugins are now tested whenever they can be built, matching llvm#171998.

Plugins currently don't work on AIX, tracked in llvm#172203.

We have special cases for `std::find` with `char` and `int` on most platforms, so the only other benchmark of our vector implementation is currently with `long long`, which is a rather big type. Adding `short` to the benchmarks allows us to more meaningfully compare the different implementations.

…0006) Fix lifetime safety analysis for pointer dereference operations by properly tracking origin flow. Added support in `FactsGenerator::VisitUnaryOperator()` to handle the dereference operator (`UO_Deref`). This change ensures that when a pointer is dereferenced, the origin of the pointer is properly propagated to the dereference expression.

We were checking whether the structured data value could be got as a boolean, not what value that boolean had. This meant we were incorrectly showing "yes" for everything.
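The mistake is easy to reproduce in miniature. The Python sketch below uses hypothetical helpers (`get_as_boolean`, `show_buggy`, `show_fixed`) that only mirror the shape of the bug; they are not LLDB's API.

```python
def get_as_boolean(value):
    """Return (ok, result): ok reports whether the value is a boolean at all."""
    if isinstance(value, bool):
        return True, value
    return False, False

def show_buggy(value):
    ok, _ = get_as_boolean(value)
    # Bug: only checks that the conversion succeeded, not the converted value.
    return "yes" if ok else "no"

def show_fixed(value):
    ok, result = get_as_boolean(value)
    # Fix: require a successful conversion and a true result.
    return "yes" if ok and result else "no"
```

Here `show_buggy(False)` returns "yes" because the conversion succeeds, while `show_fixed(False)` returns "no".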

Check if an scAddExpr expression represents a URem, and if it does, use the divisor to limit the conservative range. https://alive2.llvm.org/ce/z/VPxe7C PR: llvm#174456
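The reasoning can be checked numerically. The Python sketch below assumes the add-expression form of an unsigned remainder is `a + (-1 * (a /u b) * b)` in wrapping arithmetic; that form, the 8-bit width, and the names `urem_as_add` and `urem_range` are assumptions for illustration and are not taken from the patch.

```python
def urem_as_add(a, b, bits=8):
    """Evaluate a urem b in the expanded form a + (-1 * (a /u b) * b), modulo 2**bits."""
    mod = 1 << bits
    return (a + (-1 * (a // b) * b)) % mod

def urem_range(divisor_max):
    """Unsigned range [0, divisor_max - 1] of x urem d for any 1 <= d <= divisor_max."""
    return (0, divisor_max - 1)

# Exhaustively confirm the identity and the range bound for all 8-bit inputs.
all_match = all(
    urem_as_add(a, b) == a % b and 0 <= urem_as_add(a, b) <= b - 1
    for a in range(256)
    for b in range(1, 256)
)
```

Without recognizing the pattern, a range analysis that looks only at the sum has to allow for wraparound and may fall back to the full range; knowing the divisor bounds the result by `divisor_max - 1`.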

…ss check (llvm#173854) Add an extra check to `CallAndMessageChecker` to find uninitialized non-const values passed through parameters. This check applies only to a specific list of C library functions that have non-const pointer parameters and require the value to be initialized before the call.

Right now, some paths for `mlir:python` are missing, which means relevant people don’t always get notified. This PR adds those paths.

Fixes llvm#172893. In that issue, LLDB appears to hit a 3s timeout during some CI tests. This patch attempts to fix the issue by replacing the 3s timeout with a 60s timeout, which should be suitably long for any CI job (lldb-dap itself will automatically time out after 30s, so this should not be hit unless the process hangs).

Testing the LLDB-DAP VS Code extension creates files in the .vscode-test folder; ignore them.

…ions" (llvm#173135) Reverts llvm#158169. The improved AA precision for atomic store operations causes the DSE pass to optimize out the object variables.

llvm#170356) …rinsics This patch adds support in Clang for the assembly instructions FCVTXNT, FCVTLT, and {B}FCVTNT by implementing these prototypes according to the ACLE[1]:

    // Variant is available for _f64_f32
    svfloat32_t svcvtlt_f32[_f16]_z (svbool_t pg, svfloat16_t op);

    // Variants are available for:
    // _f32_f64, _bf16_f32
    svfloat16_t svcvtnt_f16[_f32]_z (svfloat16_t even, svbool_t pg, svfloat32_t op);

    svfloat32_t svcvtxnt_f32[_f64]_z (svfloat32_t even, svbool_t pg, svfloat64_t op);

[1] ARM-software/acle#412

---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…0793) Windows x86 binaries will now be built and uploaded automatically when a new release is tagged.

This is generating a random integer, so truncating is fine. Fixes the issue reported in: llvm#171456 (comment)

…Int::get() (llvm#171456)" This reverts commit a83c894.

…select" (llvm#174758) Reverts llvm#173990. Reverting to address post-commit review feedback. Will recommit with fixes.

…stantInt::get() llvm#171456

Fix real/imag when taking a primitive parameter _and_ being discarded, and fix the case where their subexpression can't be classified. Fixes llvm#174668

…gion simplification (llvm#173505) This commit simplifies the `remove-dead-values` pass and fixes a bug in the handling of `RegionBranchOpInterface` ops. The pass used to produce invalid IR ("null value found") for the newly added test case.

`remove-dead-values` is a pass for additional IR simplification that cannot be performed by the canonicalizer pass. Based on a liveness analysis, it erases dead values / IR. (The liveness analysis is a dataflow analysis that has more information about the IR than a canonicalization pattern, which can see only "local" information.)

Region-based ops are difficult. The liveness analysis may determine that an SSA value is dead. However, that does not mean that the value can actually be removed. Doing so may violate a region data flow (as modeled by the `RegionBranchOpInterface`). As an example, consider the case where a region branch terminator may dispatch to one of two region successors with the same forwarded values. A successor input (block argument) can be erased only if it is dead on both successors.

Before this commit, there used to be complex logic to determine when it is safe to erase an SSA value. That logic was broken. The new implementation does not remove any block arguments or op results of region-based ops. Instead, operands of region-based ops and region branch terminators are replaced with `ub.poison` if all of their successor values are dead. This simplifies the IR well enough for the canonicalizer to perform the remaining region simplification (i.e., dropping block arguments etc.).

RFC: https://discourse.llvm.org/t/rfc-delegate-simplification-of-region-based-ops-from-remove-dead-values-to-canonicalizer/89194

Glue does not carry any value (in the LLVM IR Value sense) that could be considered uniform or divergent.

…() (llvm#171456) Reapply after additional fixes. ----- Disable implicit truncation in the ConstantInt constructor by default. This means that it needs to be passed a signed/unsigned (depending on the IsSigned flag) value matching the bit width. The intention is to prevent the recurring bug where people write something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty` is larger than 64-bit and then the value is incorrect due to missing type extension. This is the continuation of llvm#112670, which originally allowed implicit truncation in this constructor to reduce initial scope of the change.

z1-cciauto · 2026-01-07T15:40:02Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3481

arcbbb and others added 30 commits January 7, 2026 04:25

Fix Bazel build for 4dc9a0e (llvm#174691)

af8bb1d

[X86][APX] Emit SetZUCC instead of legacy setcc when ZU is enabled (l…

afe8257

…lvm#173965) To avoid false dependency.

[RISC-V][Mach-O] Implement and select the RISCVMachOTargetObjectFile. (…

5ab8368

…llvm#174677) The RISCVTargetMachine was still selecting RISCVELFTargetObjectFile, which was making llc crash when running the test at llvm/test/CodeGen/RISCV/riscv-macho.ll

Reapply "[AMDGPU] Rework the clamp support for WMMA instructions" (ll…

5a63367

…vm#174674) (llvm#174697) This reverts commit 0b2f3cf.

[C++20] [Modules] Don't update MarkAsUsed information for decls from …

2451172

…named modules (llvm#174687) Declarations from named modules are used naturally. They are declarations in other TUs. We don't need to record the information for updating them.

Revert "[UniformityAnalysis] Remove an incorrect assertion in uniform…

0501950

…ity analysis (llvm#174117)" This reverts commit 371fad2. The change only fixes the superficial assertion. The real problem is that bb.3 and bb.4 should not have been identified as joins of bb.5

[mlir][emitc] Fix creating pointer from constant array (llvm#162083)

62ac813

Modify the C++ emitter to detect when an AddressOf op traces back to a const global. If it does, emit a C-style cast `(T*)(&...)` to strip the const qualification.

[LV][EVL] Add test case for issue llvm#173260. nfc (llvm#173262)

3fbe927

[C++20] [Modules] Avoid infinite loop when checking TU local exposures (

b4ed102

llvm#174704) Close llvm#174543. The root cause of the problem is that the recursion in the code pattern triggers an infinite loop in the checking process for TU local exposure.

[InstCombine][AArch64] Lower NEON shift intrinsics when possible (llv…

55eaa6c

…m#172465)

[AMDGPU] Modifies cvt and atomic builtin def to take _Float16 for HIP…

eb13822

…/C++ (llvm#174707) For cvt and atomic `__builtin_amdgcn_cvt` builtins, use 'x' in the def to take _Float16 for HIP/C++ and half for OpenCL.

[MemRef] Add dim reification for AssumeAlignmentOp (llvm#174477)

c85b8ff

[BOLT][AArch64] Add rseq test (llvm#174413)

164cfda

[VectorCombine] Fold scalar selects from bitcast into vector select (l…

72f18a0

…lvm#173990)

[Clang][NFC] XFAIL plugin tests on AIX

ad2c2b2

See: llvm#171872 (comment)

[llvm-objdump][NFC] Use EnumEntry from Support (llvm#174155)

87d6c27

Don't duplicate the EnumEntry type in llvm-objdump. Spliced off from llvm#173868, where this is required to avoid the name collision.

[ARM] Update and extend neon-dot-product.ll. NFC

7635474

usx95 and others added 22 commits January 7, 2026 11:02

[lldb] Correct version -v output for booleans (llvm#174742)

44b44bc

We were checking whether the structured data value could be got as a boolean, not what value that boolean had. This meant we were incorrectly showing "yes" for everything.

[SCEV] Handle URem pattern in getRangeRef. (llvm#174456)

1dea577

Check if an scAddExpr expression represents a URem, and if it does, use the divisor to limit the conservative range. https://alive2.llvm.org/ce/z/VPxe7C PR: llvm#174456

[GitHub] Add more glob patterns for mlir:python label (llvm#174711)

17e6f51

Right now, some paths for `mlir:python` are missing, which means relevant people don’t always get notified. This PR adds those paths.

[lldb-dap][NFC] Ignore extension built test artefacts. (llvm#174724)

e8eb20b

Testing the LLDB-DAP VS Code extension creates files in the .vscode-test folder; ignore them.

Revert "[AA] Improve precision for monotonic atomic load/store operat…

c8941df

…ions" (llvm#173135) Reverts llvm#158169. The improved AA precision for atomic store operations causes the DSE pass to optimize out the object variables.

worklows/release-binaries: Add Windows release binary builds (llvm#15…

9363750

…0793) Windows x86 binaries will now be built and uploaded automatically when a new release is tagged.

[llvm-stress] Allow implicit truncation

faa7ede

This is generating a random integer, so truncating is fine. Fixes the issue reported in: llvm#171456 (comment)

merge main into amd-staging

10103a3

Revert "Reapply [ConstantInt] Disable implicit truncation in Constant…

e612470

…Int::get() (llvm#171456)" This reverts commit a83c894.

Revert "[VectorCombine] Fold scalar selects from bitcast into vector …

1ab7b66

…select" (llvm#174758) Reverts llvm#173990. Reverting to address post-commit review feedback. Will recommit with fixes.

revert_patches.txt : [ConstantInt] Disable implicit truncation in Con…

c36b29c

…stantInt::get() llvm#171456

[clang][bytecode] Fix some imag/real corner cases (llvm#174764)

4fdbe05

Fix real/imag when taking a primitive parameter _and_ being discarded, and fix the case where their subexpression can't be classified. Fixes llvm#174668

SelectionDAG: Do not propagate divergence through glue (llvm#174766)

47a0d0e

Glue does not carry any value (in the LLVM IR Value sense) that could be considered uniform or divergent.

merge main into amd-staging

cf69f61

Constant:get Xteams fix

c86cdce

cleanup revert_patches.txt

d5e6b4f

ronlieb requested review from a team and dpalermo January 7, 2026 15:39

ronlieb requested review from krzysz00, kuhar and stellaraccident as code owners January 7, 2026 15:39

dpalermo approved these changes Jan 7, 2026

ronlieb closed this Jan 7, 2026

20 participants
