merge main into amd-staging #1030

ronlieb · 2026-01-07T21:45:15Z

No description provided.

llvm#170356) …rinsics

This patch adds support in Clang for these assembly instructions: FCVTXNT, FCVTLT, {B}FCVTNT

By implementing these prototypes:

// Variant is available for _f64_f32
svfloat32_t svcvtlt_f32[_f16]_z (svbool_t pg, svfloat16_t op);

// Variants are available for:
// _f32_f64, _bf16_f32
svfloat16_t svcvtnt_f16[_f32]_z (svfloat16_t even, svbool_t pg, svfloat32_t op);
svfloat32_t svcvtxnt_f32[_f64]_z (svfloat32_t even, svbool_t pg, svfloat64_t op);

according to the ACLE[1]

[1] ARM-software/acle#412

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…0793) Windows x86 binaries will now be built and uploaded automatically when a new release is tagged.

This is generating a random integer, so truncating is fine. Fixes the issue reported in: llvm#171456 (comment)

…select" (llvm#174758) Reverts llvm#173990 Reverting to address post-commit review feedback. Will recommit with fixes.

Fix real/imag when taking a primitive parameter _and_ being discarded, and fix the case where their subexpression can't be classified. Fixes llvm#174668

…gion simplification (llvm#173505)

This commit simplifies the `remove-dead-values` pass and fixes a bug in the handling of `RegionBranchOpInterface` ops. The pass used to produce invalid IR ("null value found") for the newly added test case.

`remove-dead-values` is a pass for additional IR simplification that cannot be performed by the canonicalizer pass. Based on a liveness analysis, it erases dead values / IR. (The liveness analysis is a dataflow analysis that has more information about the IR than a canonicalization pattern, which can see only "local" information.)

Region-based ops are difficult. The liveness analysis may determine that an SSA value is dead. However, that does not mean that the value can actually be removed. Doing so may violate a region data flow (as modeled by the `RegionBranchOpInterface`). As an example, consider the case where a region branch terminator may dispatch to one of two region successors with the same forwarded values. A successor input (block argument) can be erased only if it is dead on both successors.

Before this commit, there used to be complex logic to determine when it is safe to erase an SSA value. That logic was broken. The new implementation does not remove any block arguments or op results of region-based ops. Instead, operands of region-based ops and region branch terminators are replaced with `ub.poison` if all of their successor values are dead. This simplifies the IR well enough for the canonicalizer to perform the remaining region simplification (i.e., dropping block arguments etc.).

RFC: https://discourse.llvm.org/t/rfc-delegate-simplification-of-region-based-ops-from-remove-dead-values-to-canonicalizer/89194
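The "dead on all successors" rule described in this commit message can be illustrated with a small toy model. This is only a sketch and not the MLIR pass: the function name `poison_dead_operands` and the list-of-booleans liveness encoding are assumptions made for illustration.

```python
def poison_dead_operands(operands, successor_liveness):
    """Toy model (hypothetical, not the MLIR implementation).

    operands: values forwarded by a region branch terminator.
    successor_liveness: one list per region successor; entry i is True
    if the successor input fed by operand i is live in that successor.
    """
    result = []
    for i, operand in enumerate(operands):
        # An operand may only be replaced if no successor still needs it.
        live_somewhere = any(live[i] for live in successor_liveness)
        result.append(operand if live_somewhere else "ub.poison")
    return result


# %a is dead in the first successor but live in the second, so it stays.
# %b is dead in both successors, so it is replaced.
print(poison_dead_operands(["%a", "%b"], [[False, False], [True, False]]))
```

Note that the operand list keeps its length; in the scheme the commit describes, actually dropping block arguments is left to the canonicalizer.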

Glue does not carry any value (in the LLVM IR Value sense) that could be considered uniform or divergent.

…CMakeLists.txt Once we start adding more tests, having this in a separate CMakeLists.txt is more maintainable.

…74022) Summary: The other GPU enabled libraries, (openmp, flang-rt, compiler-rt, libc, libcxx, libcxx-abi) all support builds through a runtime cross-build. In these builds we use a separate CMake build that cross-compiles to a single target. This patch provides basic support for this with the `libclc` libraries. Changes include adding support for the more standard GPU compute triples (amdgcn-amd-amdhsa, nvptx64-nvidia-cuda) and building only one target in this mode. Some things left to do: This patch does not change the compiler invocations, this method would allow us to use standard CMake routines but this keeps it minimal. The prebuild support is questionable and doesn't fit into this scheme because it's a host executable, I'm ignoring it for now. The installed location should just use the triple with no `libclc/` subdirectory handling I believe.

…174766)" This reverts commit 47a0d0e. Reverted due to test failures in LLVM_ENABLE_EXPENSIVE_CHECKS builds.

`std::span` didn't have a formatter for MSVC's STL yet. The type is quite useful in C++ 20, so this PR adds a formatter for it. Since the formatter is new, I made it work with both DWARF and PDB from the start.

…172514) This patch updates the legalization of spv_insertelt and spv_extractelt to handle non-constant (dynamic) indices. When a dynamic index is encountered, the vector is spilled to the stack, and the element is accessed via OpAccessChain (lowered from spv_gep). This patch also adds custom legalization for G_STORE to scalarize vector stores and refines the legalization rules for G_LOAD, G_STORE, and G_BUILD_VECTOR. Fixes llvm#170534

…4756) This PR is quite similar to llvm#174700. In this PR, I added a C API for each (upstream) MLIR attribute to retrieve its name (for example, `StringAttr -> mlirStringAttrGetName() -> "builtin.string"`), and exposed a corresponding `attr_name` class attribute in the Python bindings (e.g., `StringAttr.attr_name -> "builtin.string"`). This can be used in various places to avoid hard-coded strings, such as eliminating the manual string in `irdl.base("#builtin.string")`. Note that parts of this PR (mainly mechanical changes) were produced via GitHub Copilot and GPT-5.2. I have manually reviewed the changes and verified them with tests to ensure correctness.

Pulled out of llvm#169995

Add tablegen patterns to provide codegen for SCVTF and UCVTF operating purely on SIMD & FP registers, using explicit bitcasts.

I know this is required for at least one feature, because TestSectionAPI.py has failures if zlib isn't enabled. So I think it's useful for users to be able to check.

Now that it's in the config, I have also used it to make a test annotation so we don't get the failure in TestSectionAPI.py when zlib is disabled. For future reference, that failure was:

Traceback (most recent call last):
  File "/home/davspi01/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 452, in wrapper
    return func(self, *args, **kwargs)
  File "/home/davspi01/llvm-project/lldb/test/API/python_api/section/TestSectionAPI.py", line 67, in test_compressed_section_data
    self.assertEqual(section_data, [0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90])
AssertionError: Lists differ: [] != [32, 48, 64, 80, 96, 112, 128, 144]

The test failed because the compressed section could not be decoded.

SPG says two bits for each operand.

This used to cause an ISel failure.

…he `Format` API into it (llvm#174618)

This patch creates a new `FormatEntity::Formatter` class and moves `FormatEntity::Format` (and related APIs) into it. Most of the parameters to `Format` are immutable across all recursive calls, so I made them `const` member variables of `Formatter`. The main changes are just mechanical renaming of:

```
FormatEntity::Format(...)
```

to

```
FormatEntity::Formatter(...).Format(stream, entry, valobj)
```

and making use of the member variables from inside `Format`.

We can probably make most of the parameters to the `Formatter` constructor defaulted, but I chose not to in this patch to keep the diff smaller.

The motivation for this is that I'm planning on adding logic to detect recursive format entities (which would crash LLDB). That requires some state, which in my opinion is best kept inside the `Formatter` class instead of another parameter to `Format`.

The patch should be entirely NFC.
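The refactoring pattern described here (hoisting arguments that never change across recursive calls into immutable members) can be sketched in a language-neutral way. The following is an illustrative Python analogue, not LLDB code: the class layout, the `variables` and `missing` fields, and the entry encoding are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formatter:
    # Hypothetical stand-ins for the parameters that stay constant
    # across every recursive call; frozen=True makes them immutable.
    variables: dict
    missing: str = "<unknown>"

    def format(self, out: list, entry) -> None:
        """Append the rendering of `entry` to `out`.

        An entry is a literal string, a ("var", name) tuple, or a
        list of nested entries.
        """
        if isinstance(entry, str):
            out.append(entry)
        elif isinstance(entry, tuple) and entry[0] == "var":
            out.append(str(self.variables.get(entry[1], self.missing)))
        else:
            # Only the varying arguments are passed down the recursion.
            for child in entry:
                self.format(out, child)


out = []
Formatter({"name": "main"}).format(
    out, ["frame ", ("var", "name"), [" at ", ("var", "line")]]
)
print("".join(out))
```

Keeping the invariant inputs on the object leaves the recursive signature short, and gives later per-invocation state (such as recursion tracking) an obvious place to live.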

In patch llvm#161840 I added bitcasts when lowering some NEON int scalar nodes, but I didn't properly test that bitcasts are correctly emitted on the result as well. This patch adds those tests.

This patch adds the /linkreprofullpathrsp flag with the same behaviour as link.exe. This flag emits a file containing the full paths to each object passed to the link line. This is used in particular when linking Arm64X binaries, as you need the full path to all the Arm64 objects that were used in a standard Arm64 build. See: https://learn.microsoft.com/en-us/cpp/build/reference/link-repro-full-path-rsp for the Microsoft documentation of the flag.

…m#174788) Setting SubtargetPredicate around these multiclasses is redundant since it is always explicitly overridden for every def inside the multiclass.

…rmatter tests (llvm#174770)

When building `cross-project-tests` with `_FORTIFY_SOURCE` set, we get the following warnings:

```
In file included from /app/gcc/14.2.0/include/c++/14.2.0/x86_64-pc-linux-gnu/bits/os_defines.h:39,
                 from /app/gcc/14.2.0/include/c++/14.2.0/x86_64-pc-linux-gnu/bits/c++config.h:680,
                 from /app/gcc/14.2.0/include/c++/14.2.0/type_traits:38,
                 from ../include/llvm/ADT/ADL.h:12,
                 from ../include/llvm/ADT/Hashing.h:47,
                 from ../include/llvm/ADT/ArrayRef.h:12,
                 from ../../cross-project-tests/debuginfo-tests/llvm-prettyprinters/lldb/arrayref.cpp:1:
/usr/include/features.h:381:4: warning: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Wcpp]
  381 | # warning _FORTIFY_SOURCE requires compiling with optimization (-O)
      | ^~~~~~~
```

This patch works around this by undefining the macro when compiling the LLDB formatter tests. If this ever becomes an issue we could try to detect `_FORTIFY_SOURCE` and skip the tests if set.

…matchesAnyListedRegexName (llvm#174414) This clarifies that patterns are regular expressions. Closes: llvm#174229

Instead of long text, use bullet points for readability.

…cing the result (llvm#174781) castAs<> will at least assert the cast is valid while getAs<> will always just return nullptr and then explode
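The difference between the two cast styles can be sketched as follows. This is a plain Python analogue rather than Clang code: `get_as` and `cast_as` are hypothetical helpers that only mimic the described behaviour (one returns a null result on mismatch, the other fails at the cast site).

```python
def get_as(value, target_type):
    """Checked lookup: returns None on mismatch, so the failure only
    shows up later, wherever the result is first dereferenced."""
    return value if isinstance(value, target_type) else None


def cast_as(value, target_type):
    """Asserting cast: a mismatch is reported immediately, at the cast."""
    if not isinstance(value, target_type):
        raise TypeError(
            f"expected {target_type.__name__}, got {type(value).__name__}"
        )
    return value


print(get_as(3, str))     # a silent None that a later use would trip over
print(cast_as("x", str))  # a valid cast passes the value through
```

When the result is used unconditionally, the asserting form is the better fit because the error points at the bad cast rather than at an unrelated null dereference.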

…tions into helper function

…m#174796)

…dules. (llvm#171769)" (llvm#174783) This reverts commit 1928c1e. We have at least one repro, but I won't be able to work on this until next week. Also with Clang 22 cut upcoming, we probably need to revert for now.

…uffer loads with GEPs in DXILResourceAccess pass (llvm#174666) Fixes llvm#174656 --------- Co-authored-by: Alex Sepkowski <alexsepkowski@gmail.com>

…lvm#174786) Not escaping < and > was causing the text not to get displayed in the documentation.

Fixes llvm#170777. If we don't use a vector type and instead continue to pass on the matrix type when we enter `EmitExtVectorElementExpr`, then we don't need to store the row and column length on the LValue. Using the matrix type means we can reuse the isMatrixRow() cases in EmitLoadOfLValue and EmitStoreThroughLValue and not have to support a new LValue that is a hybrid between the ExtVectorElt and MatrixRow cases. All we need to do to support this is pass the list of column indices as a `ConstantDataVector` and check the size of this vector to know how many column iterations we need to do. Further, we just index into the vector to fetch the right encoded element index value.

This adds a legalization pass to convert zero size arrays to legal types for common cases. It doesn't handle all cases, but if we see real use cases for other cases, we can add them in the future. For globals, and their initializers, we generally replace `[0 x T]` with `ptr`. For instructions, we replace `[0 x T]` with `poison`, except for `alloca`, where we just allocate `T`. This is motivated by IR generated by the OpenMP front end. Issue: llvm#170150 --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>

…() (llvm#171456) Reapply after additional fixes. ----- Disable implicit truncation in the ConstantInt constructor by default. This means that it needs to be passed a signed/unsigned (depending on the IsSigned flag) value matching the bit width. The intention is to prevent the recurring bug where people write something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty` is larger than 64-bit and then the value is incorrect due to missing type extension. This is the continuation of llvm#112670, which originally allowed implicit truncation in this constructor to reduce initial scope of the change.
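The recurring bug this change targets can be reproduced with plain integer arithmetic. The sketch below is a model only, assuming the value travels through a 64-bit unsigned parameter; `legacy_get` and `strict_get` are hypothetical names and do not reproduce the real `ConstantInt` API.

```python
MASK64 = (1 << 64) - 1


def fits(value, bit_width, is_signed):
    """True if `value` is representable in `bit_width` bits."""
    if is_signed:
        return -(1 << (bit_width - 1)) <= value < (1 << (bit_width - 1))
    return 0 <= value < (1 << bit_width)


def legacy_get(bit_width, value):
    """Modeled old behaviour: the value is squeezed through a 64-bit
    unsigned parameter, then silently truncated or zero-extended."""
    raw = value & MASK64
    return raw & ((1 << bit_width) - 1)


def strict_get(bit_width, value, is_signed=False):
    """Modeled new behaviour: reject values that do not fit the width
    under the requested signedness instead of truncating them."""
    if not fits(value, bit_width, is_signed):
        raise ValueError(f"{value} does not fit in {bit_width} bits")
    return value & ((1 << bit_width) - 1)


# -1 appears to work for a 32-bit type...
assert legacy_get(32, -1) == 0xFFFFFFFF
# ...but for a 128-bit type only the low 64 bits end up set.
assert legacy_get(128, -1) == (1 << 64) - 1
# Stating the signedness yields the intended all-ones pattern.
assert strict_get(128, -1, is_signed=True) == (1 << 128) - 1
```

In this model, calling `strict_get(32, -1)` without `is_signed=True` raises instead of producing a value, which is the kind of early failure the commit message is aiming for.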

Comgr :: cache-tests/spirv-translator-cached.cl Comgr :: spirv-tests/spirv-to-reloc-debuginfo.hip Comgr :: spirv-tests/spirv-to-reloc.hip Comgr :: spirv-tests/spirv-translator.cl Comgr :: spirv-tests/spirv-translator.hip

z1-cciauto · 2026-01-07T21:45:59Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3485

CarolineConcatto and others added 30 commits January 7, 2026 13:08

worklows/release-binaries: Add Windows release binary builds (llvm#15…

9363750

…0793) Windows x86 binaries will now be built and uploaded automatically when a new release is tagged.

[llvm-stress] Allow implicit truncation

faa7ede

This is generating a random integer, so truncating is fine. Fixes the issue reported in: llvm#171456 (comment)

Revert "[VectorCombine] Fold scalar selects from bitcast into vector …

1ab7b66

…select" (llvm#174758) Reverts llvm#173990 Reverting to address post-commit review feedback. Will recommit with fixes.

[clang][bytecode] Fix some imag/real corner cases (llvm#174764)

4fdbe05

Fix real/imag when taking a primitive parameter _and_ being discarded, and fix the case where their subexpression can't be classified. Fixes llvm#174668

SelectionDAG: Do not propagate divergence through glue (llvm#174766)

47a0d0e

Glue does not carry any value (in the LLVM IR Value sense) that could be considered uniform or divergent.

[cross-project-tests][formatters] Move LLDB test setup into it's own …

dd79244

…CMakeLists.txt Once we start adding more tests, having this in a separate CMakeLists.txt is more maintainable.

Revert "SelectionDAG: Do not propagate divergence through glue (llvm#…

e5623b1

…174766)" This reverts commit 47a0d0e. Reverted due to test failures in LLVM_ENABLE_EXPENSIVE_CHECKS builds.

[LLDB] Add MSVC STL span formatter (llvm#173053)

c7c5259

`std::span` didn't have a formatter for MSVC's STL yet. The type is quite useful in C++ 20, so this PR adds a formatter for it. Since the formatter is new, I made it work with both DWARF and PDB from the start.

[X86] packss.ll/packus.ll - add AVX512 test coverage (llvm#174777)

184e3c1

Pulled out of llvm#169995

[AArch64][llvm] Add codegen for simd fpcvt intrinsics (llvm#173272)

168de3c

Add tablegen patterns to provide codegen for SCVTF and UCVTF operating purely on SIMD & FP registers, using explicit bitcasts.

[mlir][spirv] Add support for GroupNonUniformQuadSwap (llvm#174747)

bc026e3

[libc++] Force rebuilding of the CI Docker image

d38959e

[NFCI][AMDGPU] Update Mode register mask for gfx1250 (llvm#174771)

7d9fda5

SPG says two bits for each operand.

[AArch64][SDAG] Select extractelement <vscale x 1 x i1> (llvm#173016)

6a37e3b

This used to cause an ISel failure.

[NFC] [AArch64] Add missing test to llvm#161840 (llvm#174775)

2495856

In patch llvm#161840 I added bitcasts when lowering some NEON int scalar nodes, but I didn't properly test that bitcasts are correctly emitted on the result as well. This patch adds those tests.

[AMDGPU] Remove some redundant SubtargetPredicate settings. NFC. (llv…

1355111

…m#174788) Setting SubtargetPredicate around these multiclasses is redundant since it is always explicitly overridden for every def inside the multiclass.

[clang-tidy] Rename clang::tidy::matchers::matchesAnyListedName() to …

abbd1eb

…matchesAnyListedRegexName (llvm#174414) This clarifies that patterns are regular expressions. Closes: llvm#174229

[clang-tidy][NFC] Improve readabilty of Release Notes (llvm#174686)

a23f7ef

Instead of long text, use bullet points for readability.

[ByteCode] InterpBuiltin.cpp - consistently use castAs<> if dereferen…

7490901

…cing the result (llvm#174781) castAs<> will at least assert the cast is valid while getAs<> will always just return nullptr and then explode

[cross-project-tests][formatters] Factor out setting the LLDB test op…

8fb7ed7

…tions into helper function

[docs] [clang-tidy] add abseil-unchecked-statusor-access to list (llv…

52f85b0

…m#174796)

arsenm and others added 11 commits January 7, 2026 17:34

InstCombine: Handle fmul in SimplifyDemandedFPClass (llvm#173872)

28de3c1

[DirectX] Account for GlobalOffset in CurrentIndex calculation for cb…

57b0d83

…uffer loads with GEPs in DXILResourceAccess pass (llvm#174666) Fixes llvm#174656 --------- Co-authored-by: Alex Sepkowski <alexsepkowski@gmail.com>

[mlir][spirv][nfc] Escape < and > with ` in description and summary (l…

971b823

…lvm#174786) Not escaping < and > was causing the text not to get displayed in the documentation.

merge main into amd-staging

1fca86e

Constant:get Xteams fix

0d74f44

cleanup revert_patches.txt

2d0a378

[spirv] 5 xfails

ecb433c

Comgr :: cache-tests/spirv-translator-cached.cl Comgr :: spirv-tests/spirv-to-reloc-debuginfo.hip Comgr :: spirv-tests/spirv-to-reloc.hip Comgr :: spirv-tests/spirv-translator.cl Comgr :: spirv-tests/spirv-translator.hip

ronlieb requested review from a team and dpalermo January 7, 2026 21:45

ronlieb requested review from antiagainst, kuhar, lamb-j and stellaraccident as code owners January 7, 2026 21:45

ronlieb removed request for antiagainst, kuhar and stellaraccident January 7, 2026 21:45

dpalermo approved these changes Jan 7, 2026


z1-cciauto merged commit 71b6473 into amd-staging Jan 8, 2026
29 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20260107111146-1 branch January 8, 2026 00:31


20 participants
