
Conversation

@ronlieb ronlieb commented Jan 7, 2026

No description provided.

CarolineConcatto and others added 30 commits January 7, 2026 13:08
llvm#170356)

…rinsics

This patch adds Clang support for the assembly instructions
FCVTXNT, FCVTLT, and {B}FCVTNT
by implementing these prototypes:

// Variant is available for _f64_f32
svfloat32_t svcvtlt_f32[_f16]_z(svbool_t pg, svfloat16_t op);

// Variants are available for:
// _f32_f64, _bf16_f32
svfloat16_t svcvtnt_f16[_f32]_z(svfloat16_t even, svbool_t pg, svfloat32_t op);

svfloat32_t svcvtxnt_f32[_f64]_z(svfloat32_t even, svbool_t pg, svfloat64_t op);

according to the ACLE[1]

[1] ARM-software/acle#412

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…0793)

Windows x86 binaries will now be built and uploaded automatically when a new release is tagged.
This is generating a random integer, so truncating is fine.

Fixes the issue reported in:
llvm#171456 (comment)
…select" (llvm#174758)

Reverts llvm#173990
Reverting to address post-commit review feedback. Will recommit with
fixes.
Fix real/imag when taking a primitive parameter _and_ being discarded,
and fix the case where their subexpression can't be classified.

Fixes llvm#174668
…gion simplification (llvm#173505)

This commit simplifies the `remove-dead-values` pass and fixes a bug in
the handling of `RegionBranchOpInterface` ops. The pass used to produce
invalid IR ("null value found") for the newly added test case.

`remove-dead-values` is a pass for additional IR simplification that
cannot be performed by the canonicalizer pass. Based on a liveness
analysis, it erases dead values / IR. (The liveness analysis is a
dataflow analysis that has more information about the IR than a
canonicalization pattern, which can see only "local" information.)

Region-based ops are difficult. The liveness analysis may determine that
an SSA value is dead. However, that does not mean that the value can
actually be removed. Doing so may violate a region data flow (as
modeled by the `RegionBranchOpInterface`). As an example, consider the
case where a region branch terminator may dispatch to one of two region
successors with the same forwarded values. A successor input (block
argument) can be erased only if it is dead on both successors.

Before this commit, there used to be complex logic to determine when it
is safe to erase an SSA value. That logic was broken. The new
implementation does not remove any block arguments or op results of
region-based ops. Instead, operands of region-based ops and region
branch terminators are replaced with `ub.poison` if all of their
successor values are dead. This simplifies the IR well enough for the
canonicalizer to perform the remaining region simplification (i.e.,
dropping block arguments etc.).
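
The rule described above can be sketched with a toy model. This is not MLIR code: the successor names, the `%a`-style operand strings, and the `poison_dead_operands` helper are all hypothetical, and liveness is simply given as a list of booleans per successor. The sketch only shows the decision "replace a forwarded operand with `ub.poison` when the matching successor input is dead in every successor".

```python
# Toy model only, not the MLIR implementation.
# `successors` maps a (hypothetical) successor name to the liveness of its
# inputs, listed in the same order as the forwarded operands.
POISON = "ub.poison"


def poison_dead_operands(operands, successors):
    """Replace operand i with POISON when input i is dead in every successor."""
    result = []
    for i, operand in enumerate(operands):
        live_somewhere = any(liveness[i] for liveness in successors.values())
        result.append(operand if live_somewhere else POISON)
    return result


forwarded = ["%a", "%b", "%c"]
liveness = {
    "then_region": [True, False, False],
    "else_region": [False, True, False],
}
print(poison_dead_operands(forwarded, liveness))
```

Here `%b` is dead in `then_region` but live in `else_region`, so it is kept; only `%c`, which is dead on both successors, is replaced.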

RFC:
https://discourse.llvm.org/t/rfc-delegate-simplification-of-region-based-ops-from-remove-dead-values-to-canonicalizer/89194
Glue does not carry any value (in the LLVM IR Value sense) that could be
considered uniform or divergent.
…CMakeLists.txt

Once we start adding more tests, having this in a separate CMakeLists.txt is more maintainable.
…74022)

Summary:
The other GPU-enabled libraries (openmp, flang-rt, compiler-rt, libc,
libcxx, libcxx-abi) all support builds through a runtime cross-build. In
these builds we use a separate CMake build that cross-compiles to a
single target.

This patch provides basic support for this with the `libclc` libraries.
Changes include adding support for the more standard GPU compute triples
(amdgcn-amd-amdhsa, nvptx64-nvidia-cuda) and building only one target in
this mode.

Some things left to do:

This patch does not change the compiler invocations. This method would
allow us to use standard CMake routines, but leaving them alone keeps
the patch minimal.

The prebuild support is questionable and doesn't fit into this scheme
because it's a host executable, so I'm ignoring it for now.

The installed location should just use the triple with no `libclc/`
subdirectory handling I believe.
…174766)"

This reverts commit 47a0d0e.

Reverted due to test failures in LLVM_ENABLE_EXPENSIVE_CHECKS builds.
`std::span` didn't have a formatter for MSVC's STL yet. The type is
quite useful in C++20, so this PR adds a formatter for it.

Since the formatter is new, I made it work with both DWARF and PDB from
the start.
…172514)

This patch updates the legalization of spv_insertelt and spv_extractelt
to handle non-constant (dynamic) indices. When a dynamic index is
encountered, the vector is spilled to the stack, and the element is
accessed via OpAccessChain (lowered from spv_gep).

This patch also adds custom legalization for G_STORE to scalarize vector
stores and refines the legalization rules for G_LOAD, G_STORE, and
G_BUILD_VECTOR.

Fixes llvm#170534
…4756)

This PR is quite similar to llvm#174700.

In this PR, I added a C API for each upstream MLIR attribute to
retrieve its name (for example, `StringAttr -> mlirStringAttrGetName()
-> "builtin.string"`), and exposed a corresponding `attr_name` class
attribute in the Python bindings (e.g., `StringAttr.attr_name ->
"builtin.string"`). This can be used in various places to avoid
hard-coded strings, such as eliminating the manual string in
`irdl.base("#builtin.string")`.

Note that parts of this PR (mainly mechanical changes) were produced via
GitHub Copilot and GPT-5.2. I have manually reviewed the changes and
verified them with tests to ensure correctness.
Add tablegen patterns to provide codegen for SCVTF and UCVTF
operating purely on SIMD & FP registers, using explicit bitcasts.
I know this is required for at least one feature, because
TestSectionAPI.py has failures if zlib isn't enabled. So I think it's
useful for users to be able to check.

Now that it's in the config, I have also used it to make a test
annotation so we don't get the failure in TestSectionAPI.py when zlib is
disabled.

Which for future reference was:
Traceback (most recent call last):
  File "/home/davspi01/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 452, in wrapper
    return func(self, *args, **kwargs)
  File "/home/davspi01/llvm-project/lldb/test/API/python_api/section/TestSectionAPI.py", line 67, in test_compressed_section_data
    self.assertEqual(section_data, [0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90])
AssertionError: Lists differ: [] != [32, 48, 64, 80, 96, 112, 128, 144]

As it failed to decode the compressed section.
…he `Format` API into it (llvm#174618)

This patch creates a new `FormatEntity::Formatter` class and moves
`FormatEntity::Format` (and related APIs) into it. Most of the
parameters to `Format` are immutable across all recursive calls, so I
made them `const` member variables of `Formatter`. The main changes are
just mechanical renaming of:
```
FormatEntity::Format(...)
```
to
```
FormatEntity::Formatter(...).Format(stream, entry, valobj)
```
and making use of the member variables from inside `Format`.

We can probably make most of the parameters to the `Formatter`
constructor defaulted, but I chose not to in this patch to keep the diff
smaller.

The motivation for this is that I'm planning on adding logic to detect
recursive format entities (which would crash LLDB). That requires some
state, which in my opinion is best kept inside the `Formatter` class
instead of another parameter to `Format`.

The patch should be entirely NFC.
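
The shape of this refactoring can be illustrated with a small sketch. It is not LLDB code: the `Formatter` class below, its `variables` and `missing` fields, and the entry encoding (a string literal, a `("var", name)` tuple, or a list of child entries) are hypothetical stand-ins. The point is only that state which never changes across recursive calls lives in immutable members, while the per-call arguments stay as parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formatter:
    # Hypothetical stand-ins for parameters that are the same in every
    # recursive call; frozen=True makes the fields read-only after construction.
    variables: dict
    missing: str = "<unavailable>"

    def format(self, out, entry):
        """Append the formatted form of `entry` to the list `out`."""
        if isinstance(entry, str):
            out.append(entry)                      # literal text
        elif isinstance(entry, tuple):
            _, name = entry                        # ("var", name)
            out.append(str(self.variables.get(name, self.missing)))
        else:
            for child in entry:                    # nested entries recurse
                self.format(out, child)


out = []
Formatter({"name": "x", "value": 42}).format(
    out, ["(", ("var", "name"), " = ", [("var", "value")], ")"]
)
print("".join(out))
```

Any extra state needed later, such as bookkeeping to detect recursive entries, could be added to the class without widening the signature of `format`.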
In patch llvm#161840 I added bitcasts when lowering some NEON int scalar
nodes, but I didn't properly test that bitcasts are correctly emitted
on the result as well. This patch adds those tests.
This patch adds the /linkreprofullpathrsp flag with the same behaviour
as link.exe. This flag emits a file containing the full paths to each
object passed to the link line.

This is used in particular when linking Arm64X binaries, as you need the
full path to all the Arm64 objects that were used in a standard Arm64
build.

See:

https://learn.microsoft.com/en-us/cpp/build/reference/link-repro-full-path-rsp
for the Microsoft documentation of the flag.
…m#174788)

Setting SubtargetPredicate around these multiclasses is redundant since
it is always explicitly overridden for every def inside the multiclass.
…rmatter tests (llvm#174770)

When building `cross-project-tests` with `_FORTIFY_SOURCE` set, we get
the following warning:
```
In file included from /app/gcc/14.2.0/include/c++/14.2.0/x86_64-pc-linux-gnu/bits/os_defines.h:39,
                 from /app/gcc/14.2.0/include/c++/14.2.0/x86_64-pc-linux-gnu/bits/c++config.h:680,
                 from /app/gcc/14.2.0/include/c++/14.2.0/type_traits:38,
                 from ../include/llvm/ADT/ADL.h:12,
                 from ../include/llvm/ADT/Hashing.h:47,
                 from ../include/llvm/ADT/ArrayRef.h:12,
                 from ../../cross-project-tests/debuginfo-tests/llvm-prettyprinters/lldb/arrayref.cpp:1:
/usr/include/features.h:381:4: warning: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Wcpp]
  381 | #  warning _FORTIFY_SOURCE requires compiling with optimization (-O)
      |    ^~~~~~~
```

This patch works around this by undefining the macro when compiling the
LLDB formatter tests.

If this ever becomes an issue we could try to detect `_FORTIFY_SOURCE`
and skip the tests if set.
…matchesAnyListedRegexName (llvm#174414)

This clarifies that patterns are regular expressions.

Closes: llvm#174229
Instead of long text, use bullet points for readability.
…cing the result (llvm#174781)

castAs<> will at least assert the cast is valid while getAs<> will always just return nullptr and then explode
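
The difference can be sketched with a toy model. The classes and the `get_as` / `cast_as` helpers below are hypothetical Python stand-ins, not Clang's API; they only mimic "return nothing on mismatch" versus "fail at the cast site".

```python
class Type:
    pass


class RecordType(Type):
    def __init__(self, name):
        self.name = name


class BuiltinType(Type):
    pass


def get_as(value, cls):
    """Checked query: returns None when `value` is not a `cls`."""
    return value if isinstance(value, cls) else None


def cast_as(value, cls):
    """Asserting cast: fails right here when `value` is not a `cls`."""
    if not isinstance(value, cls):
        raise AssertionError(f"invalid cast to {cls.__name__}")
    return value


ty = BuiltinType()
maybe_record = get_as(ty, RecordType)   # None; using it unchecked fails later
```

With `get_as`, the failure only shows up wherever `maybe_record` is first dereferenced, which may be far from the faulty cast. With `cast_as`, the error points at the cast itself.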
arsenm and others added 11 commits January 7, 2026 17:34
…dules. (llvm#171769)" (llvm#174783)

This reverts commit 1928c1e.

We have at least one repro, but I won't be able to work on this until
next week. Also with Clang 22 cut upcoming, we probably need to revert
for now.
…uffer loads with GEPs in DXILResourceAccess pass (llvm#174666)

Fixes llvm#174656

---------

Co-authored-by: Alex Sepkowski <alexsepkowski@gmail.com>
…lvm#174786)

Not escaping < and > was causing the text not to get displayed in the
documentation.
fixes llvm#170777

If we don't use the vector type and instead continue to pass on the
matrix type when we enter `EmitExtVectorElementExpr`, then we don't need
to store the row and column length on the LValue.

Using the Matrix type means we can reuse the isMatrixRow() cases in
EmitLoadOfLValue and EmitStoreThroughLValue and not have to support a
new lValue that is a hybrid between the ExtVectorElt and MatrixRow
cases.

All we need to do to support this is pass the list of column indices as
a `ConstantDataVector` and check the size of this vector to know how
many column iterations we need to do. We then index into the vector
to fetch the right encoded element index value.
This adds a legalization pass to convert zero size arrays to legal types
for common cases. It doesn't handle all cases, but if we see real use
cases for other cases, we can add them in the future.

For globals, and their initializers, we generally replace `[0 x T]` with
`ptr`.

For instructions, we generally replace `[0 x T]` with `poison`; for
`alloca`, we just allocate `T`.
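
The replacement rules above can be summarized in a small sketch. This is a toy model working on textual type strings, not the pass itself: the `legalize` helper, its context labels (`"global"`, `"alloca"`, `"instruction"`), and the returned action names are all hypothetical.

```python
import re

# Matches the textual LLVM IR type `[0 x T]` and captures T.
ZERO_SIZE_ARRAY = re.compile(r"^\[0 x (.+)\]$")


def legalize(type_text, context):
    """Return an (action, text) pair for a type seen in the given context."""
    match = ZERO_SIZE_ARRAY.match(type_text)
    if match is None:
        return ("keep", type_text)            # not a zero-size array
    if context == "global":
        return ("use_type", "ptr")            # globals and their initializers
    if context == "alloca":
        return ("allocate", match.group(1))   # allocate the element type T
    return ("use_value", "poison")            # other instructions


print(legalize("[0 x i32]", "global"))
print(legalize("[0 x [4 x i32]]", "alloca"))
print(legalize("[0 x float]", "instruction"))
```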

This is motivated by IR generated by the OpenMP front end.

Issue: llvm#170150

---------

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
…() (llvm#171456)

Reapply after additional fixes.

-----

Disable implicit truncation in the ConstantInt constructor by default.
This means that it needs to be passed a signed/unsigned (depending on
the IsSigned flag) value matching the bit width.

The intention is to prevent the recurring bug where people write
something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty`
is larger than 64-bit and then the value is incorrect due to missing
type extension.

This is the continuation of
llvm#112670, which originally
allowed implicit truncation in this constructor to reduce initial scope
of the change.
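
The bug class described above can be reproduced with plain integer arithmetic. The sketch below is a toy model, not LLVM's implementation: `widen_u64_payload` and `fits` are hypothetical helpers that mimic passing a 64-bit payload into a wider type and checking that a value matches a bit width for a given signedness.

```python
def widen_u64_payload(raw_u64, bit_width, is_signed):
    """Model turning a 64-bit payload into a `bit_width`-bit constant."""
    if is_signed and raw_u64 >> 63:
        raw_u64 -= 1 << 64                  # reinterpret as negative (sign-extend)
    return raw_u64 & ((1 << bit_width) - 1)


def fits(value, bit_width, is_signed):
    """True when `value` is representable without implicit truncation."""
    if is_signed:
        return -(1 << (bit_width - 1)) <= value < (1 << (bit_width - 1))
    return 0 <= value < (1 << bit_width)


minus_one_as_u64 = -1 & ((1 << 64) - 1)     # what an unsigned 64-bit parameter holds
all_ones_128 = (1 << 128) - 1               # the intended 128-bit value for -1

zero_extended = widen_u64_payload(minus_one_as_u64, 128, is_signed=False)
sign_extended = widen_u64_payload(minus_one_as_u64, 128, is_signed=True)
print(zero_extended == all_ones_128)        # the "works until wider than 64-bit" case
print(sign_extended == all_ones_128)
```

In this model, `fits(-1, 8, is_signed=False)` is false while `fits(-1, 8, is_signed=True)` is true, which corresponds to requiring a value that matches both the bit width and the signedness flag.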
 Comgr :: cache-tests/spirv-translator-cached.cl
 Comgr :: spirv-tests/spirv-to-reloc-debuginfo.hip
 Comgr :: spirv-tests/spirv-to-reloc.hip
 Comgr :: spirv-tests/spirv-translator.cl
 Comgr :: spirv-tests/spirv-translator.hip
@z1-cciauto z1-cciauto merged commit 71b6473 into amd-staging Jan 8, 2026
29 checks passed
@z1-cciauto z1-cciauto deleted the amd/merge/upstream_merge_20260107111146-1 branch January 8, 2026 00:31