Conversation

@z1-cciauto
Collaborator

No description provided.

nikic and others added 30 commits January 6, 2026 09:11
The offset hint can be negative, so we should use getSigned() here. This
avoids an assertion failure with
llvm#171456.

Extend the dynamic_cast tests to include a 32-bit target to cover this
case.
The register containing the values stored by `QC.SWMI` needs to be
`GPRNoX0`.
…summaries (llvm#174398)

Depends on:
* llvm#174385

(only last commit is relevant for this review)

The `${var%s}` format isn't capable of formatting references to
C-strings. So the summary for those becomes `<no value available>`. This
patch prevents the system C-string formatter from applying to
references, which means the summary for such types will be empty. This
prompts LLDB to instead print the child, which is the referenced
C-string.

Before:
```
(lldb) v ref
(const char *&) ref = 0x000000016fdfe960 <no value available>
```

After:
```
(lldb) v ref
(const char *&) ref = 0x000000016fdfec40 (&ref = "hi")
```

An alternative would be to support references in the `ValueObject` dump
methods. We assume C-strings are pointers/arrays in a lot of places, so
such a fix would be a more intrusive undertaking, and I'm not sure we
would want to support references there in the first place. So for now I
went with the fallback logic in this PR.
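
For reference, a variable of the type shown in the output above can be produced by a snippet along these lines. This is only a minimal sketch: the function name `lengthThroughRef` and the local `str` are assumptions made for illustration and are not taken from the patch or its tests.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reproducer: `ref` has type `const char *&`, the type that
// the before/after output above is printed for.
std::size_t lengthThroughRef() {
  const char *str = "hi";
  const char *&ref = str; // stop after this line and run `v ref` in LLDB
  return std::strlen(ref); // use `ref` so the function has a checkable result
}
```

Stopping after the declaration of `ref` and running `v ref` should exercise the summary path described above.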
…lvm#173449)

## Summary

This PR fixes llvm#173370 in the `tosa-validate` pass that occurs when the
input IR does not contain any TOSA operations.

**Crash Message:**
```text
LLVM ERROR: can't create Attribute 'mlir::tosa::TargetEnvAttr' because storage uniquer isn't initialized: the dialect was likely not loaded, or the attribute wasn't added with addAttributes<...>() in the Dialect::initialize() method.
```
## Problem

When `mlir-opt` parses an input file without TOSA operations, the
`TosaDialect` is not lazily loaded. However, the `TosaValidation` pass
previously called `lookupTargetEnvOrDefault` (which attempts to create a
`TargetEnvAttr`) before checking if the dialect was loaded. This
resulted in an assertion failure because the attribute storage uniquer
was not initialized.

## Solution

I resolved the issue by placing the `TosaDialect` declaration at the
very top of `runOnOperation`.
This ensures that `lookupTargetEnvOrDefault` is not accessed when the
dialect is uninitialized, preventing the crash.


## Test

Added two tests in `mlir/test/Dialect/Tosa/tosa_validation_init.mlir`.

The first case has no TOSA operations.

```
// CHECK-LABEL: func.func @test_validation_pass_init
func.func @test_validation_pass_init(%arg0: tensor<1xf32>) -> tensor<1xf32> {
  // CHECK: math.asin
  %0 = math.asin %arg0 : tensor<1xf32>
  return %0 : tensor<1xf32>
}
```

The second case contains a TOSA operation.
```
// CHECK-LABEL: func.func @test_tosa_ops
func.func @test_tosa_ops(%arg0: tensor<1x2x3x4xf32>, %arg1: tensor<1x2x3x4xf32>) -> tensor<1x2x3x4xf32> {
  // CHECK: tosa.add
  %0 = tosa.add %arg0, %arg1 : (tensor<1x2x3x4xf32>, tensor<1x2x3x4xf32>) -> tensor<1x2x3x4xf32>
  return %0 : tensor<1x2x3x4xf32>
}
```
…requirements (llvm#174509)

Replace version matching with the new decorator to prevent typos, and
to make it clear why we skip the test.
…() (llvm#171456)

Reapply after additional fixes in llvm#174426 and llvm#174431.

-----

Disable implicit truncation in the ConstantInt constructor by default.
This means that it needs to be passed a signed/unsigned (depending on
the IsSigned flag) value matching the bit width.

The intention is to prevent the recurring bug where people write
something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty`
is larger than 64-bit and then the value is incorrect due to missing
type extension.

This is the continuation of
llvm#112670, which originally
allowed implicit truncation in this constructor to reduce initial scope
of the change.
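
The failure mode can be illustrated without any LLVM types. The sketch below is an assumption-laden model, not LLVM code: `Wide128`, `zeroExtend`, `signExtend` and `isAllOnes` are hypothetical names, and the 128-bit value is modelled as two 64-bit words. It shows that a `-1` passed through an unsigned 64-bit parameter is no longer all-ones once it is widened without sign extension.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value stored as two 64-bit words (not an LLVM type).
struct Wide128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Zero extension: the upper word is always cleared.
constexpr Wide128 zeroExtend(std::uint64_t v) { return {v, 0}; }

// Sign extension: the upper word replicates the sign bit of the input.
constexpr Wide128 signExtend(std::int64_t v) {
  return {static_cast<std::uint64_t>(v), v < 0 ? ~std::uint64_t{0} : 0};
}

// True when every bit of the 128-bit value is set, i.e. the value is -1.
constexpr bool isAllOnes(Wide128 w) {
  return w.lo == ~std::uint64_t{0} && w.hi == ~std::uint64_t{0};
}

// A -1 that went through an unsigned 64-bit parameter is not -1 at 128 bits.
static_assert(!isAllOnes(zeroExtend(static_cast<std::uint64_t>(-1))),
              "zero extension loses the sign");
// The signed path keeps the intended value.
static_assert(isAllOnes(signExtend(-1)), "sign extension keeps -1");
```

At 64 bits or fewer both paths produce the same low word, which is why the mistake goes unnoticed for common integer types.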
This patch follows PR #421 [1] of the ACLE.

These two FP8 intrinsics had the `[_single]` infix removed from their
names: from
``svmla[_single]_za16[_mf8]_vg2x1_fpm`` to
``svmla_za16[_mf8]_vg2x1_fpm`` and from
``svmla[_single]_za32[_mf8]_vg4x1_fpm`` to
``svmla_za32[_mf8]_vg4x1_fpm``

[1] ARM-software/acle#421
…imes layout (llvm#172316)

Fixes llvm#172024

This is something a lot of people can probably figure out themselves but
having this obvious wrong turn in the getting started document isn't a
good first impression.

So I've added a note to highlight how to deal with it.

I don't want to go into detail there about the layout itself, but it
should be enough that people know to check by listing the contents of
the lib/ folder.
…X,Y)` fold to SDPatternMatch. NFC. (llvm#174554)

Merge the pair of commuted patterns.
…d code (llvm#174105)

We add barriers to the firstprivate copy region when they are required
to avoid a race condition with the lastprivate clause.

The problem is that these barriers are added by the compiler, not implied
by user code, so it is the compiler's problem to avoid deadlock.

I came across a testcase whilst working on taskloop support that looks a
bit like this
```
!$omp parallel
  !$omp single
    !$omp taskloop firstprivate(a) lastprivate(a)
      ...
  !$omp end single
!$omp end parallel
```

The `parallel`/`single` nesting is there so that there are multiple
threads for the generated tasks to be distributed over, but the tasks
are not generated afresh in every thread.

The problem comes when the taskloop requires a barrier to prevent the
datarace between firstprivate and lastprivate. This barrier will then be
generated inside of SINGLE and so only one thread will encounter the
barrier: leading to a deadlock.

This patch works around the problem by detecting this situation
statically and then not generating the barrier. There are cases where we
cannot detect this statically (e.g. if the TASKLOOP is inside a function
call inside of SINGLE). The program will still deadlock in this case
after my patch. I'm unsure what the solution would be for that case. I
want to fix this simple case in LLVM 22 before engaging in a longer
discussion as to whether there is a better way to handle the more
general case.

Testing using wsloop because I want to land this (or not) independently
of taskloop. Note that for wsloop it would be up to the programmer to
remember to use the nowait clause, but nowait cannot be used to control
generation of this barrier because it refers to the barrier after the
construct not after firstprivate copyin (before the construct
execution).
)

Reverts llvm#172477

This is causing failures for RVA23 (including some tests running away in
their execution causing OOM, hence the builder dying). I will attempt to
follow up on the PR with a reproducer of some kind.
https://lab.llvm.org/buildbot/#/builders/210/builds/7243
This patch adds intrinsics for the crypto instructions defined in the
ACLE proposal ARM-software/acle#411.
…llvm#174421)

llvm-mca currently attempts to read the input file (or stdin) even when
invoked with -mcpu=help. When the input is stdin, this causes the tool
to block unless an empty stdin is provided.

This patch now allows the available CPUs/features to be printed without
requiring stdin, while existing behaviour for all other invocations
still requires stdin.

- mcpu-help.test has been added
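
The control flow described above could look roughly like the following. This is a hedged sketch only: `wantsCpuHelp` and `needsInput` are hypothetical helpers invented for illustration and do not reflect llvm-mca's actual option handling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not llvm-mca's actual code: reports whether the
// command line asks for the list of available CPUs.
inline bool wantsCpuHelp(const std::vector<std::string> &args) {
  for (const std::string &arg : args)
    if (arg == "-mcpu=help")
      return true;
  return false;
}

// Hypothetical driver step: the input file (or stdin) is only needed when
// no help listing was requested.
inline bool needsInput(const std::vector<std::string> &args) {
  return !wantsCpuHelp(args);
}
```

The point of the ordering is that the help check happens before any attempt to open the input, so a blocking read on stdin is never reached for `-mcpu=help`.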

Follow-on from reverted
llvm#173399.

Implements @mshockwave's suggestion.
This patch was generated by the following commands:
1. `npm install --save-dev prettier-plugin-organize-imports`
2. `npm run format`
3. `npm audit fix`

It partially addresses
[issue](llvm#151598) and improves
the quality of the TypeScript code (formatting and unused imports).
…lvm#169445)

In llvm#168534 we made the
`TypePrinter` re-use `printNestedNameSpecifier` for printing scopes.
However, the two print the names of anonymous/unnamed types in slightly
inconsistent ways.

`printNestedNameSpecifier` calls all `TagType`s without an identifier
`(anonymous)`. On the other hand, `TypePrinter` prints them slightly
more accurately (it differentiates anonymous vs. unnamed decls) and allows
for some additional customization points. E.g., with `MSVCFormatting`,
it will print `` `unnamed struct'`` instead of `(unnamed struct)`.
`printNestedNameSpecifier` already accounts for `MSVCFormatting` for
namespaces, but doesn't for `TagType`s. This inconsistency means that if
an unnamed tag is printed as part of a scope then it's displayed as
`(anonymous struct)`, but if it's the entity whose scope is being
printed, then it shows as `(unnamed struct)`.
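
To make the anonymous vs. unnamed distinction concrete, here is a small sketch. The names `Outer`, `u` and `sumFields` are made up for illustration, and the comments reflect the usual C++ terminology as I understand it, not output copied from clang.

```cpp
#include <cassert>

struct Outer {
  // Unnamed struct: the type has no tag, but it declares a member `u`,
  // so its fields are reached through that member.
  struct {
    int x;
  } u;

  // Anonymous union: no tag and no declarator; its members are accessed
  // directly as members of Outer.
  union {
    int i;
    float f;
  };
};

// Reads one field through the unnamed struct member and one field of the
// anonymous union.
inline int sumFields(const Outer &o) { return o.u.x + o.i; }
```

Both entities lack an identifier, which is why a printer that only checks for a missing name cannot tell them apart.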

This patch moves the printing of anonymous/unnamed tags into
`TagDecl::printName`. All the callsites that previously printed
anonymous tag decls now call `printName` to handle it. To preserve the
behaviour of not printing the kind name (i.e., `struct`/`class`/`enum`)
when printing the inner type of an elaborated type (i.e., avoiding
`struct (unnamed struct)`), this patch adds a
`PrintingPolicy::SuppressTagKeywordInAnonNames` that is appropriately
set when we want to suppress the tag keyword inside the anonymous name.
I had to make sure we set this bit to `false` when printing
nested-name-specifiers because we always want the tag keyword there
(e.g., `foo::(anonymous struct)::bar`) and for a `clangd` special case
which is described in a comment in the source.

**Test changes**

Mostly we now more accurately print the kind name of anonymous entities.
So there's a lot of `anonymous` -> `unnamed` changes. There are a
handful of `clangd` tests where the name of the entity is now `(unnamed
struct)` instead of just `(unnamed)`. That should be consistent with how
we choose to omit the tag keyword elsewhere. Since we're just printing
the name of the entity here, we include the kind tag.
…m#174527)

For the dl `__builtin_amdgcn_fdot2` builtins, use 'x' in the def so that
they take `_Float16` for HIP/C++ and `half` for OpenCL.
Make llvm-exegesis more usable on AArch64 by doing the following:

Add some missing exegesis handling of register classes;
Add some missing LLVM AArch64 OperandTypes.

llvm-exegesis can now handle many more AArch64 instructions.

AArch64 load/store instructions are not yet supported by llvm-exegesis,
until llvm#144895 lands.

---------

Co-authored-by: Cullen Rhodes <cullen.rhodes@arm.com>
This change allows using clang's `-ffat-lto-objects` flag with COFF
targets such as `i386-pc-win32`.

Follow-up to 759fb0a from which it was
split off. The added tests are adapted from the pre-existing ELF tests.
…les (llvm#172042)

[llvm#172040](llvm#172040)

This patch implements the scripts for generating the lookup tables and
associated utils for wctype classification functions. Not all Unicode
properties are covered, as not all of them need a lookup table; the rest
will be hardcoded. The size of the generated tables is 47.8 KB.
Some of the MIR tests hit a bug that causes an error if there is a
raw global reference as the referenced value. Worked around some
of those by just keeping a no-op bitcast constant expression.
…lvm#174569)

Reapply the zero handling, reverted in
108a22e

The failing libc test should have been fixed by
e25eacf
This patch extends the X86CompressEVEX pass to recognize and compress multi-instruction masking patterns to MOVMSK instructions. 

Fixes llvm#171746
…Int::get() (llvm#171456)"

This reverts commit d189b49.

Still causes assertion failures on some buildbots.
…m#174404)

This lets us properly annotate ranges for gpu.cluster_block_id and
gpu.cluster_dim_blocks. It also allows us to fill in the
nvvm.cluster_dim attribute for use in the NVVM backend.
@z1-cciauto z1-cciauto requested a review from fabianmcg as a code owner January 6, 2026 12:06
@z1-cciauto z1-cciauto requested a review from a team January 6, 2026 12:06
@z1-cciauto z1-cciauto merged commit 57baf6b into amd-staging Jan 6, 2026
14 checks passed
@z1-cciauto z1-cciauto deleted the upstream_merge_202601060706 branch January 6, 2026 14:49