forked from llvm/llvm-project
merge main into amd-staging #1011
Merged
The offset hint can be negative, so we should use getSigned() here. This avoids an assertion failure with llvm#171456. Extend the dynamic_cast tests to include a 32-bit target to cover this case.
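To illustrate why the signedness matters here, the following is a minimal standalone sketch, not LLVM code. The helpers `fitsUnsigned` and `fitsSigned` are hypothetical names that model the kind of range check a strict constant constructor performs once implicit truncation is no longer allowed. A negative hint such as -1, carried in a 64-bit variable, is not a valid unsigned 32-bit value, but it is a valid signed one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does `value` fit in `bits` bits as an unsigned number?
bool fitsUnsigned(uint64_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 64 && "bit width out of range for this sketch");
  if (bits == 64)
    return true;
  return value < (uint64_t{1} << bits);
}

// Hypothetical helper: does `value` fit in `bits` bits as a signed number?
bool fitsSigned(int64_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 64 && "bit width out of range for this sketch");
  if (bits == 64)
    return true;
  const int64_t min = -(int64_t{1} << (bits - 1));
  const int64_t max = (int64_t{1} << (bits - 1)) - 1;
  return value >= min && value <= max;
}
```

With these helpers, `fitsUnsigned(static_cast<uint64_t>(int64_t{-1}), 32)` is false while `fitsSigned(-1, 32)` is true, which is the situation a 32-bit target exposes.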
The register containing the values stored by `QC.SWMI` needs to be `GPRNoX0`.
…summaries (llvm#174398)

Depends on:
* llvm#174385 (only last commit is relevant for this review)

The `${var%s}` format isn't capable of formatting references to C-strings, so the summary for those becomes `<no value available>`. This patch prevents the system C-string formatter from applying to references, which means the summary for such types will be empty. This prompts LLDB to instead print the child, which is the referenced C-string.

Before:
```
(lldb) v ref
(const char *&) ref = 0x000000016fdfe960 <no value available>
```

After:
```
(lldb) v ref
(const char *&) ref = 0x000000016fdfec40 (&ref = "hi")
```

An alternative would be to support references in the `ValueObject` dump methods. We assume C-strings are pointers/arrays in a lot of places, so such a fix would be a more intrusive undertaking, and I'm not sure we would want to support references there in the first place. So for now I went with the fallback logic in this PR.
…lvm#173449)

## Summary

This PR fixes llvm#173370, a crash in the `tosa-validate` pass that occurs when the input IR does not contain any TOSA operations.

**Crash Message:**
```text
LLVM ERROR: can't create Attribute 'mlir::tosa::TargetEnvAttr' because storage uniquer isn't initialized: the dialect was likely not loaded, or the attribute wasn't added with addAttributes<...>() in the Dialect::initialize() method.
```

## Problem

When `mlir-opt` parses an input file without TOSA operations, the `TosaDialect` is not lazily loaded. However, the `TosaValidation` pass previously called `lookupTargetEnvOrDefault` (which attempts to create a `TargetEnvAttr`) before checking if the dialect was loaded. This resulted in the crash above because the attribute storage uniquer was not initialized.

## Solution

I resolved the issue by placing the `TosaDialect` declaration at the very top of `runOnOperation`. This ensures that `lookupTargetEnvOrDefault` is not accessed when the dialect is uninitialized, preventing the crash.

## Test

Added two tests in `mlir/test/Dialect/Tosa/tosa_validation_init.mlir`.

The first case has no TOSA operations.
```
// CHECK-LABEL: func.func @test_validation_pass_init
func.func @test_validation_pass_init(%arg0: tensor<1xf32>) -> tensor<1xf32> {
  // CHECK: math.asin
  %0 = math.asin %arg0 : tensor<1xf32>
  return %0 : tensor<1xf32>
}
```

The second case contains a TOSA operation.
```
// CHECK-LABEL: func.func @test_tosa_ops
func.func @test_tosa_ops(%arg0: tensor<1x2x3x4xf32>, %arg1: tensor<1x2x3x4xf32>) -> tensor<1x2x3x4xf32> {
  // CHECK: tosa.add
  %0 = tosa.add %arg0, %arg1 : (tensor<1x2x3x4xf32>, tensor<1x2x3x4xf32>) -> tensor<1x2x3x4xf32>
  return %0 : tensor<1x2x3x4xf32>
}
```
…requirements (llvm#174509) Replace version matching with the new decorator to prevent typos and to make it clear why we skipped the test.
…() (llvm#171456) Reapply after additional fixes in llvm#174426 and llvm#174431. ----- Disable implicit truncation in the ConstantInt constructor by default. This means that it needs to be passed a signed/unsigned (depending on the IsSigned flag) value matching the bit width. The intention is to prevent the recurring bug where people write something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty` is larger than 64-bit and then the value is incorrect due to missing type extension. This is the continuation of llvm#112670, which originally allowed implicit truncation in this constructor to reduce initial scope of the change.
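The failure mode described above can be reproduced without LLVM. The sketch below models a 128-bit integer as two 64-bit words; the `Wide128` type and the helper names are made up for illustration and are not part of any LLVM API. It shows that widening the 64-bit pattern of -1 without sign extension leaves the upper word zero, so the result is 2^64 - 1 rather than -1.

```cpp
#include <cstdint>

// Hypothetical 128-bit value stored as two 64-bit words.
struct Wide128 {
  uint64_t lo;
  uint64_t hi;
};

// Widen by zero extension: the upper word is always zero.
Wide128 widenUnsigned(uint64_t value) { return {value, 0}; }

// Widen by sign extension: the upper word replicates the sign bit.
Wide128 widenSigned(int64_t value) {
  const uint64_t hi = value < 0 ? ~uint64_t{0} : uint64_t{0};
  return {static_cast<uint64_t>(value), hi};
}

// True if every bit is set, i.e. the two's-complement encoding of -1.
bool isAllOnes(const Wide128 &v) {
  return v.lo == ~uint64_t{0} && v.hi == ~uint64_t{0};
}
```

Only `widenSigned(-1)` produces the all-ones pattern; passing the same bits through `widenUnsigned` silently yields a different value, which is the class of bug that a strict constructor is meant to surface early.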
This patch follows PR #421 [1] from the ACLE. Two FP8 intrinsics had `[_single]` removed from their names: from ``svmla[_single]_za16[_mf8]_vg2x1_fpm`` to ``svmla_za16[_mf8]_vg2x1_fpm`` and from ``svmla[_single]_za32[_mf8]_vg4x1_fpm`` to ``svmla_za32[_mf8]_vg4x1_fpm``. [1] ARM-software/acle#421
…imes layout (llvm#172316) Fixes llvm#172024 This is something a lot of people can probably figure out themselves but having this obvious wrong turn in the getting started document isn't a good first impression. So I've added a note to highlight how to deal with it. I don't want to go into detail there about the layout itself, but it should be enough that people know to check by listing the contents of the lib/ folder.
…X,Y)` fold to SDPatternMatch. NFC. (llvm#174554) Merge the pair of commuted patterns.
…d code (llvm#174105)

We add barriers to the firstprivate copy region when they are required to avoid a race condition with the lastprivate clause. The problem is that these barriers are added by the compiler, not implied by user code, so it is the compiler's problem to avoid deadlock.

I came across a testcase whilst working on taskloop support that looks a bit like this:
```
!$omp parallel
!$omp single
!$omp taskloop firstprivate(a) lastprivate(a)
...
!$omp end single
!$omp end parallel
```

This is so that there are multiple threads for the generated tasks to be distributed over, but we don't generate the tasks afresh in every thread. The problem comes when the taskloop requires a barrier to prevent the data race between firstprivate and lastprivate. This barrier will then be generated inside of SINGLE and so only one thread will encounter the barrier, leading to a deadlock.

This patch works around the problem by detecting this situation statically and then not generating the barrier. There are cases where we cannot detect this statically (e.g. if the TASKLOOP is inside a function call inside of SINGLE). The program will still deadlock in this case after my patch. I'm unsure what the solution would be for that case. I want to fix this simple case in LLVM 22 before engaging in a longer discussion as to whether there is a better way to handle the more general case.

Testing using wsloop because I want to land this (or not) independently of taskloop. Note that for wsloop it would be up to the programmer to remember to use the nowait clause, but nowait cannot be used to control generation of this barrier because it refers to the barrier after the construct, not the one after the firstprivate copy-in (before the construct executes).
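One way to picture the static check is as a walk over the lexically enclosing constructs, from the innermost outwards. The sketch below is a toy model and not the Flang implementation: `Construct`, `barrierWouldDeadlock`, and the nesting-list representation are all hypothetical. It treats a compiler-inserted barrier as unsafe when a `single` region is reached before the enclosing `parallel` region, since only one thread would then arrive at the barrier.

```cpp
#include <vector>

// Hypothetical construct kinds for this toy model.
enum class Construct { Parallel, Single, Wsloop, Taskloop, Other };

// `enclosing` lists the lexically enclosing constructs, innermost first.
// Returns true when a compiler-inserted barrier would be reached by only
// one thread of the team.
bool barrierWouldDeadlock(const std::vector<Construct> &enclosing) {
  for (Construct c : enclosing) {
    if (c == Construct::Single)
      return true; // inside SINGLE: only one thread executes this region
    if (c == Construct::Parallel)
      return false; // reached the team boundary without seeing SINGLE
  }
  return false; // no enclosing parallel region is visible
}
```

A purely lexical walk like this cannot see through a function call, which matches the limitation noted above for a TASKLOOP reached via a call made inside SINGLE.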
) Reverts llvm#172477 This is causing failures for RVA23 (including some tests running away in their execution causing OOM, hence the builder dying). I will attempt to follow up on the PR with a reproducer of some kind. https://lab.llvm.org/buildbot/#/builders/210/builds/7243
This patch adds intrinsics for the crypto instructions defined in the ACLE proposal ARM-software/acle#411.
…llvm#174421) llvm-mca currently attempts to read the input file (or stdin) even when invoked with -mcpu=help. When the input is stdin, this causes the tool to block unless an empty stdin is provided. This patch allows the available CPUs/features to be printed without requiring stdin; all other invocations still read their input as before. A new test, mcpu-help.test, has been added. Follow-on from reverted llvm#173399. Implements @mshockwave's suggestion.
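The ordering issue can be sketched with a small standalone model. This is not llvm-mca's code: `RunResult` and `runTool` are hypothetical, and the only detail taken from the description above is the `-mcpu=help` option. The point is that the help check happens before anything touches the input stream, so a blocking stdin is never read on that path.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Outcome of one invocation of the hypothetical tool.
struct RunResult {
  bool printedHelp;
  bool readInput;
};

RunResult runTool(const std::vector<std::string> &args, std::istream &input) {
  // Handle the help request first, before the input stream is touched.
  for (const std::string &arg : args) {
    if (arg == "-mcpu=help")
      return {true, false}; // printing of the CPU list is omitted here
  }
  // Normal path: consume the whole input, as before.
  std::ostringstream buffer;
  buffer << input.rdbuf();
  return {false, true};
}
```

Calling `runTool({"-mcpu=help"}, stream)` leaves `stream` unread, while any other argument list drains it.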
Follow the LLVM coding standard
This patch was generated by the following commands:
1. `npm install --save-dev prettier-plugin-organize-imports`
2. `npm run format`
3. `npm audit fix`

It partially addresses [issue](llvm#151598) and improves the quality of the TypeScript code (formatting and unused imports).
…lvm#169445)

In llvm#168534 we made the `TypePrinter` re-use `printNestedNameSpecifier` for printing scopes. However, the ways the two print the names of anonymous/unnamed types are slightly inconsistent with each other. `printNestedNameSpecifier` calls all `TagType`s without an identifier `(anonymous)`. On the other hand, `TypePrinter` prints them slightly more accurately (it differentiates anonymous vs. unnamed decls) and allows for some additional customization points. E.g., with `MSVCFormatting`, it will print `` `unnamed struct'`` instead of `(unnamed struct)`. `printNestedNameSpecifier` already accounts for `MSVCFormatting` for namespaces, but doesn't for `TagType`s. This inconsistency means that if an unnamed tag is printed as part of a scope then it's displayed as `(anonymous struct)`, but if it's the entity whose scope is being printed, then it shows as `(unnamed struct)`.

This patch moves the printing of anonymous/unnamed tags into `TagDecl::printName`. All the callsites that previously printed anonymous tag decls now call `printName` to handle it. To preserve the behaviour of not printing the kind name (i.e., `struct`/`class`/`enum`) when printing the inner type of an elaborated type (i.e., avoiding `struct (unnamed struct)`), this patch adds a `PrintingPolicy::SuppressTagKeywordInAnonNames` that is appropriately set when we want to suppress the tag keyword inside the anonymous name. I had to make sure we set this bit to `false` when printing nested-name-specifiers because we always want the tag keyword there (e.g., `foo::(anonymous struct)::bar`) and for a `clangd` special case which is described in a comment in the source.

**Test changes**

Mostly we now print the kind name of anonymous entities more accurately, so there are a lot of `anonymous` -> `unnamed` changes. There are a handful of `clangd` tests where the name of the entity is now `(unnamed struct)` instead of just `(unnamed)`. That should be consistent with how we choose to omit the tag keyword elsewhere. Since we're just printing the name of the entity here, we include the kind tag.
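The naming rules described above can be summarised in a toy model. This is only a sketch and does not reflect Clang's actual implementation: `ToyPolicy` and `printAnonName` are hypothetical, and the behaviour of the MSVC style when the tag keyword is suppressed is an assumption of this sketch rather than something stated in the description.

```cpp
#include <string>

// Hypothetical stand-in for the two policy bits discussed above.
struct ToyPolicy {
  bool msvcFormatting = false;
  bool suppressTagKeywordInAnonNames = false;
};

// Builds the placeholder name for a tag without an identifier.
// `isAnonymous` distinguishes anonymous from merely unnamed declarations;
// `kind` is the tag keyword, e.g. "struct", "class" or "enum".
std::string printAnonName(const ToyPolicy &policy, bool isAnonymous,
                          const std::string &kind) {
  std::string name = isAnonymous ? "anonymous" : "unnamed";
  if (!policy.suppressTagKeywordInAnonNames)
    name += " " + kind;
  // MSVC style quotes the name as `name' instead of parenthesising it.
  return policy.msvcFormatting ? "`" + name + "'" : "(" + name + ")";
}
```

With the default policy an unnamed struct prints as `(unnamed struct)`; setting `suppressTagKeywordInAnonNames` drops the keyword, which avoids output such as `struct (unnamed struct)` when the caller has already printed the keyword.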
…m#174527) For the dl `__builtin_amdgcn_fdot2` builtins, use 'x' in the definition so that they take `_Float16` for HIP/C++ and `half` for OpenCL.
Make llvm-exegesis more usable on AArch64 by doing the following:
* Add some missing exegesis handling of register classes.
* Add some missing LLVM AArch64 OperandTypes.

llvm-exegesis can now handle many more AArch64 instructions. AArch64 load/store instructions will not be supported by llvm-exegesis until llvm#144895 lands. --------- Co-authored-by: Cullen Rhodes <cullen.rhodes@arm.com>
This change allows using clang's `-ffat-lto-objects` flag with COFF targets such as `i386-pc-win32`. Follow-up to 759fb0a from which it was split off. The added tests are adapted from the pre-existing ELF tests.
…les (llvm#172042) [llvm#172040](llvm#172040) This patch implements the scripts for generating the lookup tables and associated utilities for the wctype classification functions. Not all Unicode properties are covered, since not all of them need a lookup table; the rest will be hardcoded. The size of the generated tables is 47.8 KB.
…m supported version (llvm#172664)
Some of the MIR tests hit a bug that causes an error when there is a raw global reference as the referenced value. Worked around some of those by just keeping a no-op bitcast constant expression.
…lvm#174569) Reapply the zero handling, reverted in 108a22e. The failing libc test should have been fixed by e25eacf.
This patch extends the X86CompressEVEX pass to recognize and compress multi-instruction masking patterns to MOVMSK instructions. Fixes llvm#171746
…Int::get() (llvm#171456)" This reverts commit d189b49. Still causes assertion failures on some buildbots.
…m#174404) This lets us properly annotate ranges for gpu.cluster_block_id and gpu.cluster_dim_blocks. It also allows us to fill in the nvvm.cluster_dim attribute for use in the NVVM backend.
…element masks (llvm#174570) We can't convert these to CONCAT_VECTORS/KUNPCK, but we might be able to concat the operands directly.
ronlieb
approved these changes
Jan 6, 2026
No description provided.