forked from llvm/llvm-project
merge main into amd-staging #1030
Merged
z1-cciauto
merged 41 commits into
amd-staging
from
amd/merge/upstream_merge_20260107111146-1
Jan 8, 2026
llvm#170356) …rinsics This patch adds support in Clang for these assembly instructions FCVTXNT, FCVTLT, {B}FCVTNT
By implementing these prototypes:
    // Variant is available for _f64_f32
    svfloat32_t svcvtlt_f32[_f16]_z (svbool_t pg, svfloat16_t op);
    // Variants are available for:
    // _f32_f64, _bf16_f32
    svfloat16_t svcvtnt_f16[_f32]_z (svfloat16_t even, svbool_t pg, svfloat32_t op);
    svfloat32_t svcvtxnt_f32[_f64]_z (svfloat32_t even, svbool_t pg, svfloat64_t op);
according to the ACLE[1]
[1] ARM-software/acle#412
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…0793) Windows x86 binaries will now be built and uploaded automatically when a new release is tagged.
This is generating a random integer, so truncating is fine. Fixes the issue reported in: llvm#171456 (comment)
…select" (llvm#174758) Reverts llvm#173990 Reverting to address post-commit review feedback. Will recommit with fixes.
Fix real/imag when taking a primitive parameter _and_ being discarded, and fix the case where their subexpression can't be classified. Fixes llvm#174668
…gion simplification (llvm#173505) This commit simplifies the `remove-dead-values` pass and fixes a bug in the handling of `RegionBranchOpInterface` ops. The pass used to produce invalid IR ("null value found") for the newly added test case. `remove-dead-values` is a pass for additional IR simplification that cannot be performed by the canonicalizer pass. Based on a liveness analysis, it erases dead values / IR. (The liveness analysis is a dataflow analysis that has more information about the IR than a canonicalization pattern, which can see only "local" information.) Region-based ops are difficult. The liveness analysis may determine that an SSA value is dead. However, that does not mean that the value can actually be removed. Doing so may violate the region data flow (as modeled by the `RegionBranchOpInterface`). As an example, consider the case where a region branch terminator may dispatch to one of two region successors with the same forwarded values. A successor input (block argument) can be erased only if it is dead on both successors. Before this commit, there used to be complex logic to determine when it is safe to erase an SSA value. That logic was broken. The new implementation does not remove any block arguments or op results of region-based ops. Instead, operands of region-based ops and region branch terminators are replaced with `ub.poison` if all of their successor values are dead. This simplifies the IR well enough for the canonicalizer to perform the remaining region simplification (i.e., dropping block arguments etc.). RFC: https://discourse.llvm.org/t/rfc-delegate-simplification-of-region-based-ops-from-remove-dead-values-to-canonicalizer/89194
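The rule described in this commit message (a forwarded operand is replaced with `ub.poison` only when it is dead in every successor) can be illustrated with a toy model. This is a minimal sketch, not MLIR code: `poison_dead_operands`, the string `POISON`, and the liveness dictionary are hypothetical stand-ins invented for illustration.

```python
from typing import Dict, List

POISON = "ub.poison"  # placeholder string for a poison value; not a real MLIR object


def poison_dead_operands(
    operands: List[str],
    successor_liveness: Dict[str, List[bool]],
) -> List[str]:
    """Replace forwarded operand i with POISON only if it is dead in all successors.

    operands: values forwarded by a (hypothetical) region branch terminator.
    successor_liveness: for each successor, one flag per operand; True means
        the matching successor input is live there.
    """
    result = []
    for i, operand in enumerate(operands):
        # An operand must be kept if at least one successor still uses it.
        live_somewhere = any(flags[i] for flags in successor_liveness.values())
        result.append(operand if live_somewhere else POISON)
    return result


forwarded = ["%a", "%b", "%c"]
liveness = {
    "then_region": [True, False, False],
    "parent_results": [False, True, False],
}
rewritten = poison_dead_operands(forwarded, liveness)
print(rewritten)
```

Only `%c` is dead on both successors, so it is the only operand replaced. In this model nothing is erased; dropping the now-unused arguments is left to a later cleanup step, which mirrors the division of labor the commit message describes.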
Glue does not carry any value (in the LLVM IR Value sense) that could be considered uniform or divergent.
…CMakeLists.txt Once we start adding more tests, having this in a separate CMakeLists.txt is more maintainable.
…74022) Summary: The other GPU-enabled libraries (openmp, flang-rt, compiler-rt, libc, libcxx, libcxx-abi) all support builds through a runtime cross-build. In these builds we use a separate CMake build that cross-compiles to a single target. This patch provides basic support for this with the `libclc` libraries. Changes include adding support for the more standard GPU compute triples (amdgcn-amd-amdhsa, nvptx64-nvidia-cuda) and building only one target in this mode. Some things left to do: This patch does not change the compiler invocations; this method would allow us to use standard CMake routines, but this keeps it minimal. The prebuild support is questionable and doesn't fit into this scheme because it's a host executable, so I'm ignoring it for now. The installed location should, I believe, just use the triple with no `libclc/` subdirectory handling.
`std::span` didn't have a formatter for MSVC's STL yet. The type is quite useful in C++ 20, so this PR adds a formatter for it. Since the formatter is new, I made it work with both DWARF and PDB from the start.
…172514) This patch updates the legalization of spv_insertelt and spv_extractelt to handle non-constant (dynamic) indices. When a dynamic index is encountered, the vector is spilled to the stack, and the element is accessed via OpAccessChain (lowered from spv_gep). This patch also adds custom legalization for G_STORE to scalarize vector stores and refines the legalization rules for G_LOAD, G_STORE, and G_BUILD_VECTOR. Fixes llvm#170534
…4756) This PR is quite similar to llvm#174700. In this PR, I added a C API for each (upstream) MLIR attribute to retrieve its name (for example, `StringAttr -> mlirStringAttrGetName() -> "builtin.string"`), and exposed a corresponding `attr_name` class attribute in the Python bindings (e.g., `StringAttr.attr_name -> "builtin.string"`). This can be used in various places to avoid hard-coded strings, such as eliminating the manual string in `irdl.base("#builtin.string")`. Note that parts of this PR (mainly mechanical changes) were produced via GitHub Copilot and GPT-5.2. I have manually reviewed the changes and verified them with tests to ensure correctness.
Add tablegen patterns to provide codegen for SCVTF and UCVTF operating purely on SIMD & FP registers, using explicit bitcasts.
I know this is required for at least one feature, because
TestSectionAPI.py has failures if zlib isn't enabled. So I think it's
useful for users to be able to check.
Now that it's in the config, I have also used it to make a test
annotation so we don't get the failure in TestSectionAPI.py when zlib is
disabled.
Which for future reference was:
Traceback (most recent call last):
  File "/home/davspi01/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 452, in wrapper
    return func(self, *args, **kwargs)
  File "/home/davspi01/llvm-project/lldb/test/API/python_api/section/TestSectionAPI.py", line 67, in test_compressed_section_data
    self.assertEqual(section_data, [0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90])
AssertionError: Lists differ: [] != [32, 48, 64, 80, 96, 112, 128, 144]
As it failed to decode the compressed section.
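A test annotation of the kind described above can be sketched with plain `unittest`. This is only an illustration under stated assumptions: `BUILD_CONFIG` and `requires_zlib` are hypothetical names, and the real LLDB configuration query and decorator are not shown in the source.

```python
import functools
import unittest
import zlib

# Hypothetical stand-in for a build configuration query.
BUILD_CONFIG = {"zlib": True}


def requires_zlib(test_func):
    """Skip the decorated test when the (hypothetical) config reports no zlib."""

    @functools.wraps(test_func)
    def wrapper(self, *args, **kwargs):
        if not BUILD_CONFIG.get("zlib", False):
            self.skipTest("zlib support is disabled in this build")
        return test_func(self, *args, **kwargs)

    return wrapper


class CompressedSectionTest(unittest.TestCase):
    @requires_zlib
    def test_compressed_section_data(self):
        expected = [0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90]
        compressed = zlib.compress(bytes(expected))
        # Without a working decompressor this comparison would fail,
        # much like the empty list in the traceback above.
        self.assertEqual(list(zlib.decompress(compressed)), expected)


def run_suite() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CompressedSectionTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


enabled = run_suite()          # zlib reported as available: the test runs
BUILD_CONFIG["zlib"] = False
disabled = run_suite()         # zlib reported as missing: the test is skipped
print(len(enabled.skipped), len(disabled.skipped))
```

Checking the flag inside the wrapper, rather than at decoration time, means the skip decision follows the configuration at the moment the test runs.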
SPG says two bits for each operand.
This used to cause an ISel failure.
…he `Format` API into it (llvm#174618) This patch creates a new `FormatEntity::Formatter` class and moves `FormatEntity::Format` (and related APIs) into it. Most of the parameters to `Format` are immutable across all recursive calls, so I made them `const` member variables of `Formatter`. The main changes are just mechanical renaming of:
```
FormatEntity::Format(...)
```
to
```
FormatEntity::Formatter(...).Format(stream, entry, valobj)
```
and making use of the member variables from inside `Format`. We can probably make most of the parameters to the `Formatter` constructor defaulted, but I chose not to in this patch to keep the diff smaller. The motivation for this is that I'm planning on adding logic to detect recursive format entities (which would crash LLDB). That requires some state, which in my opinion is best kept inside the `Formatter` class instead of another parameter to `Format`. The patch should be entirely NFC.
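The shape of this refactoring can be sketched in a few lines of Python: arguments that never change across recursive calls become members of a class, which then also has a natural place for per-call state. Everything here is a hypothetical toy (the `Formatter` dataclass, the `@entry` and `$variable` token syntax), not LLDB's API. The recursion check is the planned follow-up mentioned in the message, not part of the NFC patch itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Formatter:
    """Holds the arguments that stay fixed across recursive format calls."""

    entries: Dict[str, List[str]]  # hypothetical: entry name -> list of tokens
    variables: Dict[str, str]      # hypothetical: values substituted for $name
    _active: Set[str] = field(default_factory=set, init=False)

    def format(self, out: List[str], entry_name: str) -> bool:
        """Append the expansion of entry_name to out; return False on a cycle."""
        if entry_name in self._active:
            return False  # recursive entity: stop instead of recursing forever
        self._active.add(entry_name)
        try:
            for token in self.entries[entry_name]:
                if token.startswith("@"):    # reference to another entry
                    if not self.format(out, token[1:]):
                        return False
                elif token.startswith("$"):  # variable substitution
                    out.append(self.variables[token[1:]])
                else:                        # literal text
                    out.append(token)
            return True
        finally:
            self._active.discard(entry_name)


entries = {
    "frame": ["frame #", "$index", ": ", "@function"],
    "function": ["$name", "()"],
    "loop": ["@loop"],
}
variables = {"index": "0", "name": "main"}

out: List[str] = []
ok = Formatter(entries, variables).format(out, "frame")
print(ok, "".join(out))
print(Formatter(entries, variables).format([], "loop"))
```

Only the values that differ per call (`out` and `entry_name`) remain parameters, and the set of entries being expanded lives on the object instead of being threaded through every call.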
In patch llvm#161840 I added bitcasts when lowering some NEON int scalar nodes, but I didn't properly test that bitcasts are correctly emitted on the result as well. This patch adds those tests.
This patch adds the /linkreprofullpathrsp flag with the same behaviour as link.exe. This flag emits a file containing the full paths to each object passed to the link line. This is used in particular when linking Arm64X binaries, as you need the full path to all the Arm64 objects that were used in a standard Arm64 build. See: https://learn.microsoft.com/en-us/cpp/build/reference/link-repro-full-path-rsp for the Microsoft documentation of the flag.
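As a rough illustration of what "a file containing the full paths to each object" means, the sketch below resolves each input to an absolute path and writes it to a response file. The layout (one quoted path per line) and the helper name `write_full_path_rsp` are assumptions made for this example; the source does not specify the exact format that the linker emits.

```python
import os
import tempfile
from typing import Iterable


def write_full_path_rsp(objects: Iterable[str], rsp_path: str) -> None:
    """Write the absolute path of each object, one quoted path per line.

    The one-path-per-line, double-quoted layout is an assumption for
    illustration only.
    """
    with open(rsp_path, "w", encoding="utf-8") as rsp:
        for obj in objects:
            rsp.write(f'"{os.path.abspath(obj)}"\n')


with tempfile.TemporaryDirectory() as tmp:
    rsp_file = os.path.join(tmp, "objects.rsp")
    write_full_path_rsp(["a.obj", os.path.join("sub", "b.obj")], rsp_file)
    with open(rsp_file, encoding="utf-8") as f:
        lines = f.read().splitlines()

print(lines)
```

Recording absolute paths is what lets a later link step locate the same objects from a different working directory.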
…m#174788) Setting SubtargetPredicate around these multiclasses is redundant since it is always explicitly overridden for every def inside the multiclass.
…rmatter tests (llvm#174770) When building `cross-project-tests` with `_FORTIFY_SOURCE` set, we get the following warning:
```
In file included from /app/gcc/14.2.0/include/c++/14.2.0/x86_64-pc-linux-gnu/bits/os_defines.h:39,
                 from /app/gcc/14.2.0/include/c++/14.2.0/x86_64-pc-linux-gnu/bits/c++config.h:680,
                 from /app/gcc/14.2.0/include/c++/14.2.0/type_traits:38,
                 from ../include/llvm/ADT/ADL.h:12,
                 from ../include/llvm/ADT/Hashing.h:47,
                 from ../include/llvm/ADT/ArrayRef.h:12,
                 from ../../cross-project-tests/debuginfo-tests/llvm-prettyprinters/lldb/arrayref.cpp:1:
/usr/include/features.h:381:4: warning: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Wcpp]
  381 | # warning _FORTIFY_SOURCE requires compiling with optimization (-O)
      |   ^~~~~~~
```
This patch works around this by undefining the macro when compiling the LLDB formatter tests. If this ever becomes an issue we could try to detect `_FORTIFY_SOURCE` and skip the tests if set.
…matchesAnyListedRegexName (llvm#174414) This clarifies that patterns are regular expressions. Closes: llvm#174229
Instead of long text, use bullet points for readability.
…cing the result (llvm#174781) castAs<> will at least assert the cast is valid while getAs<> will always just return nullptr and then explode
…tions into helper function
…dules. (llvm#171769)" (llvm#174783) This reverts commit 1928c1e. We have at least one repro, but I won't be able to work on this until next week. Also with Clang 22 cut upcoming, we probably need to revert for now.
…uffer loads with GEPs in DXILResourceAccess pass (llvm#174666) Fixes llvm#174656 --------- Co-authored-by: Alex Sepkowski <alexsepkowski@gmail.com>
…lvm#174786) Not escaping < and > was causing the text not to get displayed in the documentation.
Fixes llvm#170777. If we don't use the vector type and instead continue to pass on the matrix type when we enter `EmitExtVectorElementExpr`, then we don't need to store the row and column length on the LValue. Using the matrix type means we can reuse the isMatrixRow() cases in EmitLoadOfLValue and EmitStoreThroughLValue and not have to support a new LValue that is a hybrid between the ExtVectorElt and MatrixRow cases. All we need to do to support this is pass the list of column indices as a `ConstantDataVector` and check the size of this vector to know how many column iterations we need to do. Further, we just index into the vector to fetch the right encoded element index value.
This adds a legalization pass to convert zero size arrays to legal types for common cases. It doesn't handle all cases, but if we see real use cases for other cases, we can add them in the future. For globals, and their initializers, we generally replace `[0 x T]` with `ptr`. For instructions, we either replace `[0 x T]` with `poison` or, for `alloca`, just allocate `T`. This is motivated by IR generated by the OpenMP front end. Issue: llvm#170150 --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
…() (llvm#171456) Reapply after additional fixes. ----- Disable implicit truncation in the ConstantInt constructor by default. This means that it needs to be passed a signed/unsigned (depending on the IsSigned flag) value matching the bit width. The intention is to prevent the recurring bug where people write something like `ConstantInt::get(Ty, -1)`, and this "works" until `Ty` is larger than 64-bit and then the value is incorrect due to missing type extension. This is the continuation of llvm#112670, which originally allowed implicit truncation in this constructor to reduce initial scope of the change.
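The failure mode described here can be reproduced with plain integer arithmetic. The sketch below is a toy model, not the LLVM API: `make_constant` and its parameters are hypothetical, and it only assumes that the value reaches the constructor as a 64-bit unsigned quantity, which is what makes `-1` go wrong for wider types.

```python
def make_constant(bit_width: int, value: int, is_signed: bool = False,
                  implicit_trunc: bool = False) -> int:
    """Return the bit pattern of value in a bit_width-bit integer.

    Toy model: unless implicit_trunc is set, value must fit the requested
    width under the requested signedness, otherwise ValueError is raised.
    """
    if is_signed:
        lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    else:
        lo, hi = 0, (1 << bit_width) - 1
    if not lo <= value <= hi and not implicit_trunc:
        raise ValueError(f"{value} does not fit in {bit_width} bits")
    return value & ((1 << bit_width) - 1)


# What a 64-bit unsigned parameter receives when the caller writes -1.
as_uint64 = -1 & ((1 << 64) - 1)

# With implicit truncation, the 32-bit case happens to give all ones.
narrow = make_constant(32, as_uint64, implicit_trunc=True)

# For a 128-bit type the same argument is zero-extended: not all ones.
wide_wrong = make_constant(128, as_uint64)

# Passing a signed value with the signed flag gives the intended pattern.
wide_right = make_constant(128, -1, is_signed=True)

print(hex(narrow), hex(wide_wrong), hex(wide_right))
```

With truncation disabled by default, the 32-bit call with `as_uint64` raises instead of silently producing a value, so the mistake is caught at the narrow width rather than surfacing later at a wider one.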
Comgr :: cache-tests/spirv-translator-cached.cl
Comgr :: spirv-tests/spirv-to-reloc-debuginfo.hip
Comgr :: spirv-tests/spirv-to-reloc.hip
Comgr :: spirv-tests/spirv-translator.cl
Comgr :: spirv-tests/spirv-translator.hip
Collaborator
dpalermo
approved these changes
Jan 7, 2026
No description provided.