@z1-cciauto
Collaborator

No description provided.

aemerson and others added 30 commits December 10, 2025 16:14
To allow arm64 code to run in MTE environments, the compiler must assume that
only the top 4 bits of a pointer can be ignored, because MTE occupies the lower
4 bits of the top byte.

rdar://164645323
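
To make the bit layout concrete, here is a small sketch. It assumes the usual AArch64 layout, in which top-byte-ignore covers bits 56-63 and the MTE tag sits in bits 56-59. The mask names and the helper are hypothetical and are not taken from the patch.

```python
# Illustrative only: these names are hypothetical, not from the compiler change.
TOP_BYTE_MASK = 0xFF << 56    # bits 56-63: ignorable under plain top-byte-ignore
MTE_TAG_MASK = 0x0F << 56     # bits 56-59: assumed location of the MTE tag
TOP_NIBBLE_MASK = 0xF0 << 56  # bits 60-63: the only bits safe to ignore with MTE
ALL_64_BITS = 0xFFFF_FFFF_FFFF_FFFF


def strip_ignorable_bits(ptr: int, mte: bool) -> int:
    """Clear the pointer bits that may be treated as ignorable."""
    mask = TOP_NIBBLE_MASK if mte else TOP_BYTE_MASK
    return ptr & ~mask & ALL_64_BITS


tagged = (0xA5 << 56) | 0x0000_7FFF_DEAD_BEE0
print(hex(strip_ignorable_bits(tagged, mte=False)))  # top byte cleared
print(hex(strip_ignorable_bits(tagged, mte=True)))   # tag nibble (0x5) preserved
```

With `mte=True` the low nibble of the top byte survives, which is the property the commit message asks the compiler to respect.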
…1236)

This commit leaves "b" aliased to the old _regexp-break for now. The two
variants are identical except that `_regexp-break` allows you to say:

`(lldb) b <unrecognized_input>`
which gets translated to:

`break set <unrecognized_input>`

So switching people to `_regexp-break-add` would be a surprising
behavior change. It would be wrong for `_regexp-break-add` to have one
branch that calls `break set`, so to avoid surprise, I'll add the command
and let people who are playing with `break add` instead of `break set`
set the alias to the new one by hand for now.
Having duplicate mode entries previously asserted (or silently replaced
the last value with a new one in release builds). Report an error with
a helpful message instead.

Pull Request: llvm#171715
I have a change to validate the operand classes emitted in the AsmParser
and that caused llvm/test/MC/RISCV/rv32p-valid.s to fail due to the rd_wb
register using a different register class from rd:
`PWADDA_H operand 1 register X6 is not a member of register class GPRPair`
This happens because tablegen's AsmMatcherEmitter emits code to literally
copy over the tied registers and does not feed them through the equivalent
of RISCVAsmParser::validateTargetOperandClass() which would allow adjusting
these operand classes.

Ideally we would handle this in tablegen (or at least add an error), but
the tied operand handling logic is rather complex and I don't understand
it yet. For now just update the rd register class to match rd_wb.

Pull Request: llvm#171738
…TINS_DIR to COMPILER_RT_TEST_BUILTINS_DIR (llvm#171741)

Co-authored-by: David Tenty <daltenty@ibm.com>
llvm#171745)

… instrs. (llvm#169779)"

This reverts commit 2b958b9.

I might have broken the sanitizer-x86_64-linux bot


/home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_procmaps_linux.cpp
clang++:
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:248:
const T &llvm::ArrayRef<llvm::DbgValueLocEntry>::operator[](size_t)
const [T = llvm::DbgValueLocEntry]: Assertion `Index < Length &&
"Invalid index!"' failed.
…llvm#171721)

This enables MachineVerifier and MachineIR printing support for these
operands.
Add support for allocate statement with a source that is a device
variable.
Co-authored-by: Jérôme Duval <jerome.duval@gmail.com>
…pile commands (llvm#169640)

This patch fixes an issue in progress reporting where the processed item
counter could exceed the total item count, leading to confusing outputs
like [22/18].

Closes [llvm#169168](llvm#169168)
Previously, `isOSGlibc()` was returning true for musl triples as well.
This commit changes `isOSGlibc()` to return false for musl triples, and
updates all existing `isOSGlibc()` checks to call `isOSGlibc() ||
isMusl()`, in order to preserve existing behaviour.
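
The call-site rewrite can be sketched with a toy model. The `Triple` class below is a hypothetical, heavily simplified stand-in (it is not LLVM's class, and the "old" predicate is an assumption about the previous behaviour); it only shows why `isOSGlibc() || isMusl()` preserves the old result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    # Hypothetical, simplified stand-in for a target triple.
    os: str
    env: str

    def is_musl(self) -> bool:
        return self.env.startswith("musl")

    def is_os_glibc_old(self) -> bool:
        # Assumed model of the previous behaviour: any non-Android Linux.
        return self.os == "linux" and self.env != "android"

    def is_os_glibc(self) -> bool:
        # New behaviour: musl environments no longer count as glibc.
        return self.is_os_glibc_old() and not self.is_musl()


def old_call_site(t: Triple) -> bool:
    return t.is_os_glibc_old()


def updated_call_site(t: Triple) -> bool:
    # Existing checks are rewritten like this to keep their old meaning.
    return t.is_os_glibc() or t.is_musl()


for t in (Triple("linux", "gnu"), Triple("linux", "musl"), Triple("linux", "android")):
    print(t, old_call_site(t), updated_call_site(t))
```

In this model the two call sites agree for every Linux triple, while `is_os_glibc()` on its own now distinguishes glibc from musl.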
…lvm#171747)

Replace dyn_cast with cast. The dyn_cast can never fail now. Previously
it never succeeded.
This implements WG14 N3734 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3734.pdf),
aka `_Defer`; it is currently only supported in C if `-fdefer-ts` is passed.
This patch adds support for the `ud ui5` macro instruction. The `ui5`
operand must be in the range `0-31`. The macro expands to:

`amswap.w $rd, $r1, $rj`

where `ui5` specifies the register number used for `$rd` in the expanded
instruction, and `$rd` is the same as `$rj`.
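
As a rough illustration of the expansion rule described above, the following sketch builds the expanded text for a given `ui5`. The function name `expand_ud` is hypothetical, and the `$rN` register spelling is an assumption; the real assembler works on operands rather than strings.

```python
def expand_ud(ui5: int) -> str:
    """Return the assumed textual expansion of `ud ui5` (illustrative only)."""
    if not 0 <= ui5 <= 31:
        raise ValueError(f"ui5 must be in the range 0-31, got {ui5}")
    reg = f"$r{ui5}"  # ui5 selects $rd, and $rj is the same register
    return f"amswap.w {reg}, $r1, {reg}"


print(expand_ud(5))
```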

Relevant binutils patch:

https://sourceware.org/pipermail/binutils/2025-December/146042.html
…lvm#171079)

This patch adds support for generating the Xqcilsm load/store multiple
instructions as a part of the RISCVLoadStoreOptimizer pass. For now we
only combine two load/store instructions into a load/store multiple.
Support for converting more loads/stores will be added in follow-up
patches. These instructions are only applicable for 32-bit loads/stores
with an alignment of 4-bytes.
…71568)

Changed the range computation in computeOverflowForUnsignedMul to use
computeConstantRange as well.

This expands the set of patterns in which InstCombine manages to narrow a mul
whose operands come from zext. For example, if a value comes from a div
operation, the known bits don't give the narrowest possible range
for that value.
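
A small numeric sketch of why a constant range can beat known bits here. It models known bits as "only the number of leading zero bits is known", which is a simplification, and both helper names are hypothetical rather than LLVM APIs.

```python
def known_bits_max(true_max: int) -> int:
    """Upper bound when only the count of leading zero bits is known."""
    return (1 << true_max.bit_length()) - 1


def umul_may_overflow(max_a: int, max_b: int, bits: int) -> bool:
    """True if the product of the two upper bounds does not fit in `bits` bits."""
    return max_a * max_b > (1 << bits) - 1


range_max = 255 // 3                   # an 8-bit value divided by 3 is at most 85
bits_max = known_bits_max(range_max)   # leading-zero information alone gives 127

print(umul_may_overflow(range_max, 3, 8))  # range bound: 85 * 3 fits in 8 bits
print(umul_may_overflow(bits_max, 3, 8))   # known-bits bound: 127 * 3 does not
```

With the range-based bound the multiply by 3 is provably free of unsigned overflow in 8 bits, so it could be narrowed; the coarser bound cannot prove that.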

---------

Co-authored-by: Adar Dagan <adar.dagan@mobileye.com>
…#171643)

Previously this only happened for constants of some types and missed
incorrect ptrtoaddr.
llvm#162653)

This folds `icmp (ptrtoaddr x, ptrtoaddr y)` to `icmp (x, y)`, matching
the existing ptrtoint fold. Restrict both folds to only the case where
the result type matches the address type.
    
I think that all folds this can do in practice end up actually being
valid for ptrtoint to a type larger than the address size as well, but I
don't really see a way to justify this generically without making
assumptions about what kind of folding the recursive calls may do.

This is based on the icmp semantics specified in
llvm#163936.
-- This commit is the fourth in the series of adding matchers
for linalg.*conv*/*pool*. Refer:
llvm#163724
-- In this commit all variants of Conv2D convolution ops have been
   added.
-- It also refactors the way these matchers work to make adding more
matchers concise.

Signed-off-by: Abhishek Varma <abhvarma@amd.com>

---------

Signed-off-by: Abhishek Varma <abhvarma@amd.com>
Signed-off-by: hanhanW <hanhan0912@gmail.com>
Co-authored-by: hanhanW <hanhan0912@gmail.com>
…ames. NFC. (llvm#171645)

Both `decomposeBitTestICmp` and `decomposeBitTest` have a parameter
called `lookThroughTrunc`. This was spelled in full (i.e. `lookThroughTrunc`)
in the header. However, in the implementation, it's written as `lookThruTrunc`.

I opted to convert all instances of `lookThruTrunc` into
`lookThroughTrunc` to reduce surprise while reading the code and for
conformity.

---

The other change in this PR is the renaming of the wrapper around
`decomposeBitTest()`. Even though it was a wrapper around
`CmpInstAnalysis.h`'s `decomposeBitTest`, the function was called
`decomposeBitTestICmp`. This is quite confusing because such a function
_also_ exists in `CmpInstAnalysis.h`, but it is _not_ the one actually
being used in `InstCombineAndOrXor.cpp`.
Add `f64:32:64` to the data layout for AIX, to indicate that doubles
have a 32-bit ABI alignment and 64-bit preferred alignment.

Clang was already taking this into account, but it was not reflected in
LLVM's data layout.

A notable effect of this change is that `double` loads/stores with 4
byte alignment are no longer considered "unaligned" and avoid the
corresponding unaligned access legalization. I assume that this is
correct/desired for AIX. (The codegen previously already relied on this
in some places related to the call ABI simply by dint of assuming
certain stack locations were 8 byte aligned, even though they were only
actually 4 byte aligned.)

Fixes llvm#133599.
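
To show how the `f64:32:64` entry reads, here is a minimal parsing sketch. The helpers are hypothetical (LLVM parses data layouts in C++), and the `f64:64:64` string used for comparison is an assumption about the previously effective layout.

```python
def parse_float_alignment(spec: str) -> dict:
    """Parse a spec such as 'f64:32:64' into size and alignments, all in bits."""
    fields = spec.split(":")
    if len(fields) not in (2, 3) or not fields[0].startswith("f"):
        raise ValueError(f"unsupported spec: {spec!r}")
    size = int(fields[0][1:])
    abi = int(fields[1])
    # The preferred alignment is optional and falls back to the ABI alignment.
    pref = int(fields[2]) if len(fields) == 3 else abi
    return {"size": size, "abi": abi, "pref": pref}


def is_abi_aligned(align_bytes: int, spec: str) -> bool:
    """True if an access with `align_bytes` alignment meets the ABI alignment."""
    return align_bytes * 8 >= parse_float_alignment(spec)["abi"]


print(parse_float_alignment("f64:32:64"))
print(is_abi_aligned(4, "f64:32:64"))  # 4-byte aligned double: now ABI-aligned
print(is_abi_aligned(4, "f64:64:64"))  # assumed earlier layout: under-aligned
```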
…171072)

This patch tries to move all vl patterns and sd node patterns to
RISCVInstrInfoVVLPatterns.td and RISCVInstrInfoVSDPatterns.td
respectively. It removes the redefinition of pattern classes for zvfbfa and
makes them easier to maintain and change.

Note: this does not include intrinsic patterns. If we also want to unify
intrinsic patterns, we need to move the pseudo instruction definitions
of zvfbfa to RISCVInstrInfoVPseudos.td as well.
…teExtInst` instead of `SPIRVRegularizer` (llvm#170155)

This patch consists of 2 parts:
* A first part that removes the scalar to vector promotion for built-ins
in the `SPIRVRegularizer`;
* and a second part that implements the promotion for built-ins from
scalar to vector in `generateExtInst`.

The implementation in `SPIRVRegularizer` had several issues:
* It rolled its own built-in pattern matching that was extremely
permissive
  * the compiler would crash if the built-in had a definition
  * the compiler would crash if the built-in had no arguments
* The compiler would crash if there were more than 2 function
definitions in the module.
* It'd be better if this was implemented as a module pass, where we
iterate over the users of the function instead of scanning the whole
module for callers.

This patch does the scalar to vector promotion just before the
`OpExtInst` is generated, without relying on the IR transformation.

One change in the generated code from the previous implementation is
that this version uses a single `OpCompositeConstruct` operation to
convert the scalar into a vector. The old implementation inserted an
element at the 0 position in an `undef` vector (using
`OpCompositeInsert`); then copied that element for every vector element
using `OpVectorShuffle`.
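
The two lowering strategies can be modelled on plain Python lists to show that they produce the same vector. This is only a conceptual sketch: the function names are hypothetical, `None` stands in for `undef`, and no SPIR-V is generated.

```python
def splat_via_insert_and_shuffle(scalar, width):
    """Old approach: insert at lane 0 of an undef vector, then broadcast lane 0."""
    vec = [None] * width                   # stand-in for an undef vector
    vec[0] = scalar                        # models OpCompositeInsert at index 0
    return [vec[0] for _ in range(width)]  # models OpVectorShuffle of lane 0


def splat_via_composite_construct(scalar, width):
    """New approach: build the vector in one step from repeated scalars."""
    return [scalar] * width                # models a single OpCompositeConstruct


print(splat_via_insert_and_shuffle(1.5, 4))
print(splat_via_composite_construct(1.5, 4))
```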

This patch also adds a test (`OpExtInst_vector_promotion_bug.ll`) that
highlights an issue in the builtin pattern matching that we're using:
our pattern matching doesn't consider the number of arguments, only the
demangled name, first and last arguments (`min(int,int,int)` matches the same builtin as `min(int, int)`).
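
The ambiguity described above can be reproduced with a toy lookup key. This is a hypothetical model of a matcher that looks only at the demangled name plus the first and last argument types; it is not the SPIR-V backend's actual code.

```python
def loose_key(name, arg_types):
    """Key that ignores arity: name, first argument type, last argument type."""
    first = arg_types[0] if arg_types else None
    last = arg_types[-1] if arg_types else None
    return (name, first, last)


def strict_key(name, arg_types):
    """Key that also captures the number and type of every argument."""
    return (name, tuple(arg_types))


two_args = ("min", ["int", "int"])
three_args = ("min", ["int", "int", "int"])

print(loose_key(*two_args) == loose_key(*three_args))    # the two calls collide
print(strict_key(*two_args) == strict_key(*three_args))  # arity tells them apart
```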
Before this patch, `insertelement/extractelement` with dynamic indices would
fail to select with `-O0` for vector 32-bit element types with sizes 3, 5, 6 and 7,
which did not map to a `SI_INDIRECT_SRC/DST` pattern.

Other "weird" sizes bigger than 8 (like 13) are properly handled
already.

To solve this issue we add the missing patterns for the problematic
sizes.

Solves SWDEV-568862
ldionne and others added 20 commits December 11, 2025 09:31
…llvm#171651)

Allocators should be extremely cheap, if not free, to copy. Furthermore,
we have requirements on allocator types that copies must compare equal,
and that move and copy must be the same.

Hence, taking an allocator by reference should not provide benefits
beyond making a copy of it. However, taking the allocator by reference
leads to complexity in __split_buffer, which can be removed if we stop
using that pattern.
Some [ideas for improvement](llvm#169858 (review)) came up during review of
recent changes to `isTRNMask`.
This PR applies them also to `isZIPMask`, which is implemented almost
identically.
This essentially reverts llvm#100685 and fixes the bidirectional and random
access specializations to be actually used.
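
As a rough intuition for why a bidirectional specialization helps when the match is near the end, the sketch below contrasts a forward scan with a backward scan over Python lists. It is not the libc++ implementation; the function names and the `-1` "not found" convention are assumptions made for illustration.

```python
def find_end_forward(haystack, needle):
    """Scan from the front and remember the last match (forward-iterator style)."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return n
    last = -1
    for i in range(n - m + 1):
        if haystack[i:i + m] == needle:
            last = i
    return last


def find_end_backward(haystack, needle):
    """Scan from the back and stop at the first match (bidirectional style)."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return n
    for i in range(n - m, -1, -1):
        if haystack[i:i + m] == needle:
            return i
    return -1


data = [0] * 1000 + [1, 2, 3] + [0] * 5
print(find_end_forward(data, [1, 2, 3]), find_end_backward(data, [1, 2, 3]))
```

Both functions return the same index, but the backward scan inspects only a handful of positions when the match sits near the end, whereas the forward scan always walks the whole range.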

```
Benchmark                                                                old             new    Difference    % Difference
------------------------------------------------------------  --------------  --------------  ------------  --------------
rng::find_end(deque<int>)_(match_near_end)/1000                       366.91           47.63       -319.28         -87.02%
rng::find_end(deque<int>)_(match_near_end)/1024                      3273.31           35.42      -3237.89         -98.92%
rng::find_end(deque<int>)_(match_near_end)/8192                    171608.41          285.04    -171323.38         -99.83%
rng::find_end(deque<int>)_(near_matches)/1000                       31808.40        19214.35     -12594.05         -39.59%
rng::find_end(deque<int>)_(near_matches)/1024                       37428.72        20773.87     -16654.85         -44.50%
rng::find_end(deque<int>)_(near_matches)/8192                     1719468.34      1213967.45    -505500.89         -29.40%
rng::find_end(deque<int>)_(process_all)/1000                          275.81          336.29         60.49          21.93%
rng::find_end(deque<int>)_(process_all)/1024                          258.88          320.36         61.47          23.74%
rng::find_end(deque<int>)_(process_all)/1048576                    277117.41       327640.37      50522.96          18.23%
rng::find_end(deque<int>)_(process_all)/8192                         2166.36         2533.52        367.16          16.95%
rng::find_end(deque<int>)_(same_length)/1000                         1280.06          362.53       -917.53         -71.68%
rng::find_end(deque<int>)_(same_length)/1024                         1419.99          417.58      -1002.40         -70.59%
rng::find_end(deque<int>)_(same_length)/8192                        11363.81         2870.63      -8493.18         -74.74%
rng::find_end(deque<int>)_(single_element)/1000                       277.22          363.52         86.31          31.13%
rng::find_end(deque<int>)_(single_element)/1024                       257.11          353.94         96.84          37.66%
rng::find_end(deque<int>)_(single_element)/8192                      2059.02         2762.29        703.27          34.16%
rng::find_end(deque<int>,_pred)_(match_near_end)/1000                 696.84           70.07       -626.77         -89.94%
rng::find_end(deque<int>,_pred)_(match_near_end)/1024                4774.82           70.75      -4704.07         -98.52%
rng::find_end(deque<int>,_pred)_(match_near_end)/8192              267492.37          549.57    -266942.81         -99.79%
rng::find_end(deque<int>,_pred)_(near_matches)/1000                 39414.88        31070.43      -8344.46         -21.17%
rng::find_end(deque<int>,_pred)_(near_matches)/1024                 38168.52        32362.18      -5806.34         -15.21%
rng::find_end(deque<int>,_pred)_(near_matches)/8192               2594717.16      1938056.79    -656660.38         -25.31%
rng::find_end(deque<int>,_pred)_(process_all)/1000                    600.88          586.92        -13.96          -2.32%
rng::find_end(deque<int>,_pred)_(process_all)/1024                    613.00          592.66        -20.33          -3.32%
rng::find_end(deque<int>,_pred)_(process_all)/1048576              600059.65       603440.98       3381.33           0.56%
rng::find_end(deque<int>,_pred)_(process_all)/8192                   4850.32         4764.56        -85.76          -1.77%
rng::find_end(deque<int>,_pred)_(same_length)/1000                   1514.90          700.34       -814.57         -53.77%
rng::find_end(deque<int>,_pred)_(same_length)/1024                   1561.14          705.80       -855.34         -54.79%
rng::find_end(deque<int>,_pred)_(same_length)/8192                  12544.84         5024.45      -7520.39         -59.95%
rng::find_end(deque<int>,_pred)_(single_element)/1000                 603.79          650.63         46.84           7.76%
rng::find_end(deque<int>,_pred)_(single_element)/1024                 614.93          656.43         41.50           6.75%
rng::find_end(deque<int>,_pred)_(single_element)/8192                4885.89         5225.71        339.82           6.96%
rng::find_end(forward_list<int>)_(match_near_end)/1000                770.05          769.32         -0.73          -0.09%
rng::find_end(forward_list<int>)_(match_near_end)/1024               4833.13         4733.24        -99.90          -2.07%
rng::find_end(forward_list<int>)_(match_near_end)/8192             259324.32       261066.84       1742.52           0.67%
rng::find_end(forward_list<int>)_(near_matches)/1000                38301.11        38608.61        307.50           0.80%
rng::find_end(forward_list<int>)_(near_matches)/1024                39370.54        39878.59        508.05           1.29%
rng::find_end(forward_list<int>)_(near_matches)/8192              2527338.50      2527722.47        383.97           0.02%
rng::find_end(forward_list<int>)_(process_all)/1000                   713.63          720.74          7.11           1.00%
rng::find_end(forward_list<int>)_(process_all)/1024                   727.81          731.60          3.79           0.52%
rng::find_end(forward_list<int>)_(process_all)/1048576             757728.47       766470.14       8741.67           1.15%
rng::find_end(forward_list<int>)_(process_all)/8192                  5821.05         5817.80         -3.25          -0.06%
rng::find_end(forward_list<int>)_(same_length)/1000                  1458.99         1454.50         -4.49          -0.31%
rng::find_end(forward_list<int>)_(same_length)/1024                  1507.73         1515.78          8.05           0.53%
rng::find_end(forward_list<int>)_(same_length)/8192                 20432.32        18658.93      -1773.39          -8.68%
rng::find_end(forward_list<int>)_(single_element)/1000                712.41          708.41         -4.00          -0.56%
rng::find_end(forward_list<int>)_(single_element)/1024                728.05          728.78          0.73           0.10%
rng::find_end(forward_list<int>)_(single_element)/8192               5795.48         6332.88        537.40           9.27%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/1000          843.67          846.77          3.10           0.37%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/1024         5267.90         5343.84         75.94           1.44%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/8192       280912.75       286141.10       5228.35           1.86%
rng::find_end(forward_list<int>,_pred)_(near_matches)/1000          43386.35        44489.38       1103.03           2.54%
rng::find_end(forward_list<int>,_pred)_(near_matches)/1024          44929.84        45608.55        678.71           1.51%
rng::find_end(forward_list<int>,_pred)_(near_matches)/8192        2723281.29      2765369.43      42088.14           1.55%
rng::find_end(forward_list<int>,_pred)_(process_all)/1000             763.13          763.85          0.72           0.09%
rng::find_end(forward_list<int>,_pred)_(process_all)/1024             796.98          773.40        -23.58          -2.96%
rng::find_end(forward_list<int>,_pred)_(process_all)/1048576       858071.76       846166.06     -11905.69          -1.39%
rng::find_end(forward_list<int>,_pred)_(process_all)/8192            6282.19         6244.95        -37.24          -0.59%
rng::find_end(forward_list<int>,_pred)_(same_length)/1000            1560.18         1583.03         22.86           1.47%
rng::find_end(forward_list<int>,_pred)_(same_length)/1024            1603.94         1612.22          8.28           0.52%
rng::find_end(forward_list<int>,_pred)_(same_length)/8192           16907.98        15638.35      -1269.63          -7.51%
rng::find_end(forward_list<int>,_pred)_(single_element)/1000          746.72          754.08          7.36           0.99%
rng::find_end(forward_list<int>,_pred)_(single_element)/1024          761.27          771.75         10.48           1.38%
rng::find_end(forward_list<int>,_pred)_(single_element)/8192         6166.83         6687.87        521.04           8.45%
rng::find_end(list<int>)_(match_near_end)/1000                        793.99           67.06       -726.93         -91.55%
rng::find_end(list<int>)_(match_near_end)/1024                       4682.12           79.82      -4602.31         -98.30%
rng::find_end(list<int>)_(match_near_end)/8192                     263187.10          582.64    -262604.46         -99.78%
rng::find_end(list<int>)_(near_matches)/1000                        38066.70        34687.59      -3379.11          -8.88%
rng::find_end(list<int>)_(near_matches)/1024                        39721.77        36150.04      -3571.73          -8.99%
rng::find_end(list<int>)_(near_matches)/8192                      2543369.85      2247297.03    -296072.82         -11.64%
rng::find_end(list<int>)_(process_all)/1000                           716.89          726.65          9.76           1.36%
rng::find_end(list<int>)_(process_all)/1024                           742.41          744.05          1.64           0.22%
rng::find_end(list<int>)_(process_all)/1048576                     822449.08       873801.46      51352.38           6.24%
rng::find_end(list<int>)_(process_all)/8192                          7704.49         9766.50       2062.02          26.76%
rng::find_end(list<int>)_(same_length)/1000                          1508.19          710.90       -797.28         -52.86%
rng::find_end(list<int>)_(same_length)/1024                          1540.23          735.35       -804.88         -52.26%
rng::find_end(list<int>)_(same_length)/8192                         22786.44        10752.45     -12033.98         -52.81%
rng::find_end(list<int>)_(single_element)/1000                        699.16          734.76         35.60           5.09%
rng::find_end(list<int>)_(single_element)/1024                        717.09          750.91         33.82           4.72%
rng::find_end(list<int>)_(single_element)/8192                       9502.45        10289.21        786.76           8.28%
rng::find_end(list<int>,_pred)_(match_near_end)/1000                  841.98           83.86       -758.12         -90.04%
rng::find_end(list<int>,_pred)_(match_near_end)/1024                 5463.71           76.95      -5386.76         -98.59%
rng::find_end(list<int>,_pred)_(match_near_end)/8192               287070.76          647.14    -286423.62         -99.77%
rng::find_end(list<int>,_pred)_(near_matches)/1000                  43878.61        38899.00      -4979.61         -11.35%
rng::find_end(list<int>,_pred)_(near_matches)/1024                  45672.50        40520.68      -5151.82         -11.28%
rng::find_end(list<int>,_pred)_(near_matches)/8192                2764800.76      2495879.89    -268920.87          -9.73%
rng::find_end(list<int>,_pred)_(process_all)/1000                     764.46          774.78         10.32           1.35%
rng::find_end(list<int>,_pred)_(process_all)/1024                     786.81          793.05          6.24           0.79%
rng::find_end(list<int>,_pred)_(process_all)/1048576               934166.34       954637.60      20471.26           2.19%
rng::find_end(list<int>,_pred)_(process_all)/8192                    9509.24        10209.73        700.49           7.37%
rng::find_end(list<int>,_pred)_(same_length)/1000                    1545.67          782.96       -762.71         -49.34%
rng::find_end(list<int>,_pred)_(same_length)/1024                    1580.94          796.87       -784.08         -49.60%
rng::find_end(list<int>,_pred)_(same_length)/8192                   21558.41        13370.92      -8187.49         -37.98%
rng::find_end(list<int>,_pred)_(single_element)/1000                  766.49          762.81         -3.68          -0.48%
rng::find_end(list<int>,_pred)_(single_element)/1024                  784.75          781.47         -3.28          -0.42%
rng::find_end(list<int>,_pred)_(single_element)/8192                 9722.26        10399.11        676.85           6.96%
rng::find_end(vector<int>)_(match_near_end)/1000                      267.82           25.34       -242.48         -90.54%
rng::find_end(vector<int>)_(match_near_end)/1024                     2259.46           25.78      -2233.68         -98.86%
rng::find_end(vector<int>)_(match_near_end)/8192                   119747.92          214.53    -119533.39         -99.82%
rng::find_end(vector<int>)_(near_matches)/1000                      16913.73        14102.20      -2811.53         -16.62%
rng::find_end(vector<int>)_(near_matches)/1024                      16097.97        14767.26      -1330.71          -8.27%
rng::find_end(vector<int>)_(near_matches)/8192                    1102803.07       823463.30    -279339.78         -25.33%
rng::find_end(vector<int>)_(process_all)/1000                         233.43          380.28        146.85          62.91%
rng::find_end(vector<int>)_(process_all)/1024                         238.86          389.32        150.46          62.99%
rng::find_end(vector<int>)_(process_all)/1048576                   269619.36       391698.75     122079.39          45.28%
rng::find_end(vector<int>)_(process_all)/8192                        2011.46         3061.40       1049.94          52.20%
rng::find_end(vector<int>)_(same_length)/1000                         632.19          253.50       -378.69         -59.90%
rng::find_end(vector<int>)_(same_length)/1024                         556.53          254.87       -301.66         -54.20%
rng::find_end(vector<int>)_(same_length)/8192                        4597.26         2095.57      -2501.68         -54.42%
rng::find_end(vector<int>)_(single_element)/1000                      231.57          417.64        186.06          80.35%
rng::find_end(vector<int>)_(single_element)/1024                      236.41          427.03        190.62          80.63%
rng::find_end(vector<int>)_(single_element)/8192                     1918.95         3367.29       1448.33          75.48%
rng::find_end(vector<int>,_pred)_(match_near_end)/1000                581.49           52.67       -528.82         -90.94%
rng::find_end(vector<int>,_pred)_(match_near_end)/1024               3545.40           53.74      -3491.65         -98.48%
rng::find_end(vector<int>,_pred)_(match_near_end)/8192             190482.78          432.30    -190050.48         -99.77%
rng::find_end(vector<int>,_pred)_(near_matches)/1000                28878.24        24723.01      -4155.23         -14.39%
rng::find_end(vector<int>,_pred)_(near_matches)/1024                30035.85        25597.45      -4438.40         -14.78%
rng::find_end(vector<int>,_pred)_(near_matches)/8192              1858596.45      1584796.11    -273800.34         -14.73%
rng::find_end(vector<int>,_pred)_(process_all)/1000                   518.92          813.46        294.53          56.76%
rng::find_end(vector<int>,_pred)_(process_all)/1024                   531.17          710.20        179.03          33.70%
rng::find_end(vector<int>,_pred)_(process_all)/1048576             674064.13       905070.15     231006.01          34.27%
rng::find_end(vector<int>,_pred)_(process_all)/8192                  4254.34         6372.76       2118.43          49.79%
rng::find_end(vector<int>,_pred)_(same_length)/1000                  1106.96          526.23       -580.73         -52.46%
rng::find_end(vector<int>,_pred)_(same_length)/1024                  1133.60          539.70       -593.90         -52.39%
rng::find_end(vector<int>,_pred)_(same_length)/8192                  8988.10         4302.83      -4685.27         -52.13%
rng::find_end(vector<int>,_pred)_(single_element)/1000                528.11          523.69         -4.42          -0.84%
rng::find_end(vector<int>,_pred)_(single_element)/1024                539.58          838.49        298.91          55.40%
rng::find_end(vector<int>,_pred)_(single_element)/8192               4301.43         7313.22       3011.79          70.02%
std::find_end(deque<int>)_(match_near_end)/1000                       347.82           38.56       -309.26         -88.91%
std::find_end(deque<int>)_(match_near_end)/1024                      3340.80           34.54      -3306.27         -98.97%
std::find_end(deque<int>)_(match_near_end)/8192                    171599.83          281.87    -171317.96         -99.84%
std::find_end(deque<int>)_(near_matches)/1000                       29703.68        19712.27      -9991.41         -33.64%
std::find_end(deque<int>)_(near_matches)/1024                       32312.41        20008.21     -12304.20         -38.08%
std::find_end(deque<int>)_(near_matches)/8192                     1851286.99      1216112.34    -635174.65         -34.31%
std::find_end(deque<int>)_(process_all)/1000                          256.69          315.96         59.27          23.09%
std::find_end(deque<int>)_(process_all)/1024                          260.97          305.42         44.45          17.03%
std::find_end(deque<int>)_(process_all)/1048576                    273310.08       309499.13      36189.05          13.24%
std::find_end(deque<int>)_(process_all)/8192                         2071.33         2606.57        535.25          25.84%
std::find_end(deque<int>)_(same_length)/1000                         1422.58          441.07       -981.51         -68.99%
std::find_end(deque<int>)_(same_length)/1024                         1844.27          350.75      -1493.52         -80.98%
std::find_end(deque<int>)_(same_length)/8192                        14681.69         2839.26     -11842.43         -80.66%
std::find_end(deque<int>)_(single_element)/1000                       291.63          344.82         53.19          18.24%
std::find_end(deque<int>)_(single_element)/1024                       257.97          330.19         72.21          27.99%
std::find_end(deque<int>)_(single_element)/8192                      2220.10         2505.02        284.92          12.83%
std::find_end(deque<int>,_pred)_(match_near_end)/1000                 694.70           69.60       -625.11         -89.98%
std::find_end(deque<int>,_pred)_(match_near_end)/1024                4735.45           71.12      -4664.33         -98.50%
std::find_end(deque<int>,_pred)_(match_near_end)/8192              267417.02          561.03    -266855.99         -99.79%
std::find_end(deque<int>,_pred)_(near_matches)/1000                 42199.71        31597.49     -10602.22         -25.12%
std::find_end(deque<int>,_pred)_(near_matches)/1024                 38007.49        32362.16      -5645.33         -14.85%
std::find_end(deque<int>,_pred)_(near_matches)/8192               2607708.49      1935799.88    -671908.60         -25.77%
std::find_end(deque<int>,_pred)_(process_all)/1000                    599.65          552.71        -46.94          -7.83%
std::find_end(deque<int>,_pred)_(process_all)/1024                    615.88          554.17        -61.71         -10.02%
std::find_end(deque<int>,_pred)_(process_all)/1048576              598471.63       599441.79        970.16           0.16%
std::find_end(deque<int>,_pred)_(process_all)/8192                   4853.45         4394.20       -459.25          -9.46%
std::find_end(deque<int>,_pred)_(same_length)/1000                   1511.68          797.64       -714.04         -47.23%
std::find_end(deque<int>,_pred)_(same_length)/1024                   1568.63          810.85       -757.78         -48.31%
std::find_end(deque<int>,_pred)_(same_length)/8192                  12609.34         5092.02      -7517.32         -59.62%
std::find_end(deque<int>,_pred)_(single_element)/1000                 601.22          628.80         27.58           4.59%
std::find_end(deque<int>,_pred)_(single_element)/1024                 613.25          627.15         13.89           2.27%
std::find_end(deque<int>,_pred)_(single_element)/8192                4823.85         4795.25        -28.60          -0.59%
std::find_end(forward_list<int>)_(match_near_end)/1000                762.64          769.74          7.10           0.93%
std::find_end(forward_list<int>)_(match_near_end)/1024               4767.93         4840.87         72.94           1.53%
std::find_end(forward_list<int>)_(match_near_end)/8192             260275.68       260835.21        559.53           0.21%
std::find_end(forward_list<int>)_(near_matches)/1000                38020.76        38197.53        176.77           0.46%
std::find_end(forward_list<int>)_(near_matches)/1024                39028.86        39333.38        304.51           0.78%
std::find_end(forward_list<int>)_(near_matches)/8192              2524921.48      2523470.32      -1451.16          -0.06%
std::find_end(forward_list<int>)_(process_all)/1000                   699.95          699.93         -0.02          -0.00%
std::find_end(forward_list<int>)_(process_all)/1024                   715.24          712.07         -3.17          -0.44%
std::find_end(forward_list<int>)_(process_all)/1048576             755926.33       756976.31       1049.98           0.14%
std::find_end(forward_list<int>)_(process_all)/8192                  5696.72         5672.92        -23.81          -0.42%
std::find_end(forward_list<int>)_(same_length)/1000                  1485.84         1480.19         -5.65          -0.38%
std::find_end(forward_list<int>)_(same_length)/1024                  1493.62         1516.95         23.33           1.56%
std::find_end(forward_list<int>)_(same_length)/8192                 16833.75        13551.42      -3282.33         -19.50%
std::find_end(forward_list<int>)_(single_element)/1000                688.87          675.02        -13.85          -2.01%
std::find_end(forward_list<int>)_(single_element)/1024                688.89          691.59          2.69           0.39%
std::find_end(forward_list<int>)_(single_element)/8192               5735.87         6748.85       1012.98          17.66%
std::find_end(forward_list<int>,_pred)_(match_near_end)/1000          836.01          853.28         17.27           2.07%
std::find_end(forward_list<int>,_pred)_(match_near_end)/1024         5259.92         5299.30         39.39           0.75%
std::find_end(forward_list<int>,_pred)_(match_near_end)/8192       279479.85       285593.49       6113.65           2.19%
std::find_end(forward_list<int>,_pred)_(near_matches)/1000          42577.60        44550.54       1972.94           4.63%
std::find_end(forward_list<int>,_pred)_(near_matches)/1024          44374.19        45697.95       1323.76           2.98%
std::find_end(forward_list<int>,_pred)_(near_matches)/8192        2711138.03      2742988.33      31850.30           1.17%
std::find_end(forward_list<int>,_pred)_(process_all)/1000             752.03          762.75         10.72           1.43%
std::find_end(forward_list<int>,_pred)_(process_all)/1024             767.04          781.48         14.44           1.88%
std::find_end(forward_list<int>,_pred)_(process_all)/1048576       843453.35       861838.82      18385.47           2.18%
std::find_end(forward_list<int>,_pred)_(process_all)/8192            6241.65         6308.05         66.40           1.06%
std::find_end(forward_list<int>,_pred)_(same_length)/1000            2384.18         1589.21       -794.97         -33.34%
std::find_end(forward_list<int>,_pred)_(same_length)/1024            2428.97         1617.17       -811.80         -33.42%
std::find_end(forward_list<int>,_pred)_(same_length)/8192           16961.22        14972.86      -1988.36         -11.72%
std::find_end(forward_list<int>,_pred)_(single_element)/1000          743.31          752.77          9.47           1.27%
std::find_end(forward_list<int>,_pred)_(single_element)/1024          763.62          768.70          5.08           0.67%
std::find_end(forward_list<int>,_pred)_(single_element)/8192         6189.73         6934.04        744.31          12.02%
std::find_end(list<int>)_(match_near_end)/1000                        773.76           76.41       -697.35         -90.12%
std::find_end(list<int>)_(match_near_end)/1024                       4715.36           69.09      -4646.27         -98.53%
std::find_end(list<int>)_(match_near_end)/8192                     264864.51          584.19    -264280.32         -99.78%
std::find_end(list<int>)_(near_matches)/1000                        37650.69        35233.45      -2417.24          -6.42%
std::find_end(list<int>)_(near_matches)/1024                        39239.25        36699.13      -2540.13          -6.47%
std::find_end(list<int>)_(near_matches)/8192                      2543446.71      2252625.27    -290821.44         -11.43%
std::find_end(list<int>)_(process_all)/1000                           718.00          724.59          6.59           0.92%
std::find_end(list<int>)_(process_all)/1024                           735.14          746.70         11.57           1.57%
std::find_end(list<int>)_(process_all)/1048576                     812620.48       869606.78      56986.30           7.01%
std::find_end(list<int>)_(process_all)/8192                          8217.98         8462.53        244.55           2.98%
std::find_end(list<int>)_(same_length)/1000                          1500.85          716.45       -784.39         -52.26%
std::find_end(list<int>)_(same_length)/1024                          1534.13          736.62       -797.51         -51.98%
std::find_end(list<int>)_(same_length)/8192                         20274.06        10621.82      -9652.24         -47.61%
std::find_end(list<int>)_(single_element)/1000                        717.05          725.64          8.60           1.20%
std::find_end(list<int>)_(single_element)/1024                        732.87          742.44          9.57           1.31%
std::find_end(list<int>)_(single_element)/8192                       9835.11        11896.39       2061.28          20.96%
std::find_end(list<int>,_pred)_(match_near_end)/1000                  845.46           75.09       -770.37         -91.12%
std::find_end(list<int>,_pred)_(match_near_end)/1024                 5301.60           77.14      -5224.46         -98.54%
std::find_end(list<int>,_pred)_(match_near_end)/8192               281976.13          648.87    -281327.25         -99.77%
std::find_end(list<int>,_pred)_(near_matches)/1000                  44076.98        39576.32      -4500.67         -10.21%
std::find_end(list<int>,_pred)_(near_matches)/1024                  45531.64        41020.11      -4511.54          -9.91%
std::find_end(list<int>,_pred)_(near_matches)/8192                2756383.66      2503085.29    -253298.37          -9.19%
std::find_end(list<int>,_pred)_(process_all)/1000                     766.06          764.48         -1.58          -0.21%
std::find_end(list<int>,_pred)_(process_all)/1024                     780.35          799.51         19.15           2.45%
std::find_end(list<int>,_pred)_(process_all)/1048576               894643.71       898947.94       4304.24           0.48%
std::find_end(list<int>,_pred)_(process_all)/8192                    8436.41         9977.74       1541.33          18.27%
std::find_end(list<int>,_pred)_(same_length)/1000                    1545.22          784.29       -760.92         -49.24%
std::find_end(list<int>,_pred)_(same_length)/1024                    1583.27          808.52       -774.74         -48.93%
std::find_end(list<int>,_pred)_(same_length)/8192                   21850.99        10896.50     -10954.48         -50.13%
std::find_end(list<int>,_pred)_(single_element)/1000                  752.03          755.00          2.97           0.39%
std::find_end(list<int>,_pred)_(single_element)/1024                  774.22          784.14          9.92           1.28%
std::find_end(list<int>,_pred)_(single_element)/8192                10219.43        10396.49        177.05           1.73%
std::find_end(vector<int>)_(match_near_end)/1000                      277.37           28.45       -248.91         -89.74%
std::find_end(vector<int>)_(match_near_end)/1024                     2247.56           25.80      -2221.76         -98.85%
std::find_end(vector<int>)_(match_near_end)/8192                   119785.10          212.44    -119572.66         -99.82%
std::find_end(vector<int>)_(near_matches)/1000                      16351.34        14073.13      -2278.21         -13.93%
std::find_end(vector<int>)_(near_matches)/1024                      16656.33        14654.36      -2001.97         -12.02%
std::find_end(vector<int>)_(near_matches)/8192                    1181392.88       828918.96    -352473.91         -29.84%
std::find_end(vector<int>)_(process_all)/1000                         231.14          235.80          4.66           2.01%
std::find_end(vector<int>)_(process_all)/1024                         235.87          232.06         -3.81          -1.61%
std::find_end(vector<int>)_(process_all)/1048576                   239922.25       238229.38      -1692.87          -0.71%
std::find_end(vector<int>)_(process_all)/8192                        1837.43         1802.25        -35.19          -1.91%
std::find_end(vector<int>)_(same_length)/1000                         632.59          252.80       -379.79         -60.04%
std::find_end(vector<int>)_(same_length)/1024                         524.51          257.58       -266.94         -50.89%
std::find_end(vector<int>)_(same_length)/8192                        5159.01         2090.12      -3068.89         -59.49%
std::find_end(vector<int>)_(single_element)/1000                      229.56          250.47         20.91           9.11%
std::find_end(vector<int>)_(single_element)/1024                      234.86          252.18         17.32           7.37%
std::find_end(vector<int>)_(single_element)/8192                     1825.74         1981.90        156.16           8.55%
std::find_end(vector<int>,_pred)_(match_near_end)/1000                574.17           52.98       -521.19         -90.77%
std::find_end(vector<int>,_pred)_(match_near_end)/1024               3525.35           54.03      -3471.32         -98.47%
std::find_end(vector<int>,_pred)_(match_near_end)/8192             190155.81          423.41    -189732.40         -99.78%
std::find_end(vector<int>,_pred)_(near_matches)/1000                28541.98        24598.37      -3943.61         -13.82%
std::find_end(vector<int>,_pred)_(near_matches)/1024                29696.55        25675.27      -4021.28         -13.54%
std::find_end(vector<int>,_pred)_(near_matches)/8192              1846970.41      1596191.84    -250778.57         -13.58%
std::find_end(vector<int>,_pred)_(process_all)/1000                   519.71          592.14         72.43          13.94%
std::find_end(vector<int>,_pred)_(process_all)/1024                   529.74          491.07        -38.67          -7.30%
std::find_end(vector<int>,_pred)_(process_all)/1048576             631923.41       643729.57      11806.16           1.87%
std::find_end(vector<int>,_pred)_(process_all)/8192                  4215.05         3909.30       -305.75          -7.25%
std::find_end(vector<int>,_pred)_(same_length)/1000                  1095.46          524.99       -570.47         -52.08%
std::find_end(vector<int>,_pred)_(same_length)/1024                  1117.95          537.65       -580.31         -51.91%
std::find_end(vector<int>,_pred)_(same_length)/8192                  8923.95         4307.13      -4616.83         -51.74%
std::find_end(vector<int>,_pred)_(single_element)/1000                516.52          656.32        139.80          27.07%
std::find_end(vector<int>,_pred)_(single_element)/1024                528.82          673.72        144.90          27.40%
std::find_end(vector<int>,_pred)_(single_element)/8192               4210.37         5529.52       1319.15          31.33%
Geomean                                                              6995.43         3440.97      -3554.46         -50.81%
```
…iginally legal f64 values that we can store directly. (llvm#171602)

Based off feedback from llvm#171478
…m#171637)

They were using the wrong scheduler resource. They're also missing from
the optimisation guides, but WriteLD should be closer at least.
…lvm#169914)

This is technically ABI breaking, since `is_trivial` and
`is_trivially_default_constructible` now return different results.
However, I don't think that's a significant issue, since `allocator` is
almost always used in classes which own memory, making them non-trivial
anyways.
…m#169413)

We've seen in quite a few cases while optimizing `__tree`'s copy
construction that `_DetachedTreeCache` is actually quite slow and not
necessarily an optimization at all. This patch removes the code, since
it's now only used by `operator=(initializer_list)`, which should be
quite cold code. We might look into actually optimizing it again in the
future, but I doubt an optimization will be small enough compared to the
likely speedup in real-world code this would give.
…lvm#165160)

This removes a bit of code duplication and might simplify future
segmented iterator optimizations.
Adding Annotation Inference in Lifetime Analysis.

This PR implicitly adds lifetime bound annotations to the AST, which the
analysis then uses when checking functions parsed later, to detect UARs etc.
Example:

```cpp
std::string_view f1(std::string_view a) {
  return a;
}

std::string_view f2(std::string_view a) {
  return f1(a);
}

std::string_view ff(std::string_view a) {
  std::string stack = "something on stack";
  return f2(stack); // warning: address of stack memory is returned
}
```

Note:

1. We only add lifetime bound annotations to the functions being
analyzed currently.
2. Currently, both annotation suggestion and inference work
simultaneously. This can be modified based on requirements.
3. The current approach assumes that functions are already present in
callee-before-caller order. For other orderings, we can create a
CallGraph prior to calling the analysis. This can be done in the next
PR.
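
For comparison, here is a minimal sketch of what the inferred annotations would look like if written by hand, assuming inference attaches the equivalent of `[[clang::lifetimebound]]` to the parameter that is returned. The `LIFETIMEBOUND` macro and the `viewLength` helper are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical portability macro: the attribute is Clang-specific.
#if defined(__clang__)
#define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND
#endif

// The returned view refers to the same storage as `a`.
std::string_view f1(std::string_view a LIFETIMEBOUND) { return a; }

// f2 forwards to f1, so its result is bound to `a` as well.
std::string_view f2(std::string_view a LIFETIMEBOUND) { return f1(a); }

// Safe use: the view is consumed while `s` is still alive.
inline std::size_t viewLength(const std::string &s) { return f2(s).size(); }
```

With explicit annotations like these, a caller that returns `f2(stack)` for a local `std::string stack` can be diagnosed without looking inside `f1` or `f2`.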
Depends upon llvm#170900

Re-land llvm#169544

Previously we were less specific for POINTER/TARGET: encoding that they
could alias with (almost) anything.

In the new system, the "target data" tree is now a sibling of the other
trees (e.g. "global data"). POINTER variables go at the root of the
"target data" tree, whereas TARGET variables get their own nodes under
that tree. For example,

```
integer, pointer :: ip
real, pointer :: rp
integer, target :: it
integer, target :: it2(:)
real, target :: rt
integer :: i
real :: r
```
- `ip` and `rp` may alias with any variable except `i` and `r`.
- `it`, `it2`, and `rt` may alias only with `ip` or `rp`.
- `i` and `r` cannot alias with any other variable.
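
The three rules above can be reproduced with a small tree model. This is a sketch, not the Flang implementation: it assumes the usual TBAA-style rule that two accesses may alias only if one node is an ancestor of (or the same as) the other, and the node names, the common root, and the placement of `i` and `r` under "global data" are illustrative assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: each node maps to its parent ("" marks the root).
using Tree = std::map<std::string, std::string>;

// True if `anc` is `node` itself or one of its ancestors.
bool isAncestorOrSelf(const Tree &t, const std::string &anc, std::string node) {
  while (true) {
    if (node == anc)
      return true;
    auto it = t.find(node);
    if (it == t.end() || it->second.empty())
      return false;
    node = it->second;
  }
}

// Two accesses may alias only if their nodes lie on one root-to-leaf path.
bool mayAlias(const Tree &t, const std::string &a, const std::string &b) {
  return isAncestorOrSelf(t, a, b) || isAncestorOrSelf(t, b, a);
}

Tree exampleTree() {
  return {
      {"any data access", ""},
      // POINTER accesses (ip, rp) use the "target data" node itself.
      {"target data", "any data access"},
      // Each TARGET variable gets its own node under "target data".
      {"it", "target data"},
      {"it2", "target data"},
      {"rt", "target data"},
      // Sibling tree for non-target data (assumed placement of i and r).
      {"global data", "any data access"},
      {"i", "global data"},
      {"r", "global data"},
  };
}
```

In this model `ip` and `rp` both map to "target data", so they may alias each other and every TARGET node, while `it` and `rt` are siblings and do not alias one another.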

Fortran 2023 15.5.2.14 gives restrictions on entities associated with
dummy arguments. These do not allow non-target globals to be modified
through dummy arguments and therefore I don't think we need to make all
globals alias with dummy arguments.

I haven't implemented it in this patch, but I wonder whether it is ever
possible for `ip` to alias with `rt`.

While I was updating the tests I fixed up some tests that still assumed
that local alloc tbaa wasn't the default.

Cray pointers/pointees are (optionally) modelled as aliasing with all
non-descriptor data. This is not enabled by default.

I found no functional regressions in the gfortran test suite.
…m#170323) (llvm#171787)

```
Step 7 (test-check-all) failure: Test just built components: check-all completed (failure)
******************** TEST 'LLVM :: CodeGen/AMDGPU/insert_vector_dynelt.ll' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# RUN: at line 3
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck --check-prefixes=GCN-O0 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji
# .---command stderr------------
# |
# | # After Instruction Selection
# | # Machine code for function insert_dyn_i32_6: IsSSA, TracksLiveness
# | Function Live Ins: $sgpr16 in %8, $sgpr17 in %9, $sgpr18 in %10, $sgpr19 in %11, $sgpr20 in %12, $sgpr21 in %13, $vgpr0 in %14, $vgpr1 in %15
# |
# | bb.0 (%ir-block.0):
# |   successors: %bb.1(0x80000000); %bb.1(100.00%)
# |   liveins: $sgpr16, $sgpr17, $sgpr18, $sgpr19, $sgpr20, $sgpr21, $vgpr0, $vgpr1
# |   %15:vgpr_32 = COPY $vgpr1
# |   %14:vgpr_32 = COPY $vgpr0
# |   %13:sgpr_32 = COPY $sgpr21
# |   %12:sgpr_32 = COPY $sgpr20
# |   %11:sgpr_32 = COPY $sgpr19
# |   %10:sgpr_32 = COPY $sgpr18
# |   %9:sgpr_32 = COPY $sgpr17
# |   %8:sgpr_32 = COPY $sgpr16
# |   %17:sgpr_192 = REG_SEQUENCE %8:sgpr_32, %subreg.sub0, %9:sgpr_32, %subreg.sub1, %10:sgpr_32, %subreg.sub2, %11:sgpr_32, %subreg.sub3, %12:sgpr_32, %subreg.sub4, %13:sgpr_32, %subreg.sub5
# |   %16:sgpr_192 = COPY %17:sgpr_192
# |   %19:vreg_192 = COPY %17:sgpr_192
# |   %28:sreg_64_xexec = IMPLICIT_DEF
# |   %27:sreg_64_xexec = S_MOV_B64 $exec
# |
# | bb.1:
# | ; predecessors: %bb.1, %bb.0
# |   successors: %bb.1(0x40000000), %bb.3(0x40000000); %bb.1(50.00%), %bb.3(50.00%)
# |
# |   %26:vreg_192 = PHI %19:vreg_192, %bb.0, %18:vreg_192, %bb.1
# |   %29:sreg_64 = PHI %28:sreg_64_xexec, %bb.0, %30:sreg_64, %bb.1
# |   %31:sreg_32_xm0 = V_READFIRSTLANE_B32 %14:vgpr_32, implicit $exec
# |   %32:sreg_64 = V_CMP_EQ_U32_e64 %31:sreg_32_xm0, %14:vgpr_32, implicit $exec
# |   %30:sreg_64 = S_AND_SAVEEXEC_B64 killed %32:sreg_64, implicit-def $exec, implicit-def $scc, implicit $exec
# |   $m0 = COPY killed %31:sreg_32_xm0
# |   %18:vreg_192 = V_INDIRECT_REG_WRITE_MOVREL_B32_V8 %26:vreg_192(tied-def 0), %15:vgpr_32, 3, implicit $m0, implicit $exec
# |   $exec = S_XOR_B64_term $exec, %30:sreg_64, implicit-def $scc
# |   S_CBRANCH_EXECNZ %bb.1, implicit $exec
# |
# | bb.3:
```

This reverts commit 15df9e7.
…m#171725)

Currently fmul is not reassociated unless it has nsz, although
this should be unnecessary.
…lvm#171158)

Add additional bound for the induction variable of the scf.forall such
that:
%iv <= %lower_bound + (%trip_count - 1) * step

Same as llvm#126426 but for
scf.forall loop
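
As a worked illustration of the bound, the sketch below computes the trip count and the largest induction value for a loop with a positive step. The helper names are hypothetical and the arithmetic assumes `step > 0` and no overflow; it is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Iterations of `for (iv = lb; iv < ub; iv += step)`, assuming step > 0.
constexpr std::int64_t tripCount(std::int64_t lb, std::int64_t ub,
                                 std::int64_t step) {
  return ub > lb ? (ub - lb + step - 1) / step : 0;
}

// Largest value iv takes: lb + (tripCount - 1) * step.
// Only meaningful when the loop runs at least once.
constexpr std::int64_t maxInductionValue(std::int64_t lb, std::int64_t ub,
                                         std::int64_t step) {
  return lb + (tripCount(lb, ub, step) - 1) * step;
}

// lb = 0, ub = 10, step = 4 visits 0, 4, 8, so the bound is 8, not ub - 1 = 9.
static_assert(tripCount(0, 10, 4) == 3, "three iterations");
static_assert(maxInductionValue(0, 10, 4) == 8, "tighter than ub - 1");
```

The extra bound matters when the step does not evenly divide the range, because it is then strictly tighter than `ub - 1`.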
This patch also updates the lowering of the `id`-based pmevent to
intrinsics. The mask is simply (1 << event-id).

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
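
The id-to-mask mapping mentioned in the pmevent change above can be sketched as follows. The helper name is hypothetical, and the sketch assumes event ids in the range 0..15 so that the result fits a 16-bit mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: one-hot mask for a pmevent id (assumed to be 0..15).
constexpr std::uint16_t pmeventMaskForId(unsigned eventId) {
  return static_cast<std::uint16_t>(1u << eventId);
}

static_assert(pmeventMaskForId(0) == 0x0001, "id 0 sets bit 0");
static_assert(pmeventMaskForId(3) == 0x0008, "id 3 sets bit 3");
```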
This function contains most of the logic for BTI:
- It takes the BasicBlock and the instruction used to jump to it.
- Then it checks if the first non-pseudo instruction is a sufficient
landing pad for the used call.
- If not, it generates the correct BTI instruction.

Also introduce the isCallCoveredByBTI helper to simplify the logic.
nsz can only change the behavior of the sign bit.
The sign bit for fmul can be implemented as xor,
which is associative. DAGCombiner already reassociates
the multiply by 2 constants without nsz.

Fixes llvm#64967
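
The sign-bit argument can be checked with a small sketch. It assumes non-NaN operands and default floating-point semantics (no fast-math); `productSignBit` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// For non-NaN operands, the sign bit of a * b is the XOR of the operand
// sign bits, including when the result is a signed zero.
inline bool productSignBit(double a, double b) {
  return std::signbit(a) != std::signbit(b);
}

// XOR is associative, so both groupings of a * b * c agree on the sign bit.
inline bool signAgreesAcrossGroupings(double a, double b, double c) {
  return std::signbit((a * b) * c) == std::signbit(a * (b * c));
}
```

For example, with `a = -0.0`, `b = 3.0`, `c = -5.0`, both groupings produce `+0.0`, so reassociation does not change the sign of the zero result.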
This patch adds TLS support for SystemZ on top of orc-runtime support. The
orc-runtime support was split out into a separate PR, llvm#171062, from the
earlier TLS support PR [llvm#170706](llvm#170706).

See conversations in
[llvm#170706](llvm#170706)

---------

Co-authored-by: anoopkg6 <anoopkg6@github.com>
llvm#171797)

This patch fixes toolchain-msvc.test on Windows ARM64 hosts running
under native ARM64 environment via vcvarsarm64.bat. Our lab buildbot
recently switched from using cross vcvarsamd64_arm64.bat environment to
native vcvarsarm64.bat. This patch updates FileCheck patterns to also
allow HostARM64 and arm64 PATH entries.

Changes:
- Extend host regex to match HostARM64 (case-insensitive).
- Allow arm64 in PATH tail.
- Apply same fix in both 32-bit and 64-bit sections.
@z1-cciauto z1-cciauto requested a review from a team December 11, 2025 11:23
@z1-cciauto z1-cciauto merged commit a8a8321 into amd-staging Dec 11, 2025
19 checks passed
@z1-cciauto z1-cciauto deleted the upstream_merge_202512110623 branch December 11, 2025 14:12