532822 Commits

Author SHA1 Message Date
George Burgess IV
a8585654c2
[llvm][utils] skip revert-checking reverts across branches (#134108)
e2ba1b6ffde4ec607342b1b746d1b57f0f04390a references that it reverts a
commit that's not a parent of e2ba1b6ffde4ec607342b1b746d1b57f0f04390a.

Functionally, this can (and demonstrably does) work(*), but from the
standpoint of the revert checker, it's nonsense. Print a `logging.error`
when it's detected.

Tested by running the revert checker against a commit range that
includes the aforementioned commit; the logging.error was fired
appropriately.

(*) - the specifics here are:
- the _SHA_ that was referenced was on a non-main branch, but
- the commit from the non-main branch was merged into the non-main
branch from main
- ...so the _functional_ commit being reverted was originally landed on
main, but the _SHA_ referenced from main was from a branch that was cut
before the reverted-commit was landed on main
2025-04-02 13:44:18 -07:00
Abhinav Kumar
51d1c72886
[libc] Added support for fixed-points in `is_signed and is_unsigned`. (#133371)
Fixes #133365

## Changes Done
- Changed the signed checking to 
```cpp
struct is_signed : bool_constant<((is_fixed_point<T> || is_arithmetic_v<T>) && (T(-1) < T(0)))>
```
in ``/libc/src/__support/CPP/type_traits/is_signed.h``. Added check for
fixed-points.
- But, got to know that this will fail for ``unsigned _Fract`` or any
unsigned fixed-point because ``unsigned _Fract`` can’t represent -1 in
T(-1), while ``unsigned int`` can handle it via wrapping.
- That's why I explicity added ``is_signed`` check for ``unsigned``
fixed-points.
- Same changes to ``/libc/src/__support/CPP/type_traits/is_unsigned.h``.
- Added tests for ``is_signed`` and ``is_unsigned``.
2025-04-02 16:41:47 -04:00
Krzysztof Drewniak
554859c736
[TTI] Make isLegalMasked{Load,Store} take an address space (#134006)
In order to facilitate targets that only support masked loads/stores
on certain address spaces (AMDGPU will support them in an upcoming
patch, but only for address space 7), add an AddressSpace parameter
to isLegalMaskedLoad and isLegalMaskedStore
2025-04-02 15:38:10 -05:00
erichkeane
bb8a7a7349 [OpenACC] Implement 'pqr-list' has at least one 1 item.
OpenACC Github PR#499 defines the pqr-list as having at least 1 item. We
already handle that for all but 'wait', so this patch just does the work
to add it for 'wait', plus adds tests.
2025-04-02 13:33:18 -07:00
Andrzej Warzyński
2bee24632f
[mlir][bugfix] Fix erroneous condition in getEffectsOnResource (#133638)
This patch corrects an invalid condition in `getEffectsOnResource` used
to identify relevant "resources":

```cpp
return it.getResource() != resource;
```

The current implementation assumes that only one instance of each
resource will exist, so comparing raw pointers is both safe and
sufficient. This assumption stems from constructs like:

```cpp
static DerivedResource *get() {
  static DerivedResource instance;
  return &instance;
}
```
i.e., resource instances returned via static singleton methods.

However, as discussed in
 * https://github.com/llvm/llvm-project/issues/129216,

this assumption breaks in practice — notably on macOS (Apple Silicon)
when built with:

* `-DBUILD_SHARED_LIBS=On`.

In such cases, multiple instances of the same logical resource may exist
across shared library boundaries, leading to incorrect behavior and
causing failures in tests like:

* test/Dialect/Transform/check-use-after-free.mlir

This patch replaces the pointer comparison with a comparison based on
resource identity:

```cpp
return it.getResource()->getResourceID() != resource->getResourceID();
```
This approach aligns better with the intent of `getEffectsOnResource`,
which is to:

```cpp
/// Collect all of the effect instances that operate on the provided
/// resource (...)
```

Fixes #129216
2025-04-02 21:26:41 +01:00
Luke Lau
df9e5ae5b4
[InstCombine] Match scalable splats in m_ImmConstant (#132522)
#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.

However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.

This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.

But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.

I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2025-04-02 21:21:52 +01:00
Krzysztof Parzyszek
564e04b703
[flang][OpenMP] Use function symbol on DECLARE TARGET (#134107)
Consider:
```
function foo()
  !$omp declare target(foo) ! This `foo` was a function-result symbol
  ...
end
```
When resolving symbols, for this case use the symbol corresponding to
the function instead of the symbol corresponding to the function result.

Currently, this will result in an error:
```
error: A variable that appears in a DECLARE TARGET directive must be
declared in the scope of a module or have the SAVE attribute, either
explicitly or implicitly
```
2025-04-02 15:16:33 -05:00
David Peixotto
2026873fb8
Add enable/disable api for SystemRuntime plugins (#133794)
This commit adds support for enabling and disabling plugins by name. The
changes are made generically in the `PluginInstances` class, but
currently we only expose the ability to SystemRuntime plugins. Other
plugins types can be added easily.

We had a few design goals for how disabled plugins should work

1. Plugins that are disabled should still be visible to the system. This
allows us to dynamically enable and disable plugins and report their
state to the user.
2. Plugin order should be stable across disable and enable changes. We
want avoid changing the order of plugin lookup. When a plugin is
re-enabled it should return to its original slot in the creation order.
3. Disabled plugins should not appear in PluginManager operations.
Clients should be able to assume that only enabled plugins will be
returned from the PluginManager.

For the implementation we modify the plugin instance to maintain a bool
of its enabled state. Existing clients external to the Instances class
expect to iterate over only enabled instance so we skip over disabed
instances in the query and snapshot apis. This way the client does not
have to manually check which instances are enabled.
2025-04-02 13:15:31 -07:00
Nikolas Klauser
c59d3a2684
[libc++] Add visibility annotations to the std namespace with GCC (#133233)
This allows us to remove the need for `_LIBCPP_TEMPLATE_VIS` and fixes a
bunch of missing annotations for RTTI when used across dylib boundaries.
`_LIBCPP_TEMPLATE_VIS` itself will be removed in a separate patch, since
it touches a lot of code.

This patch is a no-op for Clang. Only GCC is affected.
2025-04-02 22:12:59 +02:00
Brox Chen
fb0e7b5f16
[AMDGPU][True16][CodeGen] Implement sgpr folding in true16 (#128929)
We haven't implemented 16 bit SGPRs. Currently allow 32-bit SGPRs to be
folded into True16 bit instructions taking 16 bit values. Also use
sgpr_32 when Imm is copied to spgr_lo16 so it could be further folded.
This improves generated code quality.
2025-04-02 16:08:26 -04:00
ofri frishman
6f1347d57b
[MLIR] Bubble up tensor.extract_slice through tensor.collapse_shape (#131982)
Add a pattern that bubbles up tensor.extract_slice through
tensor.collapse_shape.
The pattern is registered in a pattern population function that is used
by the transform op
transform.apply_patterns.tensor.bubble_up_extract_slice and by the
tranform op transform.structured.fuse as a cleanup pattern.
This pattern enables tiling and fusing op chains which contain
tensor.collapse_shape if added as a cleanup pattern of tile and fuse
utility.
Without this pattern that would not be possible, as
tensor.collapse_shape does not implement the tiling interface. This is
an additional pattern to the one added in PR #126898
2025-04-02 21:06:43 +01:00
Jonas Devlieghere
c87dc2b7d4
[lldb-dap] Speed up TestDAP_Progress (#134048)
While trying to make progress on #133782, I noticed that
TestDAP_Progress was taking 90 seconds to complete. This patch brings
that down to 10 seocnds by making the following changes:

1. Don't call `wait_for_event` with a 15 second timeout. By the time we
call this, all progress events have been emitted, which means that we're
just sitting there until we hit the timeout.

2. Don't use 10 steps (= 10 seconds) for indeterminate progress. We have
two indeterminate progress tests so that's 6 seconds instead of 20.

3. Don't launch the process over and over. Once we have a dap session,
we can clear the progress vector and emit new progress events.
2025-04-02 12:41:47 -07:00
Florian Hahn
3bdf9a0880
[EquivalenceClasses] Use SmallVector for deterministic iteration order. (#134075)
Currently iterators over EquivalenceClasses will iterate over std::set,
which guarantees the order specified by the comperator. Unfortunately in
many cases, EquivalenceClasses are used with pointers, so iterating over
std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence
classes before using them to try to get a deterministic order
(LowerTypeTests, SplitModule), but there are others that do not at the
moment and this can result at least in non-determinstic value naming in
Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a
extra SmallVector and removes code from LowerTypeTests and SplitModule
to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but
close to noise:

https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions

PR: https://github.com/llvm/llvm-project/pull/134075
2025-04-02 20:27:43 +01:00
Sarah Spall
60efed3f20
[HLSL] Update __builtin_hlsl_dot builtin Sema Checking to fix error when passed an array literal 1u.xxxx (#133941)
update dot builtin sema checking and codegen
new test 
fix tests
Closes #133659
2025-04-02 12:27:01 -07:00
Alexey Bataev
843ef77dc2 [SLP]Update mapping between values and their matching entries upon selection
Need to update the mapping between gathered values and their matching
entries, if the list of the entries is updated and only some of them are
selected for final shuffling.

Fixes #134085
2025-04-02 11:59:32 -07:00
Amy Huang
f475ccd379
Fix to the libc BUILD.bazel file after changing atan_utils.h deps. (#134128)
Additional fix for libc BUILD.bazel after commit 8741412 (#133980)

This seems to match libc/src/math/generic/CMakeLists.txt.
2025-04-02 11:05:36 -07:00
Arvind Sudarsanam
32dff27060
[clang-sycl-linker] Fix flaky failure and add REQUIRES (Try #2) (#134130)
This should fix failures caused by
https://github.com/llvm/llvm-project/pull/133967
Attn: @sarnex
Thanks

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-04-02 18:05:17 +00:00
Juan Manuel Martinez Caamaño
beae0e9f1a
[AMDGPU] Use a target feature to enable __builtin_amdgcn_global_load_lds on gfx9/10 (#133055)
This patch introduces the `vmem-to-lds-load-insts` target feature, which
can be used to enable builtins `__builtin_amdgcn_global_load_lds` and
`__builtin_amdgcn_raw_ptr_buffer_load_lds` on platforms which have this
feature.

This feature is only available on gfx9/10.

A limitation of using a common target feature for both builtins is that
we could have made `__builtin_amdgcn_raw_ptr_buffer_load_lds` available
on gfx6,7,8.
2025-04-02 20:00:09 +02:00
Juan Manuel Martinez Caamaño
0375ef07c3
[Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (#133741)
This built-in maps to `V_CVT_OFF_F32_I4` which treats its input as
a 4-bit signed integer and returns `0.0625f * src`.

SWDEV-518861
2025-04-02 19:51:40 +02:00
Nick Sarnie
540dd89778
Revert "[clang-sycl-linker] Fix flaky failure and add REQUIRES" (#134127)
Reverts llvm/llvm-project#134125
2025-04-02 17:44:23 +00:00
Jorge Gorbe Moya
a57023b6a0 Add missing include for llvm::Error after 74ec038ffb34575ee93fa313cd0ea0db0c0a7e0a 2025-04-02 10:43:05 -07:00
Arvind Sudarsanam
4688719cf5
[clang-sycl-linker] Fix flaky failure and add REQUIRES (#134125)
This should fix failures caused by
https://github.com/llvm/llvm-project/pull/133967
Attn: @sarnex 
Thanks

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-04-02 17:38:11 +00:00
Adrian Prantl
a3ac318e5f [lldb] Skip test with older version of clang 2025-04-02 10:31:44 -07:00
Kazu Hirata
aa33c09561 [flang] Fix a warning
This patch fixes:

  flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp:184:18: error:
  unused variable 'loc' [-Werror,-Wunused-variable]
2025-04-02 10:14:50 -07:00
Snehasish Kumar
c18994c7cd
[Metadata] Preserve MD_prof when merging instructions when one is missing. (#132433)
Preserve branch weight metadata when merging instructions if one of the
instructions is missing metadata. This is similar in behaviour to what
we do today for other types of metadata such as mmra, memprof and
callsite metadata.
2025-04-02 11:13:45 -06:00
Snehasish Kumar
dde0be9d97
[Metadata] Handle memprof, callsite merging when one is missing. (#132106)
For memprof and callsite metadata we want to pick one deterministically
and keep that even if one of them may be missing.
2025-04-02 11:10:02 -06:00
erichkeane
d7724c8ea3 [OpenACC] allow 'if' clause on 'atomic' construct
This was added in OpenACC PR #511 in the 3.4 branch.  From an AST/Sema
perspective this is pretty trivial as the infrastructure for 'if'
already exists, however the atomic construct needed to be taught to take
clauses.  This patch does that and adds some testing to do so.
2025-04-02 10:03:24 -07:00
Fangrui Song
b6e2df54c4 [MC] Move some member variables from AsmParser to MCAsmParser
to eliminate some virtual functions and avoid duplication
between AsmParser/MasmParser.
2025-04-02 09:59:18 -07:00
Luke Lau
711b15d179
[RISCV] Mark subvector extracts from index 0 as cheap (#134101)
Previously we only marked fixed length vector extracts as cheap, so this
extends it to any extract at index 0 which should just be a subreg
extract.

This allows extracts of i1 vectors to be considered for DAG combines,
but also scalable vectors too.

This causes some slight improvements with large legalized fixed-length
vectors, but the underlying motiviation for this is to actually prevent
an unprofitable DAG combine on a scalable vector in an upcoming patch.
2025-04-02 17:57:13 +01:00
Ryan Buchner
fa2a6d68c6
[CodeGenPrepare][RISCV] Combine (X ^ Y) and (X == Y) where appropriate (#130922)
Fixes #130510.

In RISCV, modify the folding of (X ^ Y == 0) -> (X == Y) to account for
cases where the (X ^ Y) will be re-used.

If a constant is being used for the XOR before a branch, ensure that it
is small enough to fit within a 12-bit immediate field. Otherwise, the
equality check is more efficient than the check against 0, see the
following:
```
# %bb.0:
        lui     a1, 5
        addiw   a1, a1, 1365
        xor     a0, a0, a1
        beqz    a0, .LBB0_2
# %bb.1: 
        ret
.LBB0_2: 
```

```
# %bb.0:
        lui     a1, 5
        addiw   a1, a1, 1365
        beq    a0, a1, .LBB0_2
# %bb.1: 
        xor     a0, a0, a1
        ret
.LBB0_2: 
```

Similarly, if the XOR is between 1 and a size one integer, we should
still fold away the XOR since that comparison can be optimized as a
comparison against 0.
```
# %bb.0:
        slt a0, a0, a1
        xor  a0, a0, 1
        beqz    a0, .LBB0_2
# %bb.1: 
        ret
.LBB0_2: 
```

```
# %bb.0:
        slt a0, a0, a1
        bnez    a0, .LBB0_2
# %bb.1: 
        xor  a0, a0, 1
        ret
.LBB0_2: 
```

One question about my code is that I used a hard-coded value for the
width of a RISCV ALU immediate. Do you know of a way that I can gather
this from the `context`, I was unable to devise one.
2025-04-02 09:56:09 -07:00
Nikita Popov
74ec038ffb [OMPIRBuilder] Don't include MemorySSAUpdater.h (NFC)
This header does not use MemorySSA in any way -- it was just using
a typedef from it. Write out the type instead.
2025-04-02 18:48:51 +02:00
AdityaK
340f06a8d4
Fix: bail out when divisor is zero (#133518)
Fixes: #131279
2025-04-02 09:44:18 -07:00
Matt Arsenault
efca37fda5
llvm-reduce: Change exit code for uninteresting inputs (#134021)
This makes it easier to reduce llvm-reduce with llvm-reduce to filter
cases where the input reduced too much.

Not sure if it's possible to test the exit code in lit.
2025-04-02 23:43:06 +07:00
Fraser Cormack
ddc48fefe3
[libclc] Move native_(exp10|powr|tan) to CLC library (#134080)
These are the three remaining native builtins not yet ported.

There are elementwise versions of exp10 and tan which correspond to the
intrinsics, which may be preferable to the current versions which route
through other native builtins. Those could be changed in a follow-up if
desired.
2025-04-02 17:37:17 +01:00
Arvind Sudarsanam
4a4d41e723
[SYCL][SPIR-V Backend][clang-sycl-linker] Add SPIR-V backend support inside clang-sycl-linker (#133967)
This PR does the following:

1. Use SPIR-V backend to do LLVM to SPIR-V translation inside
clang-sycl-linker
2. Remove llvm-spirv translator from clang-sycl-linker Currently, no
SPIR-V extensions are enabled for SYCL compilation flow. This will be
updated in subsequent commits.

Thanks

Note: This is one of the many PRs being introduced to add SYCL
programming model support to LLVM
([RFC](https://discourse.llvm.org/t/rfc-add-sycl-programming-model-support/50812)).

---------

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-04-02 16:29:41 +00:00
Fehr Mathieu
8b67f36258
[mlir] [arith] Fix ceildivsi lowering in arith-expand (#133774)
This fixes the current lowering of `arith.ceildivsi` in the arith-expand
pass, which was previously incorrect. The new version is based on the
lowering of `arith.floordivsi`, and will not introduce new undefined
behavior or poison during the lowering. It also replaces one division
with a multiplication.

The previous lowering of `ceildivsi(n, m)` was the following:
```
x = (m > 0) ? -1 : 1
(n*m>0) ? ((n+x) / m) + 1 : - (-n / m)
```

This caused two problems:
* In the case where `n` is INT_MIN and `m` is positive, the result would
be poison instead of an actual value
* In the case where `n` is INT_MAX and `m` is `-1`, this would trigger
undefined behavior, while the original code wouldn't. This is because
`n+x` would be equal to `INT_MIN` (`INT_MAX + 1`), so the `(n+x) / m`
division would overflow and trigger UB.
2025-04-02 17:26:58 +01:00
Daniel Thornburgh
e84b57dfbf
[LLD][ELF] Support OVERLAY NOCROSSREFS (#133807)
This allows NOCROSSREFS to be specified in OVERLAY linker script
descriptions. This is a particularly useful part of the OVERLAY syntax,
since it's very rarely possible for one overlay section to sensibly
reference another.

Closes #128790
2025-04-02 09:25:18 -07:00
Jonas Devlieghere
6f959a46c0
[lldb] uint8_t -> unsigned short in std::independent_bits_engine
According to [1], the template parameter must be cv-unqualified and one
of unsigned short, unsigned int, unsigned long, or unsigned long long.

Should fix the following MSVC error:

  error: static assertion failed due to requirement
  '_Is_any_of_v<unsigned char, unsigned short, unsigned int, unsigned
  long, unsigned long long>': invalid template argument for
  independent_bits_engine: N4659

[1] https://en.cppreference.com/w/cpp/numeric/random/independent_bits_engine
2025-04-02 09:23:35 -07:00
Ramkumar Ramachandra
00122bb56b
[LV] Regen a test with UTC (#134076) 2025-04-02 17:23:00 +01:00
Ramkumar Ramachandra
f7591ee161
[LV] Exercise type-mismatch with RT-check conflict rdx (#130295)
The test suite of LoopVectorize suffers from a coverage hole when types
mismatch, and runtime checks are needed, with a conflict redux. Fix this
coverage hole by adding tests.
2025-04-02 17:22:40 +01:00
Shilei Tian
84cb08e118
[MLIR][AMDGPU] Bump to COV6 (#133849)
We already bump to COV6 by default in the front-end and backend. This PR
is for MLIR.

Note that COV6 requires ROCm 6.3+.
2025-04-02 12:14:24 -04:00
Andrzej Warzyński
7dce16a0e5
[docs][GitHub] Update docs for stacked PRs (#132424) 2025-04-02 17:00:49 +01:00
Nikolas Klauser
04676c6160
Revert "Enable unnecessary-virtual-specifier by default" (#134105)
Reverts llvm/llvm-project#133265

This causes the whole libc++ CI to fail, since we're not building
against a compiler built from current trunk. Specifically, the CMake
changes causes some feature detection to fail, resulting in CMake
being unable to configure libc++.
2025-04-02 17:59:08 +02:00
Jonas Devlieghere
ee60f7c7a4
[lldb] Hoist UUID generation into the UUID class (#133662)
Hoist UUID generation into the UUID class and add a trivial unit test.
This also changes the telemetry code to drop the double underscore if we
failed to generate a UUID and subsequently logs to the Host instead of
Object log channel.
2025-04-02 08:50:28 -07:00
Luke Lau
c132bd6885 [RISCV] Add test for vmv.s.x of an immediate into a zeroinitializer vector. NFC
The immediate version of ffaaaceaa1cfaa7103196cc7f307ffcb61d73558
2025-04-02 16:48:57 +01:00
Matt Arsenault
2b7fe90a01
llvm-reduce: Avoid worklist in simplify-instruction (#134066) 2025-04-02 22:21:28 +07:00
Peng Liu
e6c2fdc90f
[libc++] Fix ambiguous call to std::max in vector<bool> (#119801)
Closes #121713.
2025-04-02 11:14:14 -04:00
Luke Lau
ffaaaceaa1 [RISCV] Add test for vmv.s.x into a zeroinitializer vector. NFC
This is generated by the loop vectorizer for out-of-loop add
reductions with some starting value
2025-04-02 16:09:44 +01:00
Simon Pilgrim
3843dfeaf7 [X86] Add demanded elts test coverage for vXi16 VPERMW nodes
Requested for #133923
2025-04-02 15:32:01 +01:00
yonghong-song
f99072bd8c
[Clang][BPF] Add tests for btf_type_tag c2x-style attributes (#133666)
For btf_type_tag implementation, in order to have the same results with
clang (__attribute__((btf_type_tag("...")))), gcc intends to use c2x
syntax '[[...]]'. Clang also supports similar c2x syntax. Currently, the
clang selftest contains the following five tests:
```
  attr-btf_type_tag-func.c
  attr-btf_type_tag-similar-type.c
  attr-btf_type_tag-var.c
  attr-btf_type_tag-func-ptr.c
  attr-btf_type_tag-typedef-field.c
```

Tests attr-btf_type_tag-func.c and attr-btf_type_tag-var.c already have
c2x syntax test.

Test attr-btf_type_tag-func-ptr.c does not support c2x syntax when
'__attribute__((...))' is replaced with with '[[...]]'. This should not
be an issue since we do not have use cases for function pointer yet.

This patch added '[[...]]' syntax for
```
  attr-btf_type_tag-similar-type.c
  attr-btf_type_tag-typedef-field.c
```
2025-04-02 07:31:32 -07:00