532151 Commits

Author SHA1 Message Date
Simon Pilgrim
5e0e04f087
[X86] combineX86ShufflesRecursively - replace Root node argument with opcode/valuetype/ismaskedshuffle data. NFC. (#132437)
Preparatory cleanup patch that makes it easier for combineX86ShufflesRecursively/combineX86ShuffleChain to handle length-changing shuffles further up the shuffle chain than combineX86ShuffleChainWithExtract can manage.

Instead of passing the original Root node, pass the root opcode and the current effective value type (which may have widened as we recurse through EXTRACT_SUBVECTOR/TRUNCATE nodes etc.).
2025-03-25 10:20:23 +00:00
Ramkumar Ramachandra
e8d882a95b
[LV] Audit and fix nits in cl::opts (NFC) (#130601)
Non-static cl::opts should be under the llvm namespace.
2025-03-25 10:19:45 +00:00
Benjamin Maxwell
107260cc29
[AArch64][SME2] Don't preserve ZT0 around SME ABI routines (#132722)
This caused ZT0 to be preserved around `__arm_tpidr2_save` in functions
with "aarch64_new_zt0". The block in which `__arm_tpidr2_save` is called
is added by the SMEABIPass and may be reachable in cases where ZA has
not been enabled* (so using `str zt0` is invalid).

* (when za_save_buffer is null and num_za_save_slices is zero)
2025-03-25 10:09:25 +00:00
Florian Hahn
9c7e38896f
[VPlan] Split off reduction printing tests, add find-last-IV test.
Splits off reduction printing tests, to limit growth and add test case
for printing find-last-IV (https://github.com/llvm/llvm-project/pull/132689)
2025-03-25 10:06:28 +00:00
Luke Hutton
d4570ea813
[mlir][tosa] Disallow invalid datatype combinations in the validation pass (#131595)
This commit checks whether the operands/results of an operator can be
found in the profile compliance mapping; if they cannot, the operator is
considered invalid. As a result, operator datatype combinations that are
not listed under the "Supported Data Types" of the TOSA specification
are disallowed and the validation pass fails.

Signed-off-by: Luke Hutton <luke.hutton@arm.com>
2025-03-25 10:05:39 +00:00
Akshat Oke
f8e908a0ed
[AMDGPU][NPM] Port SIInsertHardClauses to NPM (#130062) 2025-03-25 15:33:32 +05:30
Felipe de Azevedo Piovezan
07c82b1622
[lldb] Implement missing queue overloads from ThreadMemory (#132906)
This commit makes ThreadMemory a real "forwarder" class by implementing
the missing queue methods: they will just call the corresponding backing
thread method.

To make this patch NFC(*) and not change the behavior of the Python OS
plugin, NamedThreadMemoryWithQueue also overrides these methods to
simply call the `Thread` method, just as it was doing before. This also
makes it obvious that there are missing pieces of this class if it were
to provide full queue support.

(*) This patch is NFC in the sense that all llvm.org plugins will not
have any behavior change, but downstream consumers of ThreadMemory will
benefit from the newly implemented forwarding methods.
2025-03-25 06:52:07 -03:00
Felipe de Azevedo Piovezan
65ad02b882
[lldb][NFC] Break ThreadMemory into smaller abstractions (#132905)
ThreadMemory attempts to be a class abstracting the notion of a "fake"
Thread that is backed by a "real" thread. According to its
documentation, it is meant to be a class forwarding most methods to the
backing thread, but it does so only for a handful of methods.

Along the way, it also tries to represent a Thread that may or may not
have a different name, and may or may not have a different queue from
the backing thread. This can be problematic for a couple of reasons:

1. It forces all users into this optional behavior.
2. The forwarding behavior is incomplete: not all methods are currently
being forwarded properly. Some of them involve queues and seem to have
been intentionally left unimplemented.

This commit creates the following separation:

ThreadMemory <- ThreadMemoryProvidingName <-
ThreadMemoryProvidingNameAndQueue

ThreadMemory captures the notion of a backed thread that _really_
forwards all methods to the backing thread. (Missing methods should be
implemented in a later commit, and allowing them to be implemented
without changing behavior of other derived classes is the main purpose
of this refactor).

ThreadMemoryProvidingName is a ThreadMemory that allows users to
override the thread name. If a name is present, it is used; otherwise
the forwarding behavior is used.

ThreadMemoryProvidingNameAndQueue is a ThreadMemoryProvidingName that
allows users to override queue information. If queue information is
present, it is used; otherwise, the forwarding behavior is used.

With this separation, we can more explicitly implement missing methods
of the base class and override them, if needed, in
ThreadMemoryProvidingNameAndQueue. But this commit really is NFC, no new
methods are implemented and no method implementation is changed.
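
The separation described above can be sketched roughly as follows. This is
an illustration only, not the lldb code: `BackingThread`, the `*Sketch` class
names, the two getters, and the convention that an empty string means "no
override" are all assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-in for the "real" backing thread.
struct BackingThread {
  std::string name;
  std::string queue_name;
};

// Base class: forwards every query to the backing thread.
class ThreadMemorySketch {
public:
  explicit ThreadMemorySketch(const BackingThread &backing)
      : m_backing(backing) {}
  virtual ~ThreadMemorySketch() = default;
  virtual std::string GetName() const { return m_backing.name; }
  virtual std::string GetQueueName() const { return m_backing.queue_name; }

private:
  const BackingThread &m_backing;
};

// Adds an optional name override; an empty string means "not present"
// (an assumption of this sketch), in which case the call is forwarded.
class ThreadMemoryProvidingNameSketch : public ThreadMemorySketch {
public:
  ThreadMemoryProvidingNameSketch(const BackingThread &backing,
                                  std::string name)
      : ThreadMemorySketch(backing), m_name(std::move(name)) {}
  std::string GetName() const override {
    return m_name.empty() ? ThreadMemorySketch::GetName() : m_name;
  }

private:
  std::string m_name;
};

// Adds an optional queue override on top of the name override.
class ThreadMemoryProvidingNameAndQueueSketch
    : public ThreadMemoryProvidingNameSketch {
public:
  ThreadMemoryProvidingNameAndQueueSketch(const BackingThread &backing,
                                          std::string name,
                                          std::string queue_name)
      : ThreadMemoryProvidingNameSketch(backing, std::move(name)),
        m_queue_name(std::move(queue_name)) {}
  std::string GetQueueName() const override {
    return m_queue_name.empty()
               ? ThreadMemoryProvidingNameSketch::GetQueueName()
               : m_queue_name;
  }

private:
  std::string m_queue_name;
};
```

With this layout, a forwarding method added to the base class later is
inherited by both derived classes unless they choose to override it.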
2025-03-25 06:50:52 -03:00
Dhruv Srivastava
e6e8252ba0
[lldb][AIX] Minor AIX specific changes (#132718)
This PR is in reference to porting LLDB on AIX.

Link to discussions on llvm discourse and github:

1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. https://github.com/llvm/llvm-project/issues/101657
The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601

AIX build specific changes
2025-03-25 15:16:23 +05:30
Simon Pilgrim
6984cfea6c [X86] Ensure concat(blendi(),blendi()) -> vselect() uses legal select mask types
For 256-bit selections, we could be using sub-i8/vXi8 selection condition masks - extend these to i8 and then extract the lowest mask subvector

Fixes #132844
2025-03-25 09:14:08 +00:00
Fraser Cormack
d46a699953
[libclc] Move asin/acos/atan to the CLC library (#132788)
This commit moves these three functions to the CLC library and, at the
same time, optimizes them for vector types by avoiding scalarization.
2025-03-25 09:11:32 +00:00
Martin Storsjö
64779455b8 Revert "[YAML][NFC] precommit wrong test case (#131782)"
This reverts commit cb4ae35de0b4c19149379f16c7b279d80a669f9d.

That commit broke compilation with GCC:

../unittests/Support/YAMLIOTest.cpp:1280:20: error: explicit specialization of ‘template<class T> struct llvm::yaml::MappingTraits’ outside its namespace must use a nested-name-specifier [-fpermissive]
 1280 | template <> struct MappingTraits<V> {
      |                    ^~~~~~~~~~~~~~~~
2025-03-25 10:36:14 +02:00
Martin Storsjö
b33bec9b21 Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#127450)"
This reverts commit 71a0cfd93263552ddc0bfd2ea7b0abe9a578f87e.

This commit triggers failed asserts when compiling ffmpeg. The
issue is reproducible with a small standalone reproducer like this:

    void make_filters_from_proto(int *filter[][2], int bands) {
      int c, q, n;
      for (;; q++) {
        n = 0;
        for (; n < 7; n++) {
          int theta = (q * (n - 6) + (n >> 1) - 3) % bands;
          if (theta)
            c = theta;
          filter[q][n][0] = c;
        }
      }
    }

$ clang -target x86_64-linux-gnu -c repro.c -O3
clang: ../lib/Transforms/Vectorize/SLPVectorizer.cpp:989: llvm::SmallVector<llvm::Value*> {anonymous}::BinOpSameOpcodeHelper::InterchangeableInfo::getOperand(llvm::Instruction*) const: Assertion `FromCIValue.isZero() && "Cannot convert the instruction."' failed.

The same issue also reproduces for a large number of other target
triples, aarch64-linux-gnu and others.
2025-03-25 10:22:44 +02:00
Martin Storsjö
dd059338a2 Revert "[Vectorize] Fix a warning"
This reverts commit 4c68061254c896214b7ad5ab807ac4ba11517812.

Reverting as part of a revert of a preceding commit.
2025-03-25 10:21:05 +02:00
Aiden Grossman
e696f4e500 [llvm-exegesis] Fix LBR checks/test
This patch fixes the LBR check in the local lit config. The test would
segfault as the loop body would be basically empty, causing a divide by
zero error. More investigation is needed there so we do not actually hit
that assertion and report a cleaner error somewhere. Specifying an
actual opcode to benchmark fixes the problem. The test would also fail
as -mcpu was set to the default x86 CPU rather than the one currently
being run on, so it would always fail to find a perf counter. This patch
fixes that by simply removing the -mcpu flag.

Given these issues, I'm not sure these tests have ever run in the ~5
years they have been in tree. There were some issues reported in
#132861, so I guess we'll see if there are further issues when the
testing becomes more broad.
2025-03-25 08:10:58 +00:00
Ricardo Jesus
847e46ca01
[AArch64] Add initial support for -mcpu=olympus. (#132368)
This patch adds support for the NVIDIA Olympus core.

This does not add any special tuning decisions, and those may come
later.
2025-03-25 08:09:04 +00:00
Timm Baeder
9b060d1e6a
[clang][bytecode] Fix zero-init of atomic floating point objects (#132782)
We can't pass the AtomicType along to ASTContext::getFloatTypeSemantics.
2025-03-25 08:05:04 +01:00
Congcong Cai
cb4ae35de0
[YAML][NFC] precommit wrong test case (#131782) 2025-03-25 14:44:12 +08:00
Timm Baeder
bcedb368e3
[clang][bytecode] Support composite arrays in memcpy op (#132775)
See the attached test case.
2025-03-25 07:17:10 +01:00
Timm Bäder
1e2ad6793a Revert "[clang][bytecode] Implement __builtin_{wcscmp,wcsncmp} (#132723)"
This reverts commit f7aea4d081f77dba48b0fc019f59b678fb679aa8.

This broke the clang-solaris11-sparcv9 builder:
https://lab.llvm.org/buildbot/#/builders/13/builds/6151
2025-03-25 07:15:30 +01:00
Kazu Hirata
fac8fe9cf9
[Target] Use *Set::insert_range (NFC) (#132879)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
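
As a rough illustration of why the two forms are interchangeable, here is a
self-contained sketch. `SetSketch` is a hypothetical stand-in written for
this example, not one of the LLVM set classes; its `insert_range` is assumed
to do nothing more than insert each element of the range.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a set type that offers insert_range.
template <typename T> class SetSketch {
public:
  bool insert(const T &value) { return m_storage.insert(value).second; }

  // Inserts every element of the range, one at a time.
  template <typename Range> void insert_range(const Range &range) {
    for (const auto &elem : range)
      insert(elem);
  }

  bool contains(const T &value) const { return m_storage.count(value) != 0; }
  std::size_t size() const { return m_storage.size(); }

private:
  std::set<T> m_storage;
};

// The explicit loop form.
SetSketch<int> collectWithLoop(const std::vector<int> &range) {
  SetSketch<int> result;
  for (int elem : range)
    result.insert(elem);
  return result;
}

// The collapsed form.
SetSketch<int> collectWithInsertRange(const std::vector<int> &range) {
  SetSketch<int> result;
  result.insert_range(range);
  return result;
}
```

Both helpers produce a set with the same elements; the second just states
the intent in a single call.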
2025-03-24 22:42:04 -07:00
Kazu Hirata
75210df5a2
[AMDGPU] Avoid repeated map lookups (NFC) (#132877) 2025-03-24 22:41:27 -07:00
Paul Schwabauer
cca0f8113e
[PATCH] [clang][modules] Fix serialization and de-serialization of PCH module file refs (#105994) (#132802)
The File ID is incorrectly calculated, resulting in an out-of-bounds
access. The test code is more complex because the File fetching only
happens in specific scenarios.

---------

Co-authored-by: ShaderKeeper <no-reply@shaderkeeper.com>
Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
2025-03-25 13:24:21 +08:00
Jim Lin
26a52f828d [RISCV] RISCVInstrInfoSFB.td shouldn't be included in Vendor extensions section. NFC.
RISCVInstrInfoSFB.td is for Short Forward Branch, not a kind of Vendor extension.
2025-03-25 13:00:34 +08:00
Sameer Sahasrabuddhe
f6a3cd54bc
[clang] `noconvergent` does not affect calls to convergent functions (#132701)
When placed on a function, the ``clang::noconvergent`` attribute ensures
that the function is not assumed to be convergent. But the same
attribute has no effect on function calls. A call is convergent if the
callee is convergent. This is based on the fact that in LLVM, a call
always inherits all the attributes of the callee. Only ``convergent`` is
an attribute in LLVM IR, and there is no equivalent of
``clang::noconvergent``.
2025-03-25 10:44:08 +05:30
Fangrui Song
9ee950be95 MCValue: Simplify code with getSubSym
The MCValue::SymB MCSymbolRefExpr member might be replaced with a
MCSymbol in the future. Reduce direct access.
2025-03-24 21:52:40 -07:00
Lang Hames
473b059505 [ORC] Add ExecutorAddrRange::fromPtrRange convenience method.
This can be used to construct an ExecutorAddrRange from a pair of pointers, or
a pointer and a size.

This will be used to reduce boilerplate in an upcoming patch.
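
A minimal sketch of the idea, for readers unfamiliar with it. `AddrRangeSketch`
is a hypothetical type written for this example and is not the ORC class; the
overload shapes, and the choice to interpret the size in bytes, are
assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical address range: [start, end) expressed as integers.
struct AddrRangeSketch {
  std::uint64_t start;
  std::uint64_t end;

  AddrRangeSketch(std::uint64_t s, std::uint64_t e) : start(s), end(e) {}

  std::uint64_t size() const { return end - start; }

  // Build a range from a pair of pointers.
  template <typename T>
  static AddrRangeSketch fromPtrRange(T *first, T *last) {
    return AddrRangeSketch(reinterpret_cast<std::uintptr_t>(first),
                           reinterpret_cast<std::uintptr_t>(last));
  }

  // Build a range from a pointer and a size in bytes (assumption).
  template <typename T>
  static AddrRangeSketch fromPtrRange(T *first, std::size_t size_in_bytes) {
    const std::uint64_t s = reinterpret_cast<std::uintptr_t>(first);
    return AddrRangeSketch(s, s + size_in_bytes);
  }
};
```

The point of such a helper is to keep the pointer-to-integer casts in one
place instead of repeating them at every call site.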
2025-03-25 15:45:48 +11:00
Matt Arsenault
37b5f77f8b
llvm-reduce: Fix asserting on TargetExtTypes that do not support zeroinit (#132733)
So far I've been unsuccessful in finding an example where the used constant
value is directly observed in the output. This avoids an assert in an intermediate
step of value replacement.
2025-03-25 11:40:55 +07:00
Matt Arsenault
bfb549ff33
llvm-reduce: Fix operand reduction asserting on target ext types (#132732)
Not all TargetExtTypes support zeroinit, so use poison as a substitute
if unavailable.
2025-03-25 11:38:04 +07:00
Rahul Joshi
eeb4132b8d
[NFC] Fix macro redefinition warning in NewPMDriver.cpp (#132854) 2025-03-24 20:16:48 -07:00
tangaac
a6d366268d
[LoongArch] Pre-commit tests for vector shift (#132702) 2025-03-25 10:31:54 +08:00
Owen Pan
da7f1564a8
[clang-format] Don't wrap before attributes in parameter lists (#132519)
Fix #132240
2025-03-24 19:18:13 -07:00
Chuanqi Xu
e5f100676e
[clangd] [C++20] [Modules] Add modules suffix for 'Header' Source Switch (#131591)
Support the trivial "header"/source switch for module interfaces.

I initially thought the naming was bad and that we should rename it, but
later I felt it is better to split patches as much as possible.

From the code, it looks like there are other problems, e.g. `isHeaderFile`.
But let's try to fix them in separate patches.
2025-03-25 09:43:13 +08:00
Valentin Clement (バレンタイン クレメン)
5be9082fed
[flang][cuda] Carry over the dynamic shared memory size to gpu.launch_func (#132837) 2025-03-24 18:37:19 -07:00
Kazu Hirata
4c68061254 [Vectorize] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:855:52: error:
  unused variable 'SupportedOp' [-Werror,-Wunused-const-variable]
2025-03-24 17:38:47 -07:00
Paul Kirth
c1ed4f6423
[clang-doc] Format test files (#132428)
Many of the test files had an inconsistent formatting. This patch ran
clang-format over them using the project's .clang-format file, with
column limit = 0, to prevent test directives from being split over
multiple lines.
2025-03-24 17:27:16 -07:00
Han-Kuan Chen
71a0cfd932
[SLP] Make getSameOpcode support interchangeable instructions. (#127450)
We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., `add x, 0` is equivalent to
`mul x, 1`).
Non-constant values are not supported, as they may incur high costs with
little benefit.
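
To make the `add x, 0` / `mul x, 1` example concrete, here is a small
stand-alone sketch. It is not the SLP vectorizer code: the `OpcodeSketch`
set and the helper names are assumptions chosen for illustration, and only
the identity-constant case is shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcode set for this sketch.
enum class OpcodeSketch { Add, Sub, Mul, Shl };

// The constant operand that makes `op x, c` evaluate to x.
std::uint32_t identityConstant(OpcodeSketch op) {
  return op == OpcodeSketch::Mul ? 1u : 0u;
}

// Evaluate `op x, c` on unsigned 32-bit values.
std::uint32_t apply(OpcodeSketch op, std::uint32_t x, std::uint32_t c) {
  switch (op) {
  case OpcodeSketch::Add:
    return x + c;
  case OpcodeSketch::Sub:
    return x - c;
  case OpcodeSketch::Mul:
    return x * c;
  case OpcodeSketch::Shl:
    return x << (c & 31u); // mask keeps the shift amount in range
  }
  return x;
}
```

Each opcode applied with its identity constant returns `x` unchanged, which
is the sense in which such instructions can stand in for one another.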

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2025-03-25 08:24:46 +08:00
Paul Kirth
ece59a8cb9
Reland Support for mustache templating language (#132467)
The last version of this patch had memory leaks due to using the
BumpPtrAllocator for data types that required destructors to run to
release heap memory (e.g. via std::vector and std::string). This version
avoids that by using smart pointers, and dropping support for
BumpPtrAllocator.

We should refactor this code to use the BumpPtrAllocator again, but that
can be addressed in future patches, since those are more invasive
changes that need to refactor many of the core data types to avoid
owning allocations.
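
The underlying issue can be illustrated with a toy arena. The sketch below
is not LLVM's BumpPtrAllocator; `BumpArenaSketch` and `Tracked` are
hypothetical types that only show the general behavior of a bump allocator
that hands out storage and never runs destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Counts destructor runs so the difference is observable.
struct Tracked {
  explicit Tracked(int &counter) : m_counter(counter) {}
  ~Tracked() { ++m_counter; }
  int &m_counter;
};

// Hypothetical bump arena: placement-new into a buffer, no destructor calls.
class BumpArenaSketch {
public:
  template <typename T, typename... Args> T *create(Args &&...args) {
    // Round the current offset up to the alignment of T.
    std::size_t aligned = (m_used + alignof(T) - 1) / alignof(T) * alignof(T);
    assert(aligned + sizeof(T) <= sizeof(m_buffer));
    void *slot = m_buffer + aligned;
    m_used = aligned + sizeof(T);
    return new (slot) T(std::forward<Args>(args)...);
  }

private:
  alignas(std::max_align_t) unsigned char m_buffer[256];
  std::size_t m_used = 0;
};

// Object lives in the arena: its destructor never runs.
int destructorsRunByArena() {
  int count = 0;
  {
    BumpArenaSketch arena;
    arena.create<Tracked>(count);
  }
  return count;
}

// Object is owned by a smart pointer: its destructor runs at scope exit.
int destructorsRunBySmartPointer() {
  int count = 0;
  {
    std::unique_ptr<Tracked> owned(new Tracked(count));
  }
  return count;
}
```

If `Tracked` owned heap memory (as a std::vector or std::string member
would), the arena variant would never release it, which is the kind of leak
described above.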

Adds support for the Mustache templating language (spec:
https://mustache.github.io/mustache.5.html). This patch implements
support and tests for the majority of the language's features, including:

    - Variables
    - Comments
    - Lambdas
    - Sections

This is meant as a library to support places where we have to generate
HTML, such as in clang-doc.

Co-authored-by: Peter Chou <peter.chou@mail.utoronto.ca>
2025-03-24 17:23:25 -07:00
Joseph Huber
25bf4e262c
[Offload] Remove handling for COV4 binaries from offload/ (#131033)
Summary:
We moved from COV4 to COV5 a long time ago, and the remaining COV4
handling blocks the simplification of some front-end code, so we should
be able to drop it.
2025-03-24 18:58:20 -05:00
Shilei Tian
ff8aa300d6
[AMDGPU] Remove outdated COV6 warning (#132814) 2025-03-24 19:57:07 -04:00
Paul Kirth
0aa4c35beb
[libc][__support] Fix -Wimplicit-int-conversion warning (#132839)
Newer versions of clang now warn about these, so use explicit
conversion instead.
2025-03-24 16:47:07 -07:00
David Benjamin
e6de45a229
[tsan] Don't treat uncontended pthread_once as a potentially blocking region (#132477)
guard_acquire is a helper function used to implement TSan's
__cxa_guard_acquire and pthread_once interceptors.
https://reviews.llvm.org/D54664 introduced optional hooks to support
cooperative multi-threading. It worked by marking the entire
guard_acquire call as a potentially blocking region.

In principle, only the contended case needs to be a potentially blocking
region. This didn't matter for __cxa_guard_acquire because the compiler
emits an inline fast path before calling __cxa_guard_acquire. That is,
once we call __cxa_guard_acquire at all, we know we're in the contended
case.

https://reviews.llvm.org/D107359 then unified the __cxa_guard_acquire
and pthread_once interceptors, adding the hooks to pthread_once.
However, unlike __cxa_guard_acquire, pthread_once callers are not
expected to have an inline fast path. The fast path is inside the
function.

As a result, TSan unnecessarily calls into the cooperative
multi-threading engine on every pthread_once call, despite applications
generally expecting pthread_once to be fast after initialization. Fix
this by deferring the hooks to the contended case inside guard_acquire.
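
The shape of the fix can be sketched as below. This is an illustration
under stated assumptions, not the TSan implementation: the guard states, the
`*Sketch` function names, and the two counting hooks are hypothetical
stand-ins for the real interceptor code and the cooperative multi-threading
hooks.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical hooks; here they only count how often they are called.
std::atomic<int> g_pre_block_calls{0};
std::atomic<int> g_post_block_calls{0};
void preBlockHook() { ++g_pre_block_calls; }
void postBlockHook() { ++g_post_block_calls; }

constexpr int kNotStarted = 0;
constexpr int kRunning = 1;
constexpr int kDone = 2;

// Returns true if the caller must run the initializer and then release.
bool guardAcquireSketch(std::atomic<int> &guard) {
  bool blocking = false;
  bool must_init = false;
  for (;;) {
    int state = guard.load(std::memory_order_acquire);
    if (state == kDone)
      break; // fast path: already initialized, no hooks involved
    if (state == kNotStarted) {
      if (guard.compare_exchange_strong(state, kRunning,
                                        std::memory_order_acq_rel)) {
        must_init = true; // uncontended: this caller initializes
        break;
      }
      continue; // lost the race, re-read the state
    }
    // Contended case only: another thread is running the initializer.
    if (!blocking) {
      preBlockHook();
      blocking = true;
    }
    std::this_thread::yield();
  }
  if (blocking)
    postBlockHook();
  return must_init;
}

void guardReleaseSketch(std::atomic<int> &guard) {
  guard.store(kDone, std::memory_order_release);
}

template <typename Fn>
void callOnceSketch(std::atomic<int> &guard, Fn init) {
  if (guardAcquireSketch(guard)) {
    init();
    guardReleaseSketch(guard);
  }
}
```

In this sketch, repeated calls after initialization take the first branch
and never touch the hooks, which is the behavior callers of pthread_once
generally expect.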
2025-03-24 19:30:15 -04:00
Joseph Huber
ef2735d243
[Flang] Detect endianness in the preprocessor (#132767)
Summary:
Currently we use `TestBigEndian` in CMake to determine endianness. This
doesn't work on all platforms and is deprecated since CMake 3.20.
Instead of using CMake, we can just use the GNU/Clang preprocessor
definitions.

The only difficulty is MSVC, mostly because they don't support the same
macros. But, as far as I'm aware, MSVC / Windows targets are always
little endian, and if not we can just override it for that specific
target in the future.
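
A minimal sketch of preprocessor-based detection, assuming the predefined
GNU/Clang macros `__BYTE_ORDER__`, `__ORDER_LITTLE_ENDIAN__` and
`__ORDER_BIG_ENDIAN__`. The `SKETCH_HOST_IS_BIG_ENDIAN` macro and the helper
functions are hypothetical names for this example and are not the ones used
in Flang.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) &&                \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_HOST_IS_BIG_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) &&           \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define SKETCH_HOST_IS_BIG_ENDIAN 0
#elif defined(_MSC_VER)
// Assumption from the text above: MSVC / Windows targets are little endian.
#define SKETCH_HOST_IS_BIG_ENDIAN 0
#else
#error "Unable to determine endianness in this sketch"
#endif

// Compile-time answer derived from the preprocessor.
constexpr bool hostIsBigEndian() { return SKETCH_HOST_IS_BIG_ENDIAN != 0; }

// Run-time cross-check: inspect the first byte of the integer 1.
bool runtimeIsBigEndian() {
  const std::uint32_t probe = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 0;
}
```

The run-time check is only there to validate the macro-based answer; the
preprocessor result is what would be used to select code at build time.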
2025-03-24 18:29:05 -05:00
Krzysztof Parzyszek
c221d64206
[flang] Remove mentions of evaluate::Variable<T> (#132805)
The template itself was not defined anywhere. The closest thing was a
forward declaration in flang/include/flang/Evaluate/variable.h.
2025-03-24 18:26:57 -05:00
Thurston Dang
3ce3d889f6
[asan] Re-exec without ASLR if needed on 64-bit Linux (#132682)
This generalizes https://github.com/llvm/llvm-project/pull/131975 to non-32-bit Linux (i.e., 64-bit Linux).

This works around an edge case in 64-bit Linux, whereby the memory layout is incompatible if the stack size is unlimited AND ASLR entropy is 31+ bits (see https://github.com/google/sanitizers/issues/856#issuecomment-2747076811).

More generally, this "re-exec without ASLR if layout is incompatible" is a hammer that can work around most shadow mapping issues, without incurring the overhead of using a dynamic shadow.
2025-03-24 16:24:38 -07:00
joaosaffran
567b0f8923
[HLSL] Add support to branch/flatten attributes to switch (#131739)
closes: [#125754](https://github.com/llvm/llvm-project/issues/125754)

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
2025-03-24 16:17:19 -07:00
Jannick Kremer
20fc2d5aa5
[libclang/python] Add some bindings to the Cursor interface (#132377)
Make Cursor hashable
Add `Cursor.has_attr()`
Add `Cursor.specialized_template`

This covers the Cursor interface changes added by #120590

---------

Co-authored-by: Mathias Stearn <redbeard0531@gmail.com>
2025-03-25 00:08:32 +01:00
Sarah Spall
14d25613cb
[HLSL] Finish exposing half types and intrinsics always (#132804)
We previously made an implementation error when adding half overloads
for HLSL library functionality. The half type is always defined in HLSL
and half intrinsics should not be conditionally included.
When native 16-bit types are disabled, half is a unique 32-bit float type
with lesser promotion rank than float.

Apply pattern #81782 to intrinsics added in #95999.
Closes #132793
2025-03-24 15:34:58 -07:00
LLVM GN Syncbot
0adc672ed4 [gn build] Port 9a82f742b497 2025-03-24 21:56:36 +00:00
Helena Kotas
9a82f742b4
[HLSL][NFC] Refactor HLSLExternalSemaSource (#131032)
Moves the builder classes into separate files
`HLSLBuiltinTypeDeclBuilder.cpp`/`.h`, changes some
`HLSLExternalSemaSource` methods to private, and removes unused methods.

This is prep work before we start adding more builtin types and
methods, like textures, resource constructors or matrices. For example,
constructors could make use of the `BuiltinTypeMethodBuilder`, but this
helper class was defined in `HLSLExternalSemaSource.cpp` after the
method that creates a constructor. Rather than reshuffling the code in one
big source file, I am moving the builders into a separate cpp & header
file and placing the helper class declarations up top.

Currently the new header only exposes `BuiltinTypeDeclBuilder` to
`HLSLExternalSemaSource`. In the future we might decide to expose
more helper classes as needed.
2025-03-24 14:56:05 -07:00