514275 Commits

Author SHA1 Message Date
Matt Arsenault
a8e1311a1c
[RFC] IR: Define noalias.addrspace metadata (#102461)
This is intended to solve a problem with lowering atomics in
OpenMP and C++ common to AMDGPU and NVPTX.

In OpenCL and CUDA, it is undefined behavior for an atomic instruction
to modify an object in thread private memory. In OpenMP, it is defined.
Correspondingly, the hardware does not handle this correctly. For
AMDGPU,
32-bit atomics work and 64-bit atomics are silently dropped. We
therefore
need to codegen this by inserting a runtime address space check,
performing
the private case without atomics, and fallback to issuing the real
atomic
otherwise. This metadata allows us to avoid this extra check and branch.

Handle this by introducing metadata intended to be applied to atomicrmw,
indicating they cannot access the forbidden address space.
2024-10-07 23:21:42 +04:00
Sandeep Dasgupta
90a5744beb
Remove redundant checks related to quantized type (#110604)
[APFloat::getSmallest](915df1ae41/llvm/include/llvm/ADT/APFloat.h (L1060))
(and similarly `APFloat:getLargest`)
```
APFloat getSmallest(const fltSemantics &Sem, bool Negative = false); 
```
return the positive number when the default value for the second
argument is used.

With that being said, the check
[QuantTypes.cpp#L325](96f37ae453/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp (L325))

```c++
 if (scale <= 0.0 || std::isinf(scale) || std::isnan(scale))
    return emitError() << "illegal scale: " << scale;
```  

is already covered by the check which follows
[QuantTypes.cpp#L327](96f37ae453/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp (L327))

```c++
  if (scale < minScale || scale > maxScale)
    return emitError() << "scale out of expressed type range [" << minScale
                       << ", " << maxScale << "]";
```

given that range `[positive-smallest-finite-number,
positive-largest-finite-number]` does not include `inf` and `nan`s.

I propose to remove the redundant check. Any suggestion for improving
the error message is welcome.
2024-10-07 15:16:29 -04:00
Sterling-Augustine
3f5039323c
[SandboxVectorizer][NFC] Remove unused include (#111418) 2024-10-07 11:47:00 -07:00
hill
adbc37d999
[clang-tidy] Fix incorrect command-line option in docs (#111405)
Updated the `HeaderFilterRegex` description to reference
`--header-filter` instead of the incorrect `--header-filter-regex` in
the clang-tidy documentation.
2024-10-07 20:37:00 +02:00
Philip Reames
f11568bcb0 Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)"
This reverts commit 554eaec63908ed20c35c8cc85304a3d44a63c634.  Change was not approved when landed.
2024-10-07 11:31:57 -07:00
Vladislav Dzhidzhoev
32e90bbe57
[lldb][test] Support remote run of Shell tests (#95986)
1. This commit adds LLDB_TEST_PLATFORM_URL, LLDB_TEST_SYSROOT,
LLDB_TEST_PLATFORM_WORKING_DIR, LLDB_SHELL_TESTS_DISABLE_REMOTE cmake
flags to pass arguments for cross-compilation and remote running of both Shell&API tests.
2. To run Shell tests remotely, it adds 'platform select' and 'platform connect' commands to %lldb
substitution.
3. 'remote-linux' feature added to lit to disable tests failing with
remote execution.
4. A separate working directory is assigned to each test to avoid
conflicts during parallel test execution.
5. Remote Shell testing is run only when LLDB_TEST_SYSROOT is set for
building test sources. The recommended compiler for that is Clang.

---------

Co-authored-by: Vladimir Vereschaka <vvereschaka@accesssoftek.com>
2024-10-07 20:31:33 +02:00
Sterling-Augustine
93bfa7886b
[SandboxVectorizer] Define SeedBundle: a set of instructions to be vectorized [retry] (#111073)
[Retry 110696 with a proper rebase.]

Seed collection will assemble instructions to be vectorized into
SeedBundles. This data structure is not intended to be used directly,
but will be the basis for load bundles, store bundles, and so on.
2024-10-07 11:20:50 -07:00
Oleksandr T.
41b09c5346
[Clang] omit parentheses in fold expressions with a single expansion (#110761)
Fixes #101863
2024-10-07 20:14:46 +02:00
Benoit Jacob
d8a656ffaf
[MLIR] AMDGPUToROCDL: Use a bitcast op to reintepret a vector of i8 as single integer. (#111400)
Found by inspecting AMDGPU assembly - so the arithmetic ops created
there were definitely making their way into the target ISA. A
`LLVM::BitcastOp` seems equivalent, and evaporates as expected in the
target asm.

Along the way, I thought that this helper function `mfmaConcatIfNeeded`
could be renamed to `convertMFMAVectorOperand` to better convey its
contract; so I don't need to think about whether a bitcast is a
legitimate "concat" :-)

---------

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2024-10-07 14:14:18 -04:00
Paul Kirth
fabe7e39df
[llvm][gold] Fix syntax error (#111412)
This seems to have been overlooked in #109847, probably because most
bots don't build w/ gold enabled.
2024-10-07 10:56:52 -07:00
Alexandros Lamprineas
1297ff1765
[FMV][AArch64][NFC] Cleanup attribute metadata from test files. (#111386)
We have a dedicated test to check the target-features for FMV
(clang/test/CodeGen/aarch64-fmv-dependencies.c) therefore I am removing
the autogenerated checks from irrelevant tests since the noise is making
it harder to review actual codegen changes.
2024-10-07 18:54:53 +01:00
Justin Bogner
0b8fec6946
[DirectX] Fix broken test and accidental fallthrough in #110616 (#111410)
Fix an obvious typo in these tests to get them passing, and also fix the
-Wimplicit-fallthrough warning that fires when trying to build.

Reverting #110616 was tricky because of dependencies, so I'm just doing
the easy fix directly here.
2024-10-07 10:33:35 -07:00
Heejin Ahn
69577b2454
[WebAssembly] Support type checker for new EH (#111069)
This adds supports for the new EH instructions (`try_table` and
`throw_ref`) to the type checker.

One thing I'd like to improve on is the locations in the errors for
`catch_***` clauses. Currently they just point to the starting column of
`try_table` instruction itself. But to figure out where catch clauses
start you need to traverse `OperandVector` and check
`WebAssemblyOperand::isCatchList` on them to see which one is the catch
list operand, but `WebAssemblyOperand` class is in AsmParser and
AsmTypeCheck does not have access to it:

cdfdc857cb/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (L43-L204)
And even if AsmTypeCheck has access to it, currently it treats the list
of catch clauses as a single `WebAssemblyOperand` so there is no way to
get the starting location of each `catch_***` clause in the current
structure.

This also renames `valTypeToStackType` to `valTypesToStackTypes`, given
that it takes two type lists.
2024-10-07 10:29:26 -07:00
Nimish Mishra
71b2c4dbc7 [flang] Remove failing integration test 2024-10-07 22:58:19 +05:30
Rainer Orth
b3c1403dc0
[clang] Fix std::tm etc. mangling on Solaris (#106353)
Recently, Solaris bootstrap got broken because Solaris uses a
non-standard mangling of `std::tm` and a few others. This was fixed with
a hack in PR #100724. The Solaris ABI requires mangling `std::tm` as
`tm` and similarly for `std::div_t`, `std::ldiv_t`, and `std::lconv`,
which is what this patch implements. The hack needs to stay in place to
allow building with older versions of `clang`.

Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11` (2-stage
builds with both `clang-19` and `gcc-14` as build compiler), and
`x86_64-pc-linux-gnu`.
2024-10-07 19:05:23 +02:00
Prashant Kumar
971b579bc6
[MLIR] Don't drop attached discardable attributes (#111261)
The creation of pack op was dropping discardable attributes.
2024-10-07 22:21:30 +05:30
Jacob Lalonde
5d372ea6a1
[LLDB][DYLD] Remove logic around not rebasing when main executable has a load address (#110885)
This is a part of #109477 that I'm making into it's own patch. Here we
remove logic from the DYLD that prevents it's logic from running if the
main executable already has a load address. Instead we let the DYLD
fully determine what should be loaded and what shouldn't.
2024-10-07 09:45:56 -07:00
Da-Viper
d4c1789112
Make env and source map dictionaries #95137 (#106919)
Fixes #95137
2024-10-07 12:38:36 -04:00
Simon Pilgrim
f07e1c8619 [clang][x86] Update MMX intrinsic tests for both C/C++ and 32/64-bit targets 2024-10-07 17:36:18 +01:00
Simon Pilgrim
b795c28d8e [clang][x86] Update AVX512F intrinsic tests for both C/C++
Requires some better checking of labels and some call instructions to handle additional markers
2024-10-07 17:36:17 +01:00
Justin Bogner
b2c615fc79 Reapply "[SPIRV] Add radians intrinsic"
I had too many tabs open and reverted #110800 by mistake, it was
supposed to be #110616.

This reverts commit dec6fe3de0f6475ea83391e5b0b4036cf56db35b.
2024-10-07 09:24:34 -07:00
Justin Bogner
dec6fe3de0
Revert "[SPIRV] Add radians intrinsic" (#111398)
Reverts llvm/llvm-project#110800

`llvm\test\CodeGen\DirectX\radians.ll` is failing after this change.
@adam-yang please send a new PR with the issue resolved once you've had
time to investigate.
2024-10-07 09:23:38 -07:00
Rahman Lavaee
1f17c2d20d
[LLD] Deprecate --lto-basic-block-sections=labels (#110697)
This option is now replaced by `--lto-basic-block-address-map`.
2024-10-07 09:22:36 -07:00
Kazu Hirata
02b9c97b75
[memprof] Simplify code with MapVector::operator[] (NFC) (#111335)
Note that the following are all equivalent to each other:

  Map.insert({Key, Value()}).first->second
  Map.try_emplace(Key).first->second
  Map[Key]
2024-10-07 09:00:05 -07:00
Michael Buch
5e7cc37422 [lldb][test] TestDataFormatterLibcxxOptionalSimulator.py: don't use __builtin_printf
This caused Windows CI to fail with:
```
Build Command Output:

make: Entering directory 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/optional/TestDataFormatterLibcxxOptionalSimulator.test_r0'

"C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang++.exe"  -std=c++11 -gdwarf -O0   -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/../../../../..//include -IC:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb/include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\data-formatter\data-formatter-stl\libcxx-simulators\optional -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make -include C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/test_common.h -fno-limit-debug-info    -fno-exceptions -D_HAS_EXCEPTIONS=0 -fms-compatibility-version=19.0 -std=c++14  -DREVISION=0 -std=c++14 --driver-mode=g++ -MT main.o -MD -MP -MF main.d -c -o main.o C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\data-formatter\data-formatter-stl\libcxx-simulators\optional/main.cpp

C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe main.o -gdwarf -O0   -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/../../../../..//include -IC:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb/include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\data-formatter\data-formatter-stl\libcxx-simulators\optional -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make -include C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/test_common.h -fno-limit-debug-info     -fuse-ld=lld  --driver-mode=g++ -o "a.out"

lld-link: error: undefined symbol: printf

>>> referenced by main.o:(main)

clang: error: linker command failed with exit code 1 (use -v to see invocation)

make: *** [Makefile.rules:515: a.out] Error 1

make: Leaving directory 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/optional/TestDataFormatterLibcxxOptionalSimulator.test_r0'
```
2024-10-07 17:38:14 +02:00
Vladislav Dzhidzhoev
0e8a10b099
[lldb][test] Mark test() in TestBSDArchives.py as passing remotely (#111199)
It was xfail'ed in de2ddc8f3146b. However, it passes on a buildbot
https://lab.llvm.org/staging/#/builders/195/builds/3988.
2024-10-07 17:37:44 +02:00
RipleyTom
c5f7a32356
[X86] Add AMD Llano family detection (#111312)
Very simple one liner, adds the missing detection for the Llano family
which is essentially a refreshed K10:
Documentation of the family id:
https://en.wikichip.org/wiki/amd/cpuid#Family_18_.2812h.29
Documentation that it fits into amdfam10:
https://en.wikipedia.org/wiki/AMD_10h#12h
2024-10-07 08:33:26 -07:00
Nuri Amari
2edd897a42
Make WriteIndexesThinBackend multi threaded (#109847)
We've noticed that for large builds executing thin-link can take on the
order of 10s of minutes. We are only using a single thread to write the
sharded indices and import files for each input bitcode file. While we
need to ensure the index file produced lists modules in a deterministic
order, that doesn't prevent us from executing the rest of the work in
parallel.

In this change we use a thread pool to execute as much of the backend's
work as possible in parallel. In local testing on a machine with 80
cores, this change makes a thin-link for ~100,000 input files run in ~2
minutes. Without this change it takes upwards of 10 minutes.

---------

Co-authored-by: Nuri Amari <nuriamari@fb.com>
2024-10-07 08:16:46 -07:00
Alex Bradbury
2fe1f84db3 [test] Fix llc-start-stop.ll when the default target enables the loop terminator folding pass
Previously this would fail if the default target enabled the loop
terminator folding pass (currently just RISC-V), as it runs after loop
strength reduction.
2024-10-07 16:06:44 +01:00
Aaron Ballman
8565213f2f
[clang] Code owners -> Maintainers transition (#108997)
This is the initial transition from using "code owners" to using
"maintainers".
2024-10-07 11:00:54 -04:00
Simon Pilgrim
f15fe73d23 [clang][x86] Update AVX2 intrinsic tests for both C/C++
Requires some call instructions to handle additional markers
2024-10-07 15:57:22 +01:00
Simon Pilgrim
aa4d94839e [clang][x86] Update FMA/FMA4 intrinsic tests for both C/C++ and 32/64-bit targets
Requires some call instructions to handle additional markers
2024-10-07 15:57:22 +01:00
Simon Pilgrim
f71d62178d [clang][x86] Update F16C intrinsic tests for both C/C++ and 32/64-bit targets
Requires some call instructions to handle additional markers
2024-10-07 15:57:22 +01:00
Simon Pilgrim
aa656366ce [clang][x86] Update AVX1 intrinsic tests for both C/C++
Requires some call instructions to handle additional markers
2024-10-07 15:57:21 +01:00
Pavel Labath
0e2970f0ad
[lldb] Make libc++ simulator tests compatible with category-based ski… (#111353)
…pping, which works by setting a field on the function object. This
doesn't work on a functools.partial object. Use a real function instead.
2024-10-07 16:53:26 +02:00
David Spickett
0e8555d4db
[libclc] Remove mention of BSD license in readme (#111371)
This seems to be an artifact from the intial import in 2012, but even if
not, folks are better off reading the LICENSE.TXT file for the full
details if they need them.

Fixes #109968
2024-10-07 15:26:04 +01:00
Michael Maitland
989c437d7f [RISCV][GISEL][NFC] Add break statement to reduce diff on future changes of preISelLower 2024-10-07 07:23:52 -07:00
Juan Manuel Martinez Caamaño
d5ec01b0dd
Revert "[NFC][EarlyIfConverter] Replace boolean Predicate for a class (#108519)" (#111372)
This reverts commit 9e7315912656628b606e884e39cdeb261b476f16.
2024-10-07 16:03:11 +02:00
Matt Arsenault
5f94b0cbdd
AMDGPU: Try to reuse dest reg for s_add_i32 frame indexes (#111201)
Hack around the register scavenger doing the wrong thing.
It does not find the result register as available in the
case the frame index add isn't also reading the dest register.
This is the quick fix for a regression where the scavenge would
create a broken spill of SGPR to memory. I believe this is still
broken for cases we cannot use the result register.

I'm confused about what position the scavenger iterator
is supposed to be in, and what RestoreAfter is for. The scavenger
is missing a full set of forward/backward APIs and there seems
to be an off by one somewhere.
2024-10-07 18:01:24 +04:00
Kazu Hirata
5fdda41474
[Transforms] Avoid repeated hash lookups (NFC) (#111329) 2024-10-07 07:01:07 -07:00
Kazu Hirata
0614b3cfac
[Analysis] Simplify code with DenseMap::operator[] (NFC) (#111331) 2024-10-07 07:00:45 -07:00
Alexandros Lamprineas
40f0f7b4ec
[FMV][AArch64] Unify features ssbs and ssbs2. (#110297)
According to https://developer.arm.com/documentation/102105/latest Arm
Architecture Reference Manual for A-profile architecture: Known issues

2.206 D22789
In section C5.2.25 "SSBS, Speculative Store Bypass Safe", under the
heading 'Configurations', the text that reads:

"This register is present only when FEAT_SSBS is implemented. Otherwise,
direct accesses to SSBS are UNDEFINED."

is changed to read:

"This register is present only when FEAT_SSBS2 is implemented.
Otherwise, direct accesses to SSBS are UNDEFINED."

This suggests that it's not worth splitting FEAT_SSBS2 from FEAT_SSBS in
the compiler, since FEAT_SSBS cannot be used for predicating the MRS/MSR
instructions. Those can access PSTATE.SSBS only when FEAT_SSBS2 is
available. Moreover, there are no hardware implementations which
implement FEAT_SSBS without FEAT_SSBS2, therefore unifying these
features in the specification should not be a regression for feature
detection.

Approved in ACLE as https://github.com/ARM-software/acle/pull/350
2024-10-07 15:00:08 +01:00
Kazu Hirata
31e8c539e0
[Affine] Avoid repeated hash lookups (NFC) (#111330) 2024-10-07 06:59:23 -07:00
Kazu Hirata
4bc0916011
[Linalg] Avoid repeated hash lookups (NFC) (#111328) 2024-10-07 06:56:22 -07:00
Kazu Hirata
4c9c2d6082
[AST] Avoid repeated hash lookups (NFC) (#111327)
Here I'm splitting up the existing "if" statement into two.  Mixing
hasDefinition() and insert() in one "if" condition would be extremely
confusing as hasDefinition() doesn't change anything while insert()
does.
2024-10-07 06:55:56 -07:00
BARRET
1666d13078
[CMake]: Remove unnecessary dependencies on LLVM/MLIR (#111255)
Previous https://github.com/llvm/llvm-project/pull/110362 (reverted)
caused breakage. Here is the PR with fix.

My build cmdline:

```
cmake ../llvm \
    -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=install \
    -DCMAKE_C_COMPILER=gcc-9 \
    -DCMAKE_CXX_COMPILER=g++-9 \
    -DCMAKE_CUDA_COMPILER=$(which nvcc) \
    -DLLVM_ENABLE_LLD=OFF \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLLVM_BUILD_EXAMPLES=ON \
    -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
    -DLLVM_CCACHE_BUILD=ON \
    -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_ENABLE_PROJECTS='llvm;mlir'
```
2024-10-07 15:52:43 +02:00
Hans Wennborg
b2784ec328 Revert "[X86] For minsize memset/memcpy, use byte or double-word accesses (#87003)"
This caused assertion failures:

  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7736:
  SDValue getMemsetValue(SDValue, EVT, SelectionDAG &, const SDLoc &):
  Assertion `C->getAPIntValue().getBitWidth() == 8' failed.

See comment on the PR for a reproducer.

> repstosb and repstosd are the same size, but stosd is only done for 0
> because the process of multiplying the constant so that it is copied
> across the bytes of the 32-bit number adds extra instructions that cause
> the size to increase. For 0, repstosb and repstosd are the same size,
> but stosd is only done for 0 because the process of multiplying the
> constant so that it is copied across the bytes of the 32-bit number adds
> extra instructions that cause the size to increase. For 0, we do not
> need to do that at all.
>
> For memcpy, the same goes, and as a result the minsize check was moved
> ahead because a jmp to memcpy encoded takes more bytes than repmovsb.

This reverts commit 6de5305b3d7a4a19a29b35d481a8090e2a6d3a7e.
2024-10-07 15:18:29 +02:00
Simon Pilgrim
73c9ad2633
[clang][x86] Add constexpr support for some basic SSE1 intrinsics (#111001)
This is an initial patch to enable constexpr support on the more basic SSE1 intrinsics - such as initialization, arithmetic, logic and fixed shuffles.

The plan is to incrementally extend this for SSE2/AVX etc. - initially for the equivalent basic intrinsics, but we can add support for some of the ia32 builtins as well we the need arises.
2024-10-07 14:11:29 +01:00
Simon Pilgrim
5dc7a5e50b [clang][x86] popcntintrin.h - merge the __DEFAULT_FN_ATTRS / __DEFAULT_FN_ATTRS_CONSTEXPR defines. NFC.
We only need one define - so consistently use __DEFAULT_FN_ATTRS like we do in other headers.
2024-10-07 14:10:39 +01:00
Guillaume Chatelet
dda107b8cb
Revert "[libc][bazel] Enable software prefetching for memcpy" (#111370)
Reverts llvm/llvm-project#108939

When `AVX` is available but `-mprefer-vector-width=128` some of the
`mov` instructions turn into the x86 `rep;movsb` instruction leading to
poor performance on "old" architectures (sandybridge, haswell). The
possible solutions are : get rid of the `-mprefer-vector-width` option
or use smaller static copy sizes in
`inline_memcpy_x86_sse2_ge64_sw_prefetching`. Right now a copy size of 3
cache lines (192B) relying exclusively on xmm registers gets turned into
`rep;movsb`.
2024-10-07 15:00:31 +02:00