524430 Commits

Author SHA1 Message Date
Michał Górny
af6616676f
[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123477)
Use `mlir_target_link_libraries()` to link dependencies of libraries
that are not included in libMLIR, to ensure that they link to the dylib
when they are used in Flang. Otherwise, they implicitly pull in all
their static dependencies, effectively causing Flang binaries to
simultaneously link to the dylib and to static libraries, which is never
a good idea.

I have only covered the libraries that are used by Flang. If you wish, I
can extend this approach to all non-libMLIR libraries in MLIR, making
MLIR itself also link to the dylib consistently.
2025-01-20 17:25:20 +00:00
Michal Paszkowski
5810f157cd
[SPIR-V] Fix SPIRVEmitIntrinsics undefined behavior (#123625)
Before this change InstrSet in SPIRVEmitIntrinsics was uninitialized
before running runOnFunction. This change adds a new function
getPreferredInstructionSet in SPIRVSubtarget.
2025-01-20 09:23:07 -08:00
Simon Pilgrim
7084110518 X86ISelLowering.cpp - remove unused variable missed in #123617 2025-01-20 17:21:40 +00:00
Simon Pilgrim
8ff195cda1 SIISelLowering.cpp - remove unused variable missed in #123617 2025-01-20 17:19:42 +00:00
Kshitij Paranjape
19bd2d6102
[ConstantFolding] Add ilogb in isMathLibCallNoop (#122582)
ilogb libcall was not being constant folded correctly. This patch adds 
ilogb case in isMathLibCallNoop with correct error condition.

Fixes #101873
2025-01-20 12:19:07 -05:00
Alex MacLean
3606876b67
[SDAG] Fix CSE for ADDRSPACECAST nodes (#122912)
Correct CSE in SelectionDAG can make DAG combining more effective and
reduces the size of the DAG and thus should improve compile time.
2025-01-20 09:09:22 -08:00
Nikolas Klauser
0fa05456a8
[libc++] Define an internal API for std::invoke and friends (#116637)
Currently we're using quite different internal names for the
`std::invoke` family of type traits. This adds a layer around the
current implementation to make it easier to understand when it is used
and makes it easier to define multiple implementations of it.
2025-01-20 18:00:15 +01:00
Nikolas Klauser
c248fc1880
[Clang] Document some of the implementation-defined keywords (#84591) 2025-01-20 17:58:16 +01:00
Stephan Hageboeck
7abf44069a
Add missing include to X86MCTargetDesc.h (#123320)
In gcc-15, explicit includes of `<cstdint>` are required when fixed-size
integers are used. In this file, this include only happened as a side
effect of including SmallVector.h

Although llvm compiles fine, the root-project would benefit from
explicitly including it here, so we can backport the patch.

Maybe interesting for @hahnjo and @vgvassilev
2025-01-20 18:52:47 +02:00
Mats Larsen
b95ed30ea2
[IR] Remove unused variables from #123617
Failed to notice them when landing that patch - apologies!
2025-01-21 01:47:38 +09:00
Timm Baeder
e8674af6f4
[clang][bytecode] Diagnose IntegralToPointer casts to non-void (#123619)
But keep evaluating. This is what the current interpreter does as well.
2025-01-20 17:41:18 +01:00
Jay Foad
f33e3d422d
[AMDGPU] Fix DAG types for V_MAD_I64_I32 and V_MAD_U64_U32. NFC. (#123629)
These instructions return a 64-bit result and a 1-bit carry, unlike
smul_lohi and umul_lohi which return a pair of 32-bit results.

This does not appear to make any difference in practice because the DAG
types are not used for anything before these nodes are converted to
MachineInstrs.
2025-01-20 16:29:23 +00:00
Fraser Cormack
c8eb865747
[libclc] Move mad to the CLC library (#123607)
All targets build `__clc_mad` -- even SPIR-V targets -- since it
compiles to the optimal `llvm.fmuladd` intrinsic. There is no change to
the bytecode generated for non-SPIR-V targets.

The `mix` builtin, which is implemented as a wrapper around `mad`, is
left as an OpenCL-layer wrapper of `__clc_mad`. I don't know if it's
worth having a specific CLC version of `mix`.

The changes to the other CLC files/functions are moving uses of `mad` to
`__clc_mad`, and reformatting. There is an additional instance of
`trunc` becoming `__clc_trunc`, which was missed before.
2025-01-20 16:27:51 +00:00
Victor Campos
8368018f20
Fix test of -print-multi-flags-experimental in case of multilib custom flags (#123577)
The test was failing in the case where a `multilib.yaml` file was
present in the installation. This is because the presence of a multilib
YAML file leads to the diagnosing of validity of the multilib custom
flags.

This patch fixes the test by creating a new YAML file with multilib
custom flags to be used by the test.
2025-01-20 16:15:44 +00:00
Mats Petersson
9da7c3ba17
[Flang][OpenMP][NFC] Add tests for align and allocator in allocate clauses (#121356)
No functional change.

(Also, tried to filter out all ALLOCATOR modifiers, but that makes some
other tests fail).
2025-01-20 16:04:23 +00:00
Mats Jun Larsen
416f1c465d
[IR] Replace of PointerType::get(Type) with opaque version (NFC) (#123617)
In accordance with https://github.com/llvm/llvm-project/issues/123569

In order to keep the patch at reasonable size, this PR only covers for
the llvm subproject, unittests excluded.
2025-01-21 00:32:56 +09:00
Ruhung
9c7e02d579
[InstCombine] Fold umax(nuw_mul(x, C0), x + 1) into (x == 0 ? 1 : nuw_mul(x, C0)) (#123468)
This PR introduces the following transformations:

- If C0 is not 0:  
umax(nuw_shl(x, C0), x + 1) -> x == 0 ? 1 : nuw_shl(x, C0)  
- If C0 is not 0 or 1:  
umax(nuw_mul(x, C0), x + 1) -> x == 0 ? 1 : nuw_mul(x, C0)  

Fixes #122388.
Alive2 proof: https://alive2.llvm.org/ce/z/rkp_8U
2025-01-20 16:32:35 +01:00
David Green
8552c49046
[AArch64] Enable UseFixedOverScalableIfEqualCost for more Cortex-x cpus. (#122807)
For similar reasons for fixed-width being prefered to scalable for
Neoverse V2, this patch enables the UseFixedOverScalableIfEqualCost
feature when using -mcpu=cortex-x2, x3, x4 and x925 that are similar to
Neoverse V2.
2025-01-20 15:05:15 +00:00
Renat Idrisov
aa3c31a86f
[MLIR] Prevent invalid IR from being passed outside of RemoveDeadValues (#121079)
This is a follow-up for https://github.com/llvm/llvm-project/pull/119110
and a fix for https://github.com/llvm/llvm-project/issues/118450

RemoveDeadValues used to delete Values and analyzing the IR at the same
time, because of that, `isMemoryEffectFree` got invalid IR with
half-deleted linalg.generic operation. This PR separates analysis and
cleanup to prevent such situation.

Thank you!

---------

Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
2025-01-20 14:48:32 +00:00
Fabian Ritter
cc5eba1737
[AMDGPU] Reject misaligned SGPR constraints for inline asm (#123590)
The indices of SGPR register pairs need to be 2-aligned and SGPR
quadruplets need to be 4-aligned. With this patch, we report an error
when inline asm register constraints specify a misaligned register
index, instead of silently dropping the specified index.

Fixes #123208

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-01-20 15:47:11 +01:00
David Sherwood
fcec8756e2
[LoopVectorize][NFC] Simplify ScaledReductionExitInstrs map (#123368)
For the following variable

DenseMap<const Instruction *, std::pair<PartialReductionChain,
unsigned>>
  ScaledReductionExitInstrs;

we never actually need the PartialReductionChain when using the map.
I've cleaned this up so that this now becomes

  DenseMap<const Instruction *, unsigned> ScaledReductionMap;
2025-01-20 14:41:27 +00:00
Joseph Huber
723a3e746a [OpenMP] Fix mispelled attribute and warning
Summary:
This is spelled `ompx_aligned_barrier` when used directly, but wasn't
included in the list of known assumptions. Fix that so now th test
works.
2025-01-20 08:40:19 -06:00
Andrzej Warzyński
3b001db4f9
[mlir][vector] Update tests for xfer permutation lowering (1/N) (#123076)
1. Remove `%c0 = arith.constant 0 : index` from testt functions. This
   extra Op is not needed (the index can be passed as an argument), so
   this is just noise.
2. Replaced `%cst_0` with `%pad` to communicate what the underlying SSA
   value is intended for.
3. Unified some comments.
2025-01-20 14:36:07 +00:00
zhijian lin
b92cc78060
[llvm-objdump] Print out xcoff load section of xcoff object file with option private-headers (#121226)
[llvm-objdump] Print out xcoff load section of xcoff object file with
option private-headers
2025-01-20 09:35:01 -05:00
Fraser Cormack
9cf24652e7
[AMDGPU] Fix spurious NoAlias results (#122309)
After a30e50fc, AMDGPUAAResult is being called in more situations where
BasicAA isn't sure. This exposed some regressions where NoAlias is being
incorrectly returned for two identical pointers.

The fix is to check the underlying objects for equality before returning
NoAlias.
2025-01-20 14:19:30 +00:00
Joseph Huber
58af82b462
[OpenMP] Remove 'omp assumes' scopes now that we have no inline ASM (#123611)
Summary:
We used this globally scoped `ext_no_call_asm` as a sort of hack around
the compiler that allowed the attributor to optimize out inline assembly
calls to PTX instructions. Quite some time ago I got rid of every inline
assembly call and replaced it with a builitin, so this can just be
deleted.

Furthermore, I use the `[[omp::assume]]` attribute directly for the
aligned barrier usage. This prints an unknown assumption warning (even
though it isn't) so I'm just silencing that for now until I fix it
later.

---------

Co-authored-by: Michael Kruse <github@meinersbur.de>
2025-01-20 08:11:06 -06:00
Timm Baeder
b5c9cba3f3
[clang][bytecode] Don't memcpy() FixedPoint values (#123599)
llvm::FixedPoint is not trivially copyable.
2025-01-20 15:10:12 +01:00
David Sherwood
a733c1fa90
[AArch64][NFC] Move getPartialReductionCost into cpp file (#123370)
The function getPartialReductionCost is already quite large and
is likely to grow in size as we add support for more cases in
future. Therefore, I think it's best to move this into the cpp
file.
2025-01-20 14:07:03 +00:00
Dominic Chen
9b853f63be
[libc++] Fix vector sanitization annotations on destruction (#121031)
In https://reviews.llvm.org/D136765 / https://reviews.llvm.org/D144155,
the asan annotations for `std::vector` were modified to unpoison freed
backing memory on destruction, instead of leaving it poisoned. However,
calling `__clear()` instead of `clear()` skips informing the asan runtime
of this decrease in the accessible container size, which breaks the
invariant that the value of `old_mid` should match the value of `new_mid`
from the previous call to `__sanitizer_annotate_contiguous_container`, which
can trip the sanity checks for the partial poison between [d1, d2) and the
container redzone between [d2, c), if enabled. To fix this, ensure that
`clear()` is called instead, as is already done by `__vdeallocate()`.
Also remove `__clear()`, since it is no longer called.
2025-01-20 08:57:52 -05:00
Kirill Chibisov
977d744b21
[mlir][emitc] Set default dialect to emitc in ops with block (#123036)
This is a follow up to 68a3908148c (func: Set default dialect to
'emitc'), but for other instructions with blocks to make it look
consistent.
2025-01-20 14:48:28 +01:00
bernhardu
57466db7a4
[win/asan] GetInstructionSize: Support some more 3 byte instructions. (#120474)
This patch adds several instructions seen when trying to run a
executable built with ASan with llvm-mingw.
(x86 and x86_64, using the git tip in llvm-project).

Also includes instructions collected by
Roman Pišl and Eric Pouech in the Wine bug reports below.

```
Related: https://github.com/llvm/llvm-project/issues/96270

Co-authored-by: Roman Pišl <rpisl@seznam.cz>
                https://bugs.winehq.org/show_bug.cgi?id=50993
                https://bugs.winehq.org/attachment.cgi?id=70233
Co-authored-by: Eric Pouech <eric.pouech@gmail.com>
                https://bugs.winehq.org/show_bug.cgi?id=52386
                https://bugs.winehq.org/attachment.cgi?id=71626
```
2025-01-20 14:25:52 +01:00
Sjoerd Meijer
456ec1c2f4
[LoopInterchange] Remove 'S' Scalar Dependencies (#119345)
We are not handling 'S' scalar dependencies correctly and have at least
the following miscompiles related to that:

[LoopInterchange] incorrect handling of scalar dependencies and dependence vectors starting with ">" #54176
[LoopInterchange] Interchange breaks program correctness #46867
[LoopInterchange] Loops should not interchanged due to dependencies #47259
[LoopInterchange] Loops should not interchanged due to control flow #47401

This patch does no longer insert the "S" dependency/direction into the
dependency matrix, so a dependency is never "S". We seem to have
forgotten what the exact meaning is of this dependency type, and don't
see why it should be treated differently.

We prefer correctness over incorrect and more aggressive results. I.e.,
this prevents the miscompiles at the expense of handling less cases,
i.e. making interchange more pessimistic. However, some of the cases
that are now rejected for dependence analysis reasons, were rejected
before too but for other reasons (e.g. profitability). So at least for
the llvm regression tests, the number of regression are very reasonable.
This should be a stopgap. We would like to get interchange enabled by
default and thus prefer correctness over unsafe transforms, and later
see if we can get solve the regressions.
2025-01-20 13:04:58 +00:00
Alexey Bataev
1c5b12257d [NVPTX][DEBUGINFO][NFC]Reduce test file to ease maintenance 2025-01-20 05:03:35 -08:00
Graham Hunter
d9f165ddea
[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810)
Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder.

The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.
2025-01-20 12:57:05 +00:00
Matthias Gehre
5ce271ef74
[MLIR] TosaToLinalgNamed: Lower unsigned tosa.max_pool2d (#123290)
This PR allows to lower **unsigned** `tosa.max_pool2d` to linalg.
```
// CHECK-LABEL: @max_pool_ui8
func.func @max_pool_ui8(%arg0: tensor<1x6x34x62xui8>) -> tensor<1x4x32x62xui8> {
  // CHECK: builtin.unrealized_conversion_cast {{.*}} : tensor<1x6x34x62xui8> to tensor<1x6x34x62xi8>
  // CHECK: arith.constant 0
  // CHECK: linalg.pooling_nhwc_max_unsigned {{.*}} : (tensor<1x4x32x62xi8>) -> tensor<1x4x32x62xi8>
  // CHECK: builtin.unrealized_conversion_cast {{.*}} : tensor<1x4x32x62xi8> to tensor<1x4x32x62xui8>
  %0 = tosa.max_pool2d %arg0 {pad = array<i64: 0, 0, 0, 0>, kernel = array<i64: 3, 3>, stride = array<i64: 1, 1>} : (tensor<1x6x34x62xui8>) -> tensor<1x4x32x62xui8>
  return %0 : tensor<1x4x32x62xui8>
}
```
It does this by
- converting the MaxPool2dConverter from OpRewriterPattern to
OpConversion Pattern
- adjusting the padding value to the the minimum unsigned value when the
max_pool is unsigned
- lowering to `linalg.pooling_nhwc_max_unsigned` (which uses
`arith.maxui`) when the max_pool is unsigned
2025-01-20 13:42:18 +01:00
Timm Baeder
d70f54f248
[clang][bytecode] Fix reporting failed local constexpr initializers (#123588)
We need to emit the 'initializer of X is not a constant expression' note
for local constexpr variables as well.
2025-01-20 13:25:50 +01:00
Abid Qadeer
0ec153b9fd
[flang][debug] Remove an unused function to fix build. (#123602) 2025-01-20 12:18:27 +00:00
Abid Qadeer
af91372b75
[flang][debug] Improve handling of cyclic derived types. (#122770)
When `RecordType` is converted to corresponding `DIType`, we cache the
information to avoid doing the conversion again.

Our conversion of `RecordType` looks like this:

`ConvertRecordType(RecordType Ty)`
1. If type `Ty` is already in the cache, then return the corresponding
item.
2. Create a place holder `DICompositeTypeAttr` (called `ty_self` below)
for `Ty`
3. Put `Ty->ty_self` in the cache
4. Convert members of `Ty`. This may cause `ConvertRecordType` to be
called again with other types.
5. Create final `DICompositeTypeAttr`
6. Replace the `ty_self` in the cache with one created in step 5 end


The purpose of creating `ty_self` is to handle cases where a member may
have reference to parent type.

Now consider the code below:

```
type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type
```

While processing t1, we could have a structure like below. `t1 -> t2 ->
t1_self`

The `t2` created during handling of `t1` cant be cached on its own as it
contains a place holder reference. It will fail an assert in MLIR if it
is processed standalone. To avoid this problem, we have a check in the
step 6 above to not cache such types. But this check was not tight
enough. It just checked if a type should not have a place holder
reference to another type. It missed the following case where the place
holder reference can be in a type further down the line.

```
type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type
```

So while processing `t1`, we have to stop caching of not only `t3` but
also of `t2`. This PR improves the check and moves the logic inside
`convertRecordType`.

Please note that this limitation of why a type cant have a placeholder
reference is because of how such references are resolved in the mlir.
Please see the discussion at the end of this
[PR](https://github.com/llvm/llvm-project/pull/106571).

I have to change `getDerivedType` so that it will also get the derived
type for things like `type(t2), pointer :: p1` which are wrapped in
`BoxType`. Happy to move it to a new function or a local helper in case
this change is problematic.

Fixes #122024.
2025-01-20 12:03:59 +00:00
Nikita Popov
a79ae862ab [Clang] Regenerate test checks (NFC)
To reduce diffs in an upcoming change.
2025-01-20 12:46:37 +01:00
David Green
27a2d3d088 [AArch64] Build v2i64 Mul cost out of getArithmeticInstrCost and getVectorInstrCost. NFCI
This should not effect the result, unless the getArithmeticInstrCost and
getVectorInstrCost routines learn to produce different costs (with CostKind =
CodeSize for example). The -1 lanes prevent 0 lanes from (incorrectly) being
marked as free.
2025-01-20 11:43:57 +00:00
Nikita Popov
a4d9a8de08 [Clang] Don't match irrelevant attributes in mips return tests (NFC)
The only thing these tests care about from an ABI perspective is sret,
don't also test all the optimization attributes.
2025-01-20 12:41:02 +01:00
Nikita Popov
bd96295e09 [Clang] Use more liberal pointer attribute wildcard in ms-intrinsics tests (NFC)
Allow arbitrary attributes, including those with arguments.
2025-01-20 12:38:38 +01:00
Nikita Popov
2d6d476ffb
[Polly][CMake] Fix exports (#122123)
If Polly is built with LLVM_POLLY_LINK_INTO_TOOLS=ON (the default for
monorepo builds), then Polly will become a dependency of the
LLVMExtensions component, which is part of LLVMExports. As such, all the
Polly libraries also have to be part of LLVMExports.

However, if Polly is built with LLVM_POLLY_LINK_INTO_TOOLS=OFF, we also
end up adding Polly libraries to LLVMExports. This is undesirable, as it
adds a hard dependency from llvm on polly.

Fix this by only exporting polly libraries from LLVMExports if
LLVM_POLLY_LINK_INTO_TOOLS is enabled.
2025-01-20 12:33:29 +01:00
Kiran Chandramohan
4d21096c20
[Flang] Modify module test to run in a sub-directory (#123364)
This is to avoid race conditions with other tests.
2025-01-20 11:27:55 +00:00
Akshat Oke
3ace18d5c0
[CodeGen] MachineFunctionSplitter: Add missing initializer (#123564)
This registers the pass with PassRegistry so we can use -start-before
and other options for machine-function-splitter.
2025-01-20 16:56:46 +05:30
Fraser Cormack
8b7bfb417a [libclc] Rename include guards. NFC. 2025-01-20 11:26:02 +00:00
Vyacheslav Levytskyy
fe7cb15606
[SPIR-V] Improve portability of the code (#123584)
Adding SPIRV to LLVM_ALL_TARGETS
(https://github.com/llvm/llvm-project/pull/119653) revealed a series of
minor compilation problems and sanitizer complaints. This PR is to
address the problem.
2025-01-20 12:05:15 +01:00
Akshat Oke
96c4f978d0
[AMDGPU][NewPM] Port SIOptimizeExecMasking to NPM (#123572) 2025-01-20 16:34:01 +05:30
Benjamin Kramer
0f8297ae0b [bazel] Fix dependencies for 69d3ba3db922fca8cfc47b5f115b6bea6a737aab 2025-01-20 11:46:44 +01:00
Jacek Caban
a16adafd47
[LLD][COFF] Add support for alternate entry point in CHPE metadata on ARM64X (#123346)
Includes handling for ARM64X relocations relative to a symbol.
2025-01-20 11:38:54 +01:00