100 Commits

Author SHA1 Message Date
Amara Emerson
a4c7c66098
[GlobalISel] Document minimum legality requirements for G_IMPLICIT_DEF. (#117609)
The reason for this change is to clarify an existing technical
restriction of LLVM: there needs to be a way to implicitly define a type
if there is any way to legally define that type by another means.
2024-12-09 22:10:13 -08:00
Thorsten Schütt
fc2cc018ec
[GlobalISel] list undocumented opcodes in docs (#119089) 2024-12-08 16:35:33 +01:00
Thorsten Schütt
148fdc519c
[GlobalISel] Add G_ABDS and G_ABDU instructions (#118122)
The DAG has the same instructions: the signed and unsigned absolute
difference of its inputs. For AArch64, they map to uabd and sabd for
Neon and SVE. The Neon and SVE instructions will require custom
patterns.

They are pseudo opcodes and are not imported by the IRTranslator. We
need combines to create them.

PowerPC, ARM, and AArch64 have native instructions.

/// i.e. trunc(abs(sext(Op0) - sext(Op1))) becomes abds(Op0, Op1)
///  or trunc(abs(zext(Op0) - zext(Op1))) becomes abdu(Op0, Op1)
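
As a rough illustration of the two folds above, here is a scalar C++ sketch on 8-bit values. The function names are hypothetical and not part of the patch; the sketch only mirrors the widen, subtract, abs, truncate sequence from the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Signed absolute difference: trunc(abs(sext(a) - sext(b))).
std::uint8_t abds8(std::int8_t a, std::int8_t b) {
    int wide = static_cast<int>(a) - static_cast<int>(b);  // sign-extend, subtract
    return static_cast<std::uint8_t>(std::abs(wide));      // abs, truncate to 8 bits
}

// Unsigned absolute difference: trunc(abs(zext(a) - zext(b))).
std::uint8_t abdu8(std::uint8_t a, std::uint8_t b) {
    int wide = static_cast<int>(a) - static_cast<int>(b);  // zero-extend, subtract
    return static_cast<std::uint8_t>(std::abs(wide));      // abs, truncate to 8 bits
}

// Illustrative self-check (hypothetical helper).
void abdSelfCheck() {
    assert(abds8(-128, 127) == 255);  // the largest signed difference still fits in 8 bits
    assert(abdu8(3, 5) == 2);         // operand order does not matter
}
```

The result is returned as a raw 8-bit pattern, since the signed difference of two i8 values can need the full unsigned range.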

For GlobalISel, we are going to write the combines in MIR patterns.

see:
llvm/test/CodeGen/AArch64/abd-combine.ll

- [ ] combine into abd
- [ ] legalize and add td patterns
2024-12-04 12:53:15 +01:00
Thorsten Schütt
a5d09f4ad9
[GlobalISel] Add G_STEP_VECTOR instruction (#115598)
aka llvm.stepvector Intrinsic
2024-11-11 10:45:02 +01:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```
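
For readers more used to C++ than IR, the `f32` form can be pictured roughly as below. This is only a sketch of the shape of the result; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Mirrors the { float, float } result: sine first, cosine second.
struct SinCosF32 {
    float sine;
    float cosine;
};

// Computes both values for one input, as the intrinsic does.
SinCosF32 sincosF32(float val) {
    return {std::sin(val), std::cos(val)};
}

// Illustrative self-check (hypothetical helper).
void sincosSelfCheck() {
    SinCosF32 r = sincosF32(0.0f);
    assert(r.sine == 0.0f);
    assert(r.cosine == 1.0f);
}
```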

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
Tex Riddell
139688a699
[SPIRV] Add atan2 function lowering (p2) (#110037)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Add generic opcode for atan2
- Add SPIRV lowering for atan2

Part 2 for Implement the atan2 HLSL Function #70096.
2024-09-26 15:00:59 -07:00
Michael Maitland
ee2add0683
[GISEL] Fix bugs and clarify spec of G_EXTRACT_SUBVECTOR (#108848)
The implementation was missing the fact that `G_EXTRACT_SUBVECTOR`
destination and source vector can be different types.

Also fix a bug in the MIR builder for `G_EXTRACT_SUBVECTOR` to generate
the correct opcode.

Clarify the G_EXTRACT_SUBVECTOR specification.
2024-09-17 10:08:39 -04:00
David Green
feac761f37
[GlobalISel][AArch64] Add G_FPTOSI_SAT/G_FPTOUI_SAT (#96297)
This is an implementation of the saturating fp to int conversions for
GlobalISel. On AArch64 the conversion instructions work this way,
producing saturating results. LegalizerHelper::lowerFPTOINT_SAT is
ported from SDAG.
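
A scalar C++ sketch of what "saturating" means here, assuming the same semantics as the `llvm.fptosi.sat` intrinsic (NaN becomes 0, out-of-range values clamp to the integer limits). The function name is hypothetical and this is not the code in `lowerFPTOINT_SAT`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Saturating double -> i32 conversion.
std::int32_t fptosiSatI32(double x) {
    constexpr std::int32_t lo = std::numeric_limits<std::int32_t>::min();
    constexpr std::int32_t hi = std::numeric_limits<std::int32_t>::max();
    if (std::isnan(x)) return 0;                       // NaN maps to zero
    if (x <= static_cast<double>(lo)) return lo;       // clamp below
    if (x >= static_cast<double>(hi)) return hi;       // clamp above
    return static_cast<std::int32_t>(x);               // in range: truncate toward zero
}

// Illustrative self-check (hypothetical helper).
void fptosiSatSelfCheck() {
    assert(fptosiSatI32(1e20) == std::numeric_limits<std::int32_t>::max());
    assert(fptosiSatI32(-3.9) == -3);
}
```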

AArch64 has a lot of existing tests for fptosi_sat, covering a wide
range of types. I have tried to make most of them work all at once, but
a few fall back due to other missing features such as f128 handling for
min/max.
2024-09-16 10:33:59 +01:00
Thorsten Schütt
bece0d7517
[GlobalIsel] Update MIR gallery (#107903)
add more patterns
clarify wip_match_opcode usage
2024-09-10 09:04:54 +02:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform a conditional subtraction: if the difference would be
negative, the result is the minuend (usub_cond) or zero (usub_sat).
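
A sketch of the two operations in C++, written as compare-exchange loops over `std::atomic`. The function names are hypothetical; like `atomicrmw`, each returns the value that was in memory before the update.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// usub_cond: subtract only if it does not underflow, else keep the old value.
std::uint32_t atomicUSubCond(std::atomic<std::uint32_t>& obj, std::uint32_t val) {
    std::uint32_t old = obj.load();
    // On failure, compare_exchange_weak reloads `old` and the loop retries.
    while (!obj.compare_exchange_weak(old, old >= val ? old - val : old)) {
    }
    return old;
}

// usub_sat: subtract, clamping the stored result at zero.
std::uint32_t atomicUSubSat(std::atomic<std::uint32_t>& obj, std::uint32_t val) {
    std::uint32_t old = obj.load();
    while (!obj.compare_exchange_weak(old, old >= val ? old - val : 0u)) {
    }
    return old;
}

// Illustrative self-check (hypothetical helper).
void usubSelfCheck() {
    std::atomic<std::uint32_t> x{5};
    assert(atomicUSubCond(x, 7) == 5 && x.load() == 5);  // would underflow: unchanged
    assert(atomicUSubSat(x, 7) == 5 && x.load() == 0);   // would underflow: clamps to 0
}
```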
2024-09-06 16:19:20 +01:00
Pierre van Houtryve
972c02929b
[GlobalISel][TableGen] MIR Pattern Variadics (#100563)
Allow for matching & rewriting a variable number of arguments in an
instruction.

Solves #87459
2024-08-01 08:30:20 +02:00
Thorsten Schütt
1cc1072349
[GlobalIsel] Add G_SCMP and G_UCMP instructions (#98894)
https://github.com/llvm/llvm-project/pull/83227
2024-07-18 16:22:37 +02:00
Lawrence Benson
177ce1900f
[LLVM] Add llvm.experimental.vector.compress intrinsic (#92289)
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask, i.e., it
moves all selected values (i.e., where `mask[i] == 1`) to consecutive
lanes in the result vector. A `passthru` vector can be provided, from
which remaining lanes are filled.
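
A fixed-width C++ sketch of the behaviour described above. It assumes that lanes past the last selected value take the value of the same lane in `passthru`; the function name and the use of `std::array` are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Moves the selected lanes of `vec` to the front of the result.
// Lanes that are not overwritten keep the corresponding passthru lane.
template <typename T, std::size_t N>
std::array<T, N> compress(const std::array<T, N>& vec,
                          const std::array<bool, N>& mask,
                          const std::array<T, N>& passthru) {
    std::array<T, N> result = passthru;
    std::size_t next = 0;  // next free lane in the result
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) result[next++] = vec[i];
    }
    return result;
}

// Illustrative self-check (hypothetical helper).
void compressSelfCheck() {
    std::array<int, 4> vec{1, 2, 3, 4};
    std::array<bool, 4> mask{true, false, true, false};
    std::array<int, 4> passthru{10, 20, 30, 40};
    std::array<int, 4> r = compress(vec, mask, passthru);
    assert(r[0] == 1 && r[1] == 3 && r[2] == 30 && r[3] == 40);
}
```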

The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers. So to store the values, an additional store is needed.
But this combination is likely significantly faster on many targets as it
avoids branches.

In follow-up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISCV. I also want to combine this with a store
instruction, as this is probably a common case and we can avoid some
memory writes in that case.

See [discussion in
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for initial discussion on the design.
2024-07-17 14:24:24 +02:00
Daniil Kovalev
1488fb4153
[PAC][AArch64] Lower ptrauth constants in code (#96879)
This re-applies #94241 after fixing a buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570

According to the standard, `constexpr` variables and `const` variables
initialized with constant expressions can be used in lambdas w/o
capturing - see https://en.cppreference.com/w/cpp/language/lambda.
However, MSVC used on buildkite seems to ignore that rule and does not
allow using such uncaptured variables in lambdas: we have "error C3493:
'Mask16' cannot be implicitly captured because no default capture mode
has been specified" - see
https://buildkite.com/llvm-project/github-pull-requests/builds/73238

Explicitly capturing such a variable, however, makes buildbot fail with
"error: lambda capture 'Mask16' is not required to be captured for this
use [-Werror,-Wunused-lambda-capture]" - see
https://lab.llvm.org/buildbot/#/builders/51/builds/570.

Fix both cases by using `0xffff` value directly instead of giving a name
to it.

Original PR description below.

Depends on #94240.

Define the following pseudos for lowering ptrauth constants in code:

- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-06-28 07:29:38 +03:00
Daniil Kovalev
99251f5a11
Revert "[PAC][AArch64] Lower ptrauth constants in code (#94241)" (#96865)
This reverts #94241.

See buildbot failure
https://lab.llvm.org/buildbot/#/builders/51/builds/570
2024-06-27 11:10:38 +03:00
Daniil Kovalev
b5cc19e572
[PAC][AArch64] Lower ptrauth constants in code (#94241)
Depends on #94240.

Define the following pseudos for lowering ptrauth constants in code:

- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
    PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
  special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
  during relocation resolving instead of a GOT slot.

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-06-27 10:02:17 +03:00
Farzon Lotfi
2ae6889d3f
[SPIRV] Add trig function lowering (#95973)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This is part 2 of 4 PRs. It lays the groundwork for adding the
intrinsics.

Add SPIRV lowering for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`
https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966


There isn't any aarch64 change in this PR, but when you add a target
opcode it is visible in its validation tests.
2024-06-20 10:34:23 -04:00
Pierre van Houtryve
7d81062352
[GlobalISel] Refactor Combiner MatchData & Apply C++ Code Handling (#92239)
Combiners that use C++ code in their "apply" pattern only use that. They
never mix it with MIR patterns as that has little added value.

This patch restricts C++ apply code so that if C++ is used, we cannot
use MIR patterns or builtins with it. Adding this restriction allows us
to merge calls to match and apply C++ code together, which in turn
makes it so we can just have MatchData variables on the stack.

So before, we would have
```
  GIM_CheckCxxInsnPredicate // match
  GIM_CheckCxxInsnPredicate // apply
  GIR_Done
```
Alongside a massive C++ struct holding the MatchData of all rules
possible (which was a big space/perf issue).

Now we just have
```
GIR_DoneWithCustomAction
```

And the function being run just does
```
unsigned SomeMatchData;
if (match(SomeMatchData))
  apply(SomeMatchData)
```

This approach solves multiple issues in one:
- MatchData handling is greatly simplified and more efficient, "don't
pay for what you don't use"
  - We reduce the size of the match table
- Calling C++ code has a certain overhead (we need a switch), and this
overhead is only paid once now.

Handling of C++ code inside PatFrags is unchanged though, that still
emits a `GIM_CheckCxxInsnPredicate`. This is completely fine as they
can't use MatchDatas.
2024-05-16 13:39:00 +02:00
Jay Foad
1650f1b3d7
Fix typo "indicies" (#92232) 2024-05-15 13:10:16 +01:00
Farzon Lotfi
3e82442ff7
[SPIRV] Add tan intrinsic part 3 (#90278)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics,
which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

If you want an overarching view of how this will all connect see:
https://github.com/llvm/llvm-project/pull/90088
Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp` - Map the
`G_FTAN` opcode to the GLSL 4.5 and OpenCL tan instructions.
- `llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp` - Define `G_FTAN` as a
legal SPIRV target opcode.
2024-05-08 00:57:39 -04:00
Thorsten Schütt
65fb80beae
[GlobalIsel] Add Gallery to MIR Patterns (#89974)
examples for fold of zext(trunc:nuw)
2024-04-26 07:06:49 +02:00
Diana Picus
3ea9ed471c
[GlobalISel] Expand IRTranslator docs. NFC (#89186)
Add some more details about how calls are lowered and what APIs are
available.
2024-04-23 09:20:35 +02:00
Matt Arsenault
dc9664a8ad
CodeGen: Strengthen definition of F{MIN|MAX}NUM_IEEE nodes (#85195)
Previously these were declared as having the 2008 behavior, with
underspecified signed zero handling. Currently, AMDGPU, PPC and
LoongArch mark these as legal. The AMDGPU and PPC instructions respect
the signed zero behavior. The LoongArch documentation doesn't state, but
I'm assuming it also does.
2024-04-22 10:13:04 +02:00
Michael Maitland
da9f06c9b1
[GISEL] G_SPLAT_VECTOR can take a splat that is larger than the vector element (#86974)
This is what SelectionDAG does. We'd like to reuse SelectionDAG
patterns.
2024-04-01 08:46:22 -04:00
Evgenii Kudriashov
d365a45cb3
[GlobalISel] Introduce G_TRAP, G_DEBUGTRAP, G_UBSANTRAP (#84941)
Here we introduce three new GMIR instructions to cover a set of trap
intrinsics. The idea behind it is that generic intrinsics shouldn't be
used with G_INTRINSIC opcode.

These new instructions can match perfectly with existing trap ISD nodes.
It allows X86, AArch64, RISCV and Mips to reuse SelectionDAG patterns for
selection and avoid manual selection. However AMDGPU is an exception. It
selects traps during legalization regardless of SelectionDAG or GlobalISel.

Since there are not many places where traps are used, this change
attempts to clean up all the usages of G_INTRINSIC with trap intrinsics. So,
there is no stage when both G_TRAP and
G_INTRINSIC_W_SIDE_EFFECTS(@llvm.trap) are allowed.
2024-03-23 13:12:44 +01:00
Evgenii Kudriashov
02cadde5ed
[Docs][GlobalISel] Fix a long header in GenericOpcode (NFC) (#84976) 2024-03-14 01:59:49 +01:00
Michael Maitland
2f400a2fd7
[GISEL] Add G_VSCALE instruction (#84542) 2024-03-12 20:22:49 -04:00
Michael Maitland
034cc2f5d0
[GISEL] Add G_INSERT_SUBVECTOR and G_EXTRACT_SUBVECTOR (#84538)
G_INSERT and G_EXTRACT are not sufficient to represent both
INSERT/EXTRACT on a subregister and INSERT/EXTRACT on a vector.

We would like to be able to INSERT/EXTRACT on vectors in cases that
INSERT/EXTRACT on vector subregisters are not sufficient, so we add
these opcodes.

I tried to do a patch where we treated G_EXTRACT as both
G_EXTRACT_SUBVECTOR and G_EXTRACT_SUBREG, but ran into an infinite loop
at this
[point](8b5b294ec2/llvm/lib/Target/RISCV/RISCVISelLowering.cpp (L9932))
in the SDAG equivalent code.
2024-03-11 13:47:30 -04:00
Michael Maitland
96049fcf4e [GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
Recommits llvm/llvm-project#80378 which was reverted in
llvm/llvm-project#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
2024-03-07 09:10:03 -08:00
Michael Maitland
552da24843
Revert "[GISEL] Add IRTranslation for shufflevector on scalable vector types" (#84330)
Reverts llvm/llvm-project#80378

causing Buildbot failures that did not show up with check-llvm or CI.
2024-03-07 10:16:31 -05:00
Michael Maitland
2b8aaef09e
[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80307, and
https://github.com/llvm/llvm-project/pull/80306.

ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.
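
The equivalence can be pictured with a fixed-length C++ stand-in for the shuffle. Scalable vectors have no compile-time length, so this is only an analogy; all names here are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Each result lane picks one lane from the concatenation of lhs and rhs.
template <typename T, std::size_t N>
std::array<T, N> shuffle(const std::array<T, N>& lhs,
                         const std::array<T, N>& rhs,
                         const std::array<std::size_t, N>& mask) {
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        assert(mask[i] < 2 * N);  // indices address lhs followed by rhs
        out[i] = mask[i] < N ? lhs[mask[i]] : rhs[mask[i] - N];
    }
    return out;
}

// Broadcasts one scalar to every lane.
template <typename T, std::size_t N>
std::array<T, N> splat(T value) {
    std::array<T, N> out{};
    out.fill(value);
    return out;
}

// Illustrative self-check (hypothetical helper).
void shuffleSplatSelfCheck() {
    std::array<int, 4> lhs{7, 8, 9, 10};
    std::array<int, 4> rhs{1, 2, 3, 4};
    std::array<std::size_t, 4> zeroMask{};  // the zeroinitializer mask
    std::array<int, 4> expected = splat<int, 4>(lhs[0]);
    assert(shuffle(lhs, rhs, zeroMask) == expected);
}
```

With an all-zero mask, the second operand never contributes, which is why the translation only needs element 0 of the first operand.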

`buildSplatVector` is renamed to `buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
2024-03-07 09:50:29 -05:00
Matt Arsenault
c36cbba6fb Update IEEE-754 2018 draft references to IEEE-754 2019 2024-02-29 10:39:06 +05:30
Pierre van Houtryve
7ec996d4c5
[GlobalISel][TableGen] Support Intrinsics in MIR Patterns (#79278) 2024-02-01 08:53:32 +01:00
Craig Topper
5933589370
[GISel][Docs] Add a little bit of documentation for G_FENCE. (#73722) 2023-11-28 18:57:20 -08:00
Pierre van Houtryve
96e9786414
[TableGen][GlobalISel] Add MIFlags matching & rewriting (#71179)
Also disables generation of MutateOpcode. It's almost never used in
combiners anyway.
If we really want to use it, it needs to be investigated & properly
fixed (see TODO)
    
Fixes #70780
2023-11-08 10:31:49 +01:00
Pierre van Houtryve
b26e6a8eb5
[GlobalISel] Add GITypeOf special type (#66079)
Allows creating a register/immediate that uses the same type as a
matched operand.
2023-10-31 09:57:10 +01:00
Pierre van Houtryve
0841955bf3
[TableGen] Use buildConstant to emit apply pattern immediates (#66077)
Use `MachineIRBuilder::buildConstant` to emit typed immediates in
'apply' MIR patterns.
This adds flexibility, e.g. it allows us to seamlessly handle vector
cases, where a `G_BUILD_VECTOR` is needed to create a splat.
2023-10-17 10:39:59 +02:00
pvanhout
844c0da777 [TableGen][GlobalISel] Add MIR Pattern Builtins
Adds a new feature to MIR patterns: builtin instructions.
They offer some additional capabilities that currently cannot be expressed without falling back to C++ code.
There are two builtins added with this patch, but more can be added later as new needs arise:
 - GIReplaceReg
 - GIEraseRoot

Depends on D158714, D158713

Reviewed By: arsenm, aemerson

Differential Revision: https://reviews.llvm.org/D158975
2023-09-05 08:19:07 +02:00
Kazu Hirata
3a14993fa4 Fix typos in documentation 2023-08-27 00:18:14 -07:00
pvanhout
0a59e1a85c [GlobalIsSel] Allow using PatFrags with multiple defs as the root of a combine rule
I had to tighten the restrictions on PatFrags a bit to make it consistent: instructions that
define the root of a PF can only have one def.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D157700
2023-08-24 09:09:24 +02:00
Aaron Ballman
87a8e22475 Fix LLVM Sphinx build
This addresses issues found by:
https://lab.llvm.org/buildbot/#/builders/30/builds/38760
2023-08-14 07:22:03 -04:00
David Green
a3f2751f78 [AArch64][GISel] Add handling for G_VECREDUCE_FMAXIMUM and G_VECREDUCE_FMINIMUM
This is a lot of copy-pasting for the existing handling of
G_VECREDUCE_FMAX/G_VECREDUCE_FMIN to add handling for
G_VECREDUCE_FMAXIMUM/G_VECREDUCE_FMINIMUM in the same way.

Differential Revision: https://reviews.llvm.org/D156615
2023-08-14 10:03:25 +01:00
pvanhout
63afb70503 [RFC][GlobalISel] Overhauled MIR Patterns Support for Combiners
See https://discourse.llvm.org/t/rfc-overhauled-mir-patterns-for-globalisel-combiners/72264

This is a complete overhaul of the recently-added GlobalISel Match Table backend which adds
support for MIR patterns for both match and apply patterns.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D156315
2023-08-11 08:54:56 +02:00
Sameer Sahasrabuddhe
d9847cde48 [GlobalISel] convergent intrinsics
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:

- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS

Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154766
2023-07-31 12:15:39 +05:30
Amara Emerson
1c2c668846 [GlobalISel] Introduce G_CONSTANT_FOLD_BARRIER and use it to prevent constant folding
hoisted constants.

The constant hoisting pass tries to hoist large constants into predecessors and also
generates remat instructions in terms of the hoisted constants. These aim to prevent
codegen from rematerializing expensive constants multiple times. To reuse
this optimization, we can preserve the no-op bitcasts that are used to anchor
constants to the predecessor blocks.

SelectionDAG achieves this by having the OpaqueConstant node, which is just a
normal constant with an opaque flag set. I've opted to avoid introducing a new
constant generic instruction here. Instead, we have a new G_CONSTANT_FOLD_BARRIER
operation that constitutes a folding barrier.

These are somewhat like the optimization hints such as G_ASSERT_ZEXT, in that they're
eliminated by the generic instruction selection code.

This change by itself has very minor improvements in -Os CTMark overall. What this
does allow is better optimizations when future combines are added that rely on having
expensive constants remain unfolded.

Differential Revision: https://reviews.llvm.org/D144336
2023-06-09 11:45:06 -07:00
Chen Zheng
6ee2f770ef [PowerPC][GISel] add support for fpconstant
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D133340
2023-02-14 02:39:22 +00:00
Amara Emerson
53445f5b1c [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
We currently have a bug where the legalizer, when dealing with phi operands,
may create instructions in the phi's incoming blocks at points which are effectively
dead due to a possible exception throw.

Say we have:

throwbb:
  EH_LABEL
  x0 = %callarg1
  BL @may_throw_call
  EH_LABEL
  B returnbb

bb:
  %v = phi i1 %true, throwbb, %false....

When legalizing we may need to widen the i1 %true value, and to do that we need
to create new extension instructions in the incoming block. Our insertion point
currently is the MBB::getFirstTerminator() which puts the IP before the unconditional
branch terminator in throwbb. These extensions may never be executed if the call
throws, and therefore we need to emit them before the call (but not too early, since
our new instruction may need values defined within throwbb as well).

throwbb:
  EH_LABEL
  x0 = %callarg1
  BL @may_throw_call
  EH_LABEL
  %true = G_CONSTANT i32 1 ; <<<-- ruh'roh, this never executes if may_throw_call() throws!
  B returnbb

bb:
  %v = phi i32 %true, throwbb, %false....

To fix this, I've added two new instructions. The main idea is that G_INVOKE_REGION_START
is a terminator, which tries to model the fact that in the IR, the original invoke inst
is actually a terminator as well. By using that as the new insertion point, we
make sure to place new instructions on always executing paths.

Unfortunately we still need to make the legalizer use a new insertion point API
that I've added, since the existing `getFirstTerminator()` method does a reverse
walk up the block, and any non-terminator instructions cause it to bail out. To
avoid impacting compile time for all `getFirstTerminator()` uses, I've added a new
method that does a forward walk instead.

Differential Revision: https://reviews.llvm.org/D137905
2022-12-07 10:28:51 -08:00
Serge Pavlov
ec893da990 [GlobalISel] Remove semantic operand of G_IS_FPCLASS
Instruction G_IS_FPCLASS had an operand that represented floating-point
semantics of its first operand. It allowed types that have the same length,
like `bfloat16` and `half`, to be distinguished. Unfortunately, it is
not sufficient, as other operations still cannot distinguish such types.
A solution to this problem must be more general, so now this operand is removed.
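
To see why two 16-bit types need different handling, here is a small C++ sketch that classifies the same bit pattern under each format. The layouts are the standard ones (half: 5 exponent and 10 mantissa bits; bfloat16: 8 exponent and 7 mantissa bits); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// IEEE half: infinity is exponent 0x1F with a zero mantissa.
bool isInfHalf(std::uint16_t bits) {
    return (bits & 0x7FFF) == 0x7C00;  // ignore the sign bit
}

// bfloat16: infinity is exponent 0xFF with a zero mantissa.
bool isInfBFloat16(std::uint16_t bits) {
    return (bits & 0x7FFF) == 0x7F80;  // ignore the sign bit
}

// Illustrative self-check (hypothetical helper).
void fpclassSelfCheck() {
    assert(isInfHalf(0x7C00));       // +inf when read as half
    assert(!isInfBFloat16(0x7C00));  // a finite normal number when read as bfloat16
}
```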

Differential Revision: https://reviews.llvm.org/D138004
2022-11-15 15:48:05 +07:00
Matt Arsenault
1ee6ce9bad GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults to the zext behavior.

The tablegen changes are pretty ugly, but partially help migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.

I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
2022-07-08 11:55:08 -04:00
Shilei Tian
1023ddaf77 [LLVM] Add the support for fmax and fmin in atomicrmw instruction
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw`
instruction. For now (at least in this patch), the instruction will be expanded
to a CAS loop. There are already a couple of targets supporting the feature. I'll
create another patch(es) to enable them accordingly.
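
The CAS-loop expansion can be sketched in C++ with `std::atomic<float>`. This is an analogy for the expanded form, not the code the backend emits; the function names are hypothetical, and `std::fmax`/`std::fmin` are assumed to stand in for the operation's max/min semantics.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>

// fmax as a compare-and-swap loop; returns the previous value like atomicrmw.
float atomicFMax(std::atomic<float>& obj, float val) {
    float old = obj.load();
    // On failure, compare_exchange_weak reloads `old` and the loop retries.
    while (!obj.compare_exchange_weak(old, std::fmax(old, val))) {
    }
    return old;
}

// fmin, same structure.
float atomicFMin(std::atomic<float>& obj, float val) {
    float old = obj.load();
    while (!obj.compare_exchange_weak(old, std::fmin(old, val))) {
    }
    return old;
}

// Illustrative self-check (hypothetical helper).
void atomicFMaxSelfCheck() {
    std::atomic<float> x{1.0f};
    assert(atomicFMax(x, 2.0f) == 1.0f && x.load() == 2.0f);
}
```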

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D127041
2022-07-06 10:57:53 -04:00