We currently have a bug where the legalizer, when dealing with phi operands,
may create instructions in the phi's incoming blocks at points which are effectively
dead due to a possible exception throw.
Say we have:
throwbb:
EH_LABEL
x0 = %callarg1
BL @may_throw_call
EH_LABEL
B returnbb
bb:
%v = phi i1 %true, throwbb, %false....
When legalizing we may need to widen the i1 %true value, and to do that we need
to create new extension instructions in the incoming block. Our insertion point
currently is the MBB::getFirstTerminator() which puts the IP before the unconditional
branch terminator in throwbb. These extensions may never be executed if the call
throws, and therefore we need to emit them before the call (but not too early, since
our new instruction may need values defined within throwbb as well).
throwbb:
EH_LABEL
x0 = %callarg1
BL @may_throw_call
EH_LABEL
%true = G_CONSTANT i32 1 ; <<<-- ruh'roh, this never executes if may_throw_call() throws!
B returnbb
bb:
%v = phi i32 %true, throwbb, %false....
To fix this, I've added two new instructions. The main idea is that G_INVOKE_REGION_START
is a terminator, which tries to model the fact that in the IR, the original invoke inst
is actually a terminator as well. By using that as the new insertion point, we
make sure to place new instructions on always executing paths.
Unfortunately we still need to make the legalizer use a new insertion point API
that I've added, since the existing `getFirstTerminator()` method does a reverse
walk up the block, and any non-terminator instructions cause it to bail out. To
avoid impacting compile time for all `getFirstTerminator()` uses, I've added a new
method that does a forward walk instead.
Differential Revision: https://reviews.llvm.org/D137905
Instruction G_IS_FPCLASS had an operand that represented floating-point
semantics of its first operand. It allowed types that have the same length,
like `bfloat16` and `half`, to be distinguished. Unfortunately, it is
not sufficient, as other operation still cannot distinguish such types.
Solution of this problem must be more general, so now this operand is removed.
Differential Revision: https://reviews.llvm.org/D138004
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.
The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.
I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw`
instruction. For now (at least in this patch), the instruction will be expanded
to CAS loop. There are already a couple of targets supporting the feature. I'll
create another patch(es) to enable them accordingly.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127041
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)
TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in broken state meanwhile.
While the end result of discussion may lead back to the current design,
it may also not lead to the current design.
Therefore i take it upon myself
to revert the tree back to last known good state.
This reverts commit 4c4093e6e39fe6601f9c95a95a6bc242ef648cd5.
This reverts commit 0a2b1ba33ae6dcaedb81417f7c4cc714f72a5968.
This reverts commit d9873711cb03ac7aedcaadcba42f82c66e962e6e.
This reverts commit 791006fb8c6fff4f33c33cb513a96b1d3f94c767.
This reverts commit c22b64ef66f7518abb6f022fcdfd86d16c764caf.
This reverts commit 72ebcd3198327da12804305bda13d9b7088772a8.
This reverts commit 5fa6039a5fc1b6392a3c9a3326a76604e0cb1001.
This reverts commit 9efda541bfbd145de90f7db38d935db6246dc45a.
This reverts commit 94d3ff09cfa8d7aecf480e54da9a5334e262e76b.
Basically the same as G_LROUND. Handles the llvm.llround family of intrinsics.
Also add a helper function to the MachineVerifier for checking if all of the
(virtual register) operands of an instruction are scalars. Seems like a useful
thing to have.
Differential Revision: https://reviews.llvm.org/D108429
Add a generic opcode equivalent to the `llvm.isnan` intrinsic +
MachineVerifier support for it.
We need an opcode here because we may want target-specific lowering later on.
Differential Revision: https://reviews.llvm.org/D108222
There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG.
E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain
code that matches a bitfield extract from an and + right shift.
Rather than duplicating code in the same way, this adds two opcodes:
- G_UBFX (unsigned bitfield extract)
- G_SBFX (signed bitfield extract)
They work like this
```
%x = G_UBFX %y, %lsb, %width
```
Where `lsb` and `width` are
- The least-significant bit of the extraction
- The width of the extraction
This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends
the result, while G_SBFX sign-extends the result.
This should allow us to use the combiner to match the bitfield extraction
patterns rather than duplicating pattern-matching code in each target.
Differential Revision: https://reviews.llvm.org/D98464
It is good to have a combined `divrem` instruction when the
`div` and `rem` are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.
Reviewed By: arsenm, paquette
Differential Revision: https://reviews.llvm.org/D96013
This adds a G_ASSERT_SEXT opcode, similar to G_ASSERT_ZEXT. This instruction
signifies that an operation was already sign extended from a smaller type.
This is useful for functions with sign-extended parameters.
E.g.
```
define void @foo(i16 signext %x) {
...
}
```
This adds verifier, regbankselect, and instruction selection support for
G_ASSERT_SEXT equivalent to G_ASSERT_ZEXT.
Differential Revision: https://reviews.llvm.org/D96890
This adds a generic opcode which communicates that a type has already been
zero-extended from a narrower type.
This is intended to be similar to AssertZext in SelectionDAG.
For example,
```
%x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16
```
Signifies that the top 48 bits of %x are known to be 0.
This is useful in cases like this:
```
define i1 @zeroext_param(i8 zeroext %x) {
%cmp = icmp ult i8 %x, -20
ret i1 %cmp
}
```
In AArch64, `%x` must use a 32-bit register, which is then truncated to a 8-bit
value.
If we know that `%x` is already zero-ed out in the relevant high bits, we can
avoid the truncate.
Currently, in GISel, this looks like this:
```
_zeroext_param:
and w8, w0, #0xff ; We don't actually need this!
cmp w8, #236
cset w0, lo
ret
```
While SDAG does not produce the truncation, since it knows that it's
unnecessary:
```
_zeroext_param:
cmp w0, #236
cset w0, lo
ret
```
This patch
- Adds G_ASSERT_ZEXT
- Adds MIRBuilder support for it
- Adds MachineVerifier support for it
- Documents it
It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There
should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.)
This allows us to skip over hints in the legalizer etc. These can then later
be selected like COPY instructions or removed.
Differential Revision: https://reviews.llvm.org/D95564
Some of the lower implementations were relying on this, however the
type was not set depending on which form .lower* helper form you were
using. For instance, if you used an unconditonal lower(), the type was
never set. Most of the lower actions do not benefit from a type
parameter, and just expand in terms of the original operation's types.
However, some lowerings could benefit from an additional type hint to
combine a promotion and an expansion. An example of this is for
add/sub sat. The DAG integer legalization tries to use smarter
expansions directly when promoting the integer type, and doesn't
always produce the same instruction with a wider type.
Treat this as an optional hint argument, that only means something for
specific lower actions. It may be useful to generalize this mechanism
to pass a full list of type indexes and desired types, but I haven't
run into a case like that yet.
Before the fix the build of docs-llvm-html would fail.
The rG8bc03d216824 introduced a reference to an undefined label,
so we have warning as:
llvm-project/llvm/docs/GlobalISel/GenericOpcode.rst:295:\
undefined label: i_intr_llvm_ptrmask (if the link has no\
caption the label must precede a section header)
Confusingly, these were unrelated and had different semantics. The
G_PTR_MASK instruction predates the llvm.ptrmask intrinsic, but has a
different format. G_PTR_MASK only allows clearing the low bits of a
pointer, and only a constant number of bits. The ptrmask intrinsic
allows an arbitrary mask. Replace G_PTR_MASK to match the intrinsic.
Only selects the cases that look like the old instruction. More work
is needed to select the general case. Also new legalization code is
still needed to deal with the case where the incoming mask size does
not match the pointer size, which has a specified behavior in the
langref.
Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values.
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77372
Summary:
Add new generic MIR opcodes G_SADDSAT etc. Add support in IRTranslator
for translating the saturating add/subtract intrinsics to the new
opcodes.
Reviewers: aemerson, dsanders, paquette, arsenm
Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76600
It looks like I pushed an older version of this commit without the review
fixups earlier. This applies the review changes
Differential Revision: https://reviews.llvm.org/D69545
Summary:
Rework the GMIR documentation to focus more on the end user than the
implementation and tie it in to the MIR document. There was also some
out-of-date information which has been removed.
The quality of the GenericOpcode reference is highly variable and drops
sharply as I worked through them all but we've got to start somewhere :-).
It would be great if others could expand on this too as there is an awful
lot to get through.
Also fix a typo in the definition of G_FLOG. Previously, the comments said
we had two base-2's (G_FLOG and G_FLOG2).
Reviewers: aemerson, volkan, rovka, arsenm
Reviewed By: rovka
Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69545
Summary:
This is largely based off of the slides from the keynote
Depends on D69545
Reviewers: volkan, rovka, arsenm
Subscribers: wdng, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69644
The legalizer page was in a fairly good state. I've mostly just inlined
some information as a note and removed a reference to potential future
work that I think is very unlikely to be done (it's very hard to tell if
a pattern or set of patterns fully covers a node due to C++ predicates).
Also added a note that 'selectable' doesn't mean that InstructionSelect
must do it.