This now develops the same problem G_ZEXT/G_ANYEXT have where the
requested type is assumed to be the source type. This will be fixed
separately by creating intermediate merges.
Bitcast only really applies between scalars and vectors. Implement as
an unmerge and remerge. The test needs to tolerate failure since one
of the unmerges currently fails to legalize.
We're planning to remove the shufflemask operand from ShuffleVectorInst
(D72467); fix GlobalISel so it doesn't depend on that Constant.
The change to prelegalizercombiner-shuffle-vector.mir happens because
the input contains a literal "-1" in the mask (so the parser/verifier
weren't really handling it properly). We now treat it as equivalent to
"undef" in all contexts.
Differential Revision: https://reviews.llvm.org/D72663
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
Recommit notes:
Introduce temporary variables in order to make sure
instructions get inserted into MachineFunction in same order
regardless of compiler used to build llvm.
Differential Revision: https://reviews.llvm.org/D71363
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
Differential Revision: https://reviews.llvm.org/D71363
G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates
these intrinsics from __builtin_bswap32 and __builtin_bswap64.
Add lower and narrowscalar for G_BSWAP.
Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.
Differential Revision: https://reviews.llvm.org/D71362
https://reviews.llvm.org/D70922
This adds a hook to allow targets to define exactly what extension
operation should be performed for widening constants. This handles cases
like widening i1 true which would end up becoming -1 which affects code
quality during combines.
Additionally, in order to stay consistent with how DAG is promoting
constants, we now signextend for byte sized types and zero extend
otherwise (by default). Targets can of course override this if
necessary.
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
We need to propagate this information from the IR in order to be able to safely
do tail call optimizations on the intrinsics during legalization. Assuming
it's safe to do tail call opt without checking for the marker isn't safe because
the mem libcall may use allocas from the caller.
This adds an extra immediate operand to the end of the intrinsics and fixes the
legalizer to handle it.
Differential Revision: https://reviews.llvm.org/D68151
llvm-svn: 373140
We currently always set the HasCalls on MFI during translation and legalization if
we're handling a call or legalizing to a libcall. However, if that call is later
optimized to a tail call then we don't need the flag. The flag being set to true
causes frame lowering to always save and restore FP/LR, which adds unnecessary code.
This change does the same thing as SelectionDAG and ports over some code that scans
instructions after selection, using TargetInstrInfo to determine if target opcodes
are known calls.
Code size geomean improvements on CTMark:
-O0 : 0.1%
-Os : 0.3%
Differential Revision: https://reviews.llvm.org/D67868
llvm-svn: 372443
r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs, for now just undo the problematic parts of the change.
llvm-svn: 372050
For some reason we sometimes insert new instructions one instruction before
the first non-PHI when legalizing. This can result in having non-PHI
instructions before PHIs, which mean that PHI elimination doesn't catch them.
Differential Revision: https://reviews.llvm.org/D67570
llvm-svn: 371901
Because memory intrinsics are handled differently than other calls, we need to
check them for tail call eligiblity in the legalizer. This allows us to still
inline them when it's beneficial to do so, but also tail call when possible.
This adds simple tail calling support for when the intrinsic is followed by a
return.
It ports the attribute checks from `TargetLowering::isInTailCallPosition` into
a similarly-named function in LegalizerHelper.cpp. The target-specific
`isUsedByReturnOnly` hook is not ported here.
Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call
memory intrinsics.
Update legalize-memcpy-et-al.mir to have a case where we don't tail call.
Differential Revision: https://reviews.llvm.org/D67566
llvm-svn: 371893
Unlike SelectionDAG, treat this as a normally legalizable operation.
In SelectionDAG this is supposed to only ever formed if it's legal,
but I've found that to be restricting. For AMDGPU this is contextually
legal depending on whether denormal flushing is allowed in the use
function.
Technically we currently treat the denormal mode as a subtarget
feature, so custom lowering could be avoided. However I consider this
to be a defect, and this should be contextually dependent on the
controllable rounding mode of the parent function.
llvm-svn: 371800
Similar to the issue with G_ZEXT that was fixed earlier, this is a quick
to fall back if the source type is not exactly half of the dest type.
Fixes the clang-cmake-aarch64-lld bot build.
llvm-svn: 370847
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.
Differential Revision: https://reviews.llvm.org/D66185
llvm-svn: 370823
Add lower for G_FPTOUI. Algorithm is similar to the SDAG version
in TargetLowering::expandFP_TO_UINT.
Lower G_FPTOUI for MIPS32.
Differential Revision: https://reviews.llvm.org/D66929
llvm-svn: 370431
This change moves the actual stack pointer manipulation into the legalizer,
available to targets via lower(). The codegen is slightly different because
we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than
adding a negative amount via G_GEP.
Differential Revision: https://reviews.llvm.org/D66678
llvm-svn: 370104
Main difference is in the way Hi for Long shift (HiL) is made.
G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value.
Differential Revision: https://reviews.llvm.org/D66589
llvm-svn: 370064
Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS
to match names of surrounding variables.
Differential Revision: https://reviews.llvm.org/D66587
llvm-svn: 370062
The x86 tests are now broken (in paticular add-scalar.ll now hits the
DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a
custom lowering for this, so that will need to be implemented.
llvm-svn: 369673
This is necessary for handling <3 x s16> on AMDGPU, assuming this
should be handled as 2 separate legalization actions. The alternative
would be for fewerElementsVector to handle 3->2.
llvm-svn: 369547
Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source.
NarrowScalar G_TRUNC to s32 for MIPS32.
Differential Revision: https://reviews.llvm.org/D66202
llvm-svn: 369509
Again, it's weird that these are allowed. Since lowering support was added in
r368709 we started crashing on compiling the neon intrinsics test in the test
suite. This fixes the lowering to fold the 1 elt src/mask case into copies.
llvm-svn: 369135
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
%1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
%2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
%3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 23
All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.
To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
%1 = G_CONSTANT 16
%2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
%2 = G_SEXT_INREG %0, 16
or:
%1 = G_CONSTANT 16
%2:_(s32) = G_SHL %0, %1
%3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.
Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61289
llvm-svn: 368487
I've now needed to add an extra parameter to this call twice recently. Not only
is the signature getting extremely unwieldy, but just updating all of the
callsites and implementations is a pain. Putting the parameters in a struct
sidesteps both issues.
llvm-svn: 368408