30371 Commits

Author SHA1 Message Date
Stephen Tozer
e64f3ccca3 Reapply "[DebugInfo] Add DWARF emission for DBG_VALUE_LIST"
This reverts commit 429c6ecbb302e2beedd8694378ae5be456206209.
2021-03-10 15:59:24 +00:00
Stephen Tozer
429c6ecbb3 Revert "[DebugInfo] Add DWARF emission for DBG_VALUE_LIST"
This reverts commit 0da27ba56c9f5e3f534a65401962301189eac342.

This revision was causing an error on the sanitizer-x86_64-linux-autoconf build.
2021-03-10 14:35:33 +00:00
gbtozers
0da27ba56c [DebugInfo] Add DWARF emission for DBG_VALUE_LIST
This patch allows DBG_VALUE_LIST instructions to be emitted to DWARF with valid
DW_AT_locations. This change mainly affects DbgEntityHistoryCalculator, which
now tracks multiple registers per value, and DwarfDebug+DwarfExpression, which
can now emit multiple machine locations as part of a DWARF expression.

Differential Revision: https://reviews.llvm.org/D83495
2021-03-10 13:46:20 +00:00
Christudasan Devadasan
4c6ab48fb1 GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM
It is good to have a combined `divrem` instruction when the
`div` and `rem` are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.

Reviewed By: arsenm, paquette

Differential Revision: https://reviews.llvm.org/D96013
2021-03-10 18:46:07 +05:30
Jinzheng Tu
481079e284 [NFC] Unify FIME with FIXME in comments
There are 5 occurrences FIME and 15333 FIXME. All of them should be FIXME.

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D98321
2021-03-10 14:00:51 +01:00
Serguei Katkov
2fccd1b00a [Statepoint Lowering] Fix the crash with gc.relocate in a separate block
If it was decided to relocate derived pointer using the spill its value is
not exported in general case.
When gc.relocate is located in an another block than a statepoint we cannot
get SD for derived value but for spill case it is not required at all.
However implementation of gc.relocate lowering unconditionally request SD value
causing the assert triggering.

The CL fixes this by handling spill case earlier than SD is really required.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98324
2021-03-10 19:51:04 +07:00
gbtozers
7d0cafba96 [DebugInfo] Process DBG_VALUE_LIST in LiveDebugVariables
This patch adds support for DBG_VALUE_LIST in the LiveDebugVariables pass. The
changes are mostly in computeIntervals, extendDef, and addDefsFromCopies; when
extending the def of a DBG_VALUE_LIST the live ranges of every used register
must be considered, and when such a def is killed by more than one of its used
registers being killed at the same time it is necessary to find valid copies of
all of those registers to create a new def with.

The DebugVariableValue class has also been changed to reference multiple
location numbers instead of just one. This has been accomplished by using a
C-style array with a unique_ptr and an array length packed into 6 bits, to
minimize the size of the class (which must be kept low to be used with
IntervalMap). This may not be the most efficient solution possible, and should
be looked at if performance issues arise.

Differential Revision: https://reviews.llvm.org/D83895
2021-03-10 12:37:59 +00:00
Philip Reames
d6394d86ca [cgp] improve robustness of uadd/usub transforms
LSR prefers to schedule iv increments just before the latch.  The recent 80511565 broadened this to moving increments in the original IR.  This pointed out a robustness problem with the CGP transform.

When we have a use of an induction increment outside of the loop (we canonicalize away from this form, but it happens e.g. unanalyzeable loops) we'd avoid performing the uadd/usub transform.  Interestingly, all of these involve moving the increment closer to it's operands, so there's no concern about dominating all uses.  We can handle that case cheaply, resulting in a more robust transform.
2021-03-09 11:52:08 -08:00
Amara Emerson
55e760769b [GlobalISel] Fold away G_BUILD_VECTOR with all elements extracted.
If every element is extracted from a G_BUILD_VECTOR, pass through the source
registers. This is different to the extract(build_vector) combine because this
one tolerates multiple users as long as they're exhaustive.

Differential Revision: https://reviews.llvm.org/D97890
2021-03-09 11:34:26 -08:00
Philip Reames
e85d798b5b [cgp] group related code together [nfc] 2021-03-09 11:23:15 -08:00
Amara Emerson
e60ab72137 [AArch64][GlobalISel] Add combine for extract_vector_elt(build_vector, cst)
Differential Revision: https://reviews.llvm.org/D97835
2021-03-09 11:08:02 -08:00
gbtozers
e2196ddcdb [DebugInfo] Process DBG_VALUE_LIST in LiveDebugValues
This patch implements DBG_VALUE_LIST handling to the LiveDebugValues pass. This
is a substantial change, and makes a few fundamental changes to the existing
logic.

We still use the basic model of a VarLocMap that is indexed by a LocIndex, with
a VarLocSet (a CoalescingBitVector underneath) giving us efficient lookups of
existing variable locations for a given location type. The main change is that
the VarLocMap may contain a given VarLoc multiple times (once for each unique
location operand), so that a VarLoc can be looked up from any of the registers
that it uses. This means that each VarLoc has multiple corresponding LocIndexes;
to allow us to iterate through the set of VarLocs (previously we would iterate
through the VarLocSet), we now also maintain a single entry in the VarLocMap
that contains every VarLoc exactly once.

The VarLoc class itself is also changed; this change is much simpler,
refactoring out location-specific members into a MachineLocation class and
adding a vector of these locations.

Differential Revision: https://reviews.llvm.org/D83890
2021-03-09 18:58:26 +00:00
Nikita Popov
55ae279ba7 [FastISel] Don't trivially kill extractvalues (PR49467)
All extractvalues of the same value at the same index will map to
the same register, so even if one specific extractvalue only has
one use, we should not mark it as a trivial kill, as there may be
more extractvalues later.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49467.

Differential Revision: https://reviews.llvm.org/D98145
2021-03-09 18:46:38 +01:00
gbtozers
df69c69427 [DebugInfo] Handle multiple variable location operands in IR
This patch updates the various IR passes to correctly handle dbg.values with a
DIArgList location. This patch does not actually allow DIArgLists to be produced
by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
Other than that, it should cover every IR pass.

Most of the changes simply extend code that operated on a single debug value to
operate on the list of debug values in the style of any_of, all_of, for_each,
etc. Instances of setOperand(0, ...) have been replaced with with
replaceVariableLocationOp, which takes the value that is being replaced as an
additional argument. In places where this value isn't readily available, we have
to track the old value through to the point where it gets replaced.

Differential Revision: https://reviews.llvm.org/D88232
2021-03-09 16:44:38 +00:00
gbtozers
5491a86f59 [DebugInfo] Emit DBG_VALUE_LIST from ISel
This patch completes ISel support for DIArgList dbg.values by allowing
SDDbgValues with multiple location operands to be emitted as DBG_VALUE_LIST
instructions.

The primary change of this patch is refactoring EmitDbgValue by pulling location
operand emission out to the new function AddDbgValueLocationOps, which is used
for both DIArgList and single value dbg.values. Outside of that, the only
behaviour change is that the scheduler has a lambda added, HasUnknownVReg, to
prevent us from attempting to emit a DBG_VALUE_LIST before all of its used VRegs
have become available.

Differential Revision: https://reviews.llvm.org/D88592
2021-03-09 12:17:39 +00:00
John Brawn
bf3a271960 [CodeGen] Report a normal instead of fatal error for label redefinition
A symbol being redefined as a label is something that can happen as a result of
ordinary input, so it shouldn't cause a fatal error. Also adjust the error
message to match the one you get when a symbol is redefined as a variable.

Differential Revision: https://reviews.llvm.org/D98181
2021-03-09 10:54:41 +00:00
Cullen Rhodes
2750f3ed31 [IR] Introduce llvm.experimental.vector.splice intrinsic
This patch introduces a new intrinsic @llvm.experimental.vector.splice
that constructs a vector of the same type as the two input vectors,
based on a immediate where the sign of the immediate distinguishes two
variants. A positive immediate specifies an index into the first vector
and a negative immediate specifies the number of trailing elements to
extract from the first vector.

For example:

  @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>  ; index
  @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count

These intrinsics support both fixed and scalable vectors, where the
former is lowered to a shufflevector to maintain existing behaviour,
although while marked as experimental the recommended way to express
this operation for fixed-width vectors is to use shufflevector. For
scalable vectors where it is not possible to express a shufflevector
mask for this operation, a new ISD node has been implemented.

This is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].

Patch by Paul Walker and Cullen Rhodes.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D94708
2021-03-09 10:44:22 +00:00
gbtozers
93b170ea24 [DebugInfo] Handle dbg.values with multiple variable location operands in ISel
This patch adds partial support in Instruction Selection for dbg.values that use
a DIArgList. This patch does not add support for producing DBG_VALUE_LIST, but
adds the logic for processing DIArgLists within the ISel pass. This change is
largely focused on handleDebugValue and some of the functions that it calls.
Outside of this, salvageDebugInfo and transferDbgValues have been modified to
replace individual operands instead of the entire value; dangling debug info for
variadic debug values is not currently supported (but may be added later).

Differential Revision: https://reviews.llvm.org/D88589
2021-03-09 09:48:03 +00:00
Ta-Wei Tu
cf82700af8 [CodeGenPrepare] Fix isIVIncrement (PR49466)
In the NFC commit 8d835f42a57f15c0b9053bd7c41ea95821a40e5f, the check for `!L` is
moved to a separate function `getIVIncrement` which, instead of using `BO->getParent()`,
uses `PN->getParent()`. However, these two basic blocks are not necessarily the same.

https://bugs.llvm.org/show_bug.cgi?id=49466 demonstrates a case where `PN` is contained in
a loop while `BO` is not, causing the null-pointer dereference in `L->getLoopLatch()`.

This patch checks whether both `BO` and `PN` belong to the same loop before entering `getIVIncrement`.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D98144
2021-03-09 13:32:34 +08:00
Jessica Paquette
f7d73a6b9e [SelectionDAG] Don't scalarize vector fpround sources that don't need it.
Similar to the workaround code in ScalarizeVecRes_UnaryOp, ScalarizeVecRes_SETCC
, ScalarizeVecRes_VSELECT, etc.

If we have a case like this:

```
define <1 x half> @func(<1 x float> %x) {
  %tmp = fptrunc <1 x float> %x to <1 x half>
  ret <1 x half> %tmp
}
```

On AArch64, the <1 x float> is legal. So, this will crash if we call
GetScalarizedVector on it.

Differential Revision: https://reviews.llvm.org/D98208
2021-03-08 14:37:33 -08:00
Jessica Paquette
5c26be214d [AArch64][GlobalISel] Lower G_BUILD_VECTOR -> G_DUP
If we have

```
%vec = G_BUILD_VECTOR %reg, %reg, ..., %reg
```

Then lower it to

```
%vec = G_DUP %reg
```

Also update the selector to handle constant splats on G_DUP.

This will not combine when the splat is all zeros or ones. Tablegen-imported
patterns rely on these being G_BUILD_VECTOR.

Minor code size improvements on CTMark at -Os.

Also adds some utility functions to make it a bit easier to recognize splats,
and an AArch64-specific splat helper.

Differential Revision: https://reviews.llvm.org/D97731
2021-03-08 13:01:10 -08:00
Min-Yih Hsu
6dcc325ce0 [M68k][MIR](2/8) Changes in the target-independent MIR part
- Add new callback in `TargetInstrInfo` --
  `isPCRelRegisterOperandLegal` -- to query whether pc-rel
   register MachineOperand is legal.
 - Add new function to search DebugLoc in a reverse ordering

Authors: myhsu, m4yers, glaubitz

Differential Revision: https://reviews.llvm.org/D88386
2021-03-08 12:30:57 -08:00
Stephen Tozer
c0450af559 Fix: [DebugInfo] Support representation of multiple location operands in SDDbgValue
Removes a "default" label from a fully covered switch, causing errors on
-Wcovered-switch-default builds.
2021-03-08 19:14:12 +00:00
gbtozers
9525af7b91 [DebugInfo] Support representation of multiple location operands in SDDbgValue
This patch modifies the class that represents debug values during ISel,
SDDbgValue, to support multiple location operands (to represent a dbg.value that
uses a DIArgList). Part of this class's functionality has been split off into a
new class, SDDbgOperand.

The new class SDDbgOperand represents a single value, corresponding to an SSA
value or MachineOperand in the IR and MIR respectively. Members of SDDbgValue
that were previously related to that specific value (as opposed to the
variable or DIExpression), such as the Kind enum, have been moved to
SDDbgOperand. SDDbgValue now contains an array of SDDbgOperand instead, allowing
it to hold more than one of these values.

All changes outside SDDbgValue are simply updates to use the new interface.

Differential Revision: https://reviews.llvm.org/D88585
2021-03-08 18:45:17 +00:00
gbtozers
e5d958c456 [DebugInfo] Support DIArgList in DbgVariableIntrinsic
This patch updates DbgVariableIntrinsics to support use of a DIArgList for the
location operand, resulting in a significant change to its interface. This patch
does not update all IR passes to support multiple location operands in a
dbg.value; the only change is to update the DbgVariableIntrinsic interface and
its uses. All code outside of the intrinsic classes assumes that an intrinsic
will always have exactly one location operand; they will still support
DIArgLists, but only if they contain exactly one Value.

Among other changes, the setOperand and setArgOperand functions in
DbgVariableIntrinsic have been made private. This is to prevent code from
setting the operands of these intrinsics directly, which could easily result in
incorrect/invalid operands being set. This does not prevent these functions from
being called on a debug intrinsic at all, as they can still be called on any
CallInst pointer; it is assumed that any code directly setting the operands on a
generic call instruction is doing so safely. The intention for making these
functions private is to prevent DIArgLists from being overwritten by code that's
naively trying to replace one of the Values it points to, and also to fail fast
if a DbgVariableIntrinsic is updated to use a DIArgList without a valid
corresponding DIExpression.
2021-03-08 14:36:13 +00:00
serge-sans-paille
08d9e2ceec [NFC] Avoid useless BitVector move 2021-03-08 15:16:23 +01:00
Craig Topper
0eb405c3b8 [SelectionDAG] Add computeKnownBits support for ISD::USUBSAT.
The result of ISD::USUBSAT will never be larger than the LHS. We
can use this to put a bound on the number of leading zeros.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D98133
2021-03-07 09:48:42 -08:00
LemonBoy
2ec43e4167 [LegalizeDAG] Implement promotion rules for SELECT_CC
Implement the promotion rule for SELECT_CC nodes by upcasting all the parameters and downcasting the result.
The AArch64 target makes use of this rule and, since it was not implemented, in some cases the instruction selector would hit an assertion upon encountering the illegal node.

This patch requires D97840, the included test cases hit both problems.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97859
2021-03-05 18:22:55 +01:00
Stephen Tozer
f677413071 Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
Rewrites test to use correct architecture triple; fixes incorrect
reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour
so as to not be dependent on changes elsewhere in the patch stack.

This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.
2021-03-05 12:32:05 +00:00
Petar Avramovic
d44f61f81c Reland [GlobalISel] Combine zext(trunc x) to x
Recommit 4112299ee761a9b6a309c8ff4a7e75f8c8d8851b. Depends on
4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4 which was reverted.

Combine zext(trunc x) to x when truncated bits are known to be zero.

Differential Revision: https://reviews.llvm.org/D96031
2021-03-05 11:05:37 +01:00
Craig Topper
ad532be012 [SelectionDAG] Assert that operands to SelectionDAG::getNode are not DELETED_NODE to catch issues like PR49393 earlier.
I'm not sure this would catch all such issues, but it would catch some.

The problem for PR49393 was that we were holding a reference to a node that
wasn't connect edto the DAG across a function that could delete unused nodes. In
this particular case we managed to try to use the deleted node while it was in
the deleted state before its memory got recycled.

It could also happen that we delete the node, something allocates a new node
which recycles the memory. Then  we try to use the reference we were holding and
it is now a completely different node with different valid opcode. This patch
would not catch that.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97969
2021-03-04 23:05:32 -08:00
Craig Topper
74e6030bcb [TargetLowering] Use HandleSDNodes to prevent nodes from being deleted by recursive calls in getNegatedExpression.
For binary or ternary ops we call getNegatedExpression multiple
times and then compare costs. While we're doing this we need to
hold a node from the first call across the second call, but its
not yet attached to the DAG. Its possible the second call creates
an identical node and then decides it didn't need it so will try
to delete it if it has no uses. This can cause a reference to the
node we're holding further up the call stack to become invalidated.

To prevent this, we can use a HandleSDNode to artifically give
the node a use without connecting it to the DAG.

I've used a std::list of HandleSDNodes so we can create handles
only when we have a node to hold. HandleSDNode does not have
default constructor and cannot be copied or moved.

Fixes PR49393.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97914
2021-03-04 22:48:25 -08:00
Chen Zheng
87bbf3d1f8 [XCOFF][DebugInfo] support DWARF for XCOFF for assembly output.
Reviewed By: jasonliu

Differential Revision: https://reviews.llvm.org/D95518
2021-03-04 21:07:52 -05:00
Heejin Ahn
561abd83ff [WebAssembly] Disable uses of __clang_call_terminate
Background:

Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses
Itanium-based libraries and ABIs with some modifications.

`__clang_call_terminate` is a wrapper generated in Clang's Itanium C++
ABI implementation. It contains this code, in C-style pseudocode:
```
void __clang_call_terminate(void *exn) {
  __cxa_begin_catch(exn);
  std::terminate();
}
```
So this function is a wrapper to call `__cxa_begin_catch` on the
exception pointer before termination.

In Itanium ABI, this function is called when another exception is thrown
while processing an exception. The pointer for this second, violating
exception is passed as the argument of this `__clang_call_terminate`,
which calls `__cxa_begin_catch` with that pointer and calls
`std::terminate` to terminate the program.

The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch`
says,
```
When the personality routine encounters a termination condition, it
will call __cxa_begin_catch() to mark the exception as handled and then
call terminate(), which shall not return to its caller.
```

In wasm EH's Clang implementation, this function is called from
cleanuppads that terminates the program, which we also call terminate
pads. Cleanuppads normally don't access the thrown exception and the
wasm backend converts them to `catch_all` blocks. But because we need
the exception pointer in this cleanuppad, we generate
`wasm.get.exception` intrinsic (which will eventually be lowered to
`catch` instruction) as we do in the catchpads. But because terminate
pads are cleanup pads and should run even when a foreign exception is
thrown, so what we have been doing is:
1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure
terminate pads are in this simple shape:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
```
2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the
pipeline, we attach a `catch_all` to terminate pads, so they will be in
this form:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
catch_all
call @std::terminate()
unreachable
```
In `catch_all` part, we don't have the exception pointer, so we call
`std::terminate()` directly. The reason we ran HandleEHTerminatePads at
the end of the pipeline, separate from LateEHPrepare, was it was
convenient to assume there was only a single `catch` part per `try`
during CFGSort and CFGStackify.

---

Problem:

While it thinks terminate pads could have been possibly split or calls
to `__clang_call_terminate` could have been duplicated,
`WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate
pads contain no more than calls to `__clang_call_terminate` and
`unreachable` instruction. I assumed that because in LLVM very limited
forms of transformations are done to catchpads and cleanuppads to
maintain the scoping structure. But it turned out to be incorrect;
passes can merge cleanuppads into one, including terminate pads, as long
as the new code has a correct scoping structure. One pass that does this
I observed was `SimplifyCFG`, but there can be more. After this
transformation, a single cleanuppad can contain any number of other
instructions with the call to `__clang_call_terminate` and can span many
BBs. It wouldn't be practical to duplicate all these BBs within the
cleanuppad to generate the equivalent `catch_all` blocks, only with
calls to `__clang_call_terminate` replaced by calls to `std::terminate`.

Unless we do more complicated transformation to split those calls to
`__clang_call_terminate` into a separate cleanuppad, it is tricky to
solve.

---

Solution (?):

This CL just disables the generation and use of `__clang_call_terminate`
and calls `std::terminate()` directly in its place.

The possible downside of this approach can be, because the Itanium ABI
intended to "mark" the violating exception handled, we don't do that
anymore. What `__cxa_begin_catch` actually does is increment the
exception's handler count and decrement the uncaught exception count,
which in my opinion do not matter much given that we are about to
terminate the program anyway. Also it does not affect info like stack
traces that can be possibly shown to developers.

And while we use a variant of Itanium EH ABI, we can make some
deviations if we choose to; we are already different in that in the
current version of the EH spec we don't support two-phase unwinding. We
can possibly consider a more complicated transformation later to
reenable this, but I don't think that has high priority.

Changes in this CL contains:
- In Clang, we don't generate a call to `wasm.get.exception()` intrinsic
  and `__clang_call_terminate` function in terminate pads anymore; we
  simply generate calls to `std::terminate()`, which is the default
  implementation of `CGCXXABI::emitTerminateForUnexpectedException`.
- Remove `WebAssembly::ensureSingleBBTermPads() function and
  `WebAssemblyHandleEHTerminatePads` pass, because terminate pads are
  already `catch_all` now (because they don't need the exception
  pointer) and we don't need these transformations anymore.
- Change tests to use `std::terminate` directly. Also removes tests that
  tested `LateEHPrepare::ensureSingleBBTermPads` and
  `HandleEHTerminatePads` pass.
- Drive-by fix: Add some function attributes to EH intrinsic
  declarations

Fixes https://github.com/emscripten-core/emscripten/issues/13582.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97834
2021-03-04 14:26:35 -08:00
Petar Avramovic
d7834556b7 Reland [GlobalISel] Start using vectors in GISelKnownBits
This is recommit of 4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4.
MIR in one unit test had mismatched types.

For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D96122
2021-03-04 21:47:13 +01:00
Akira Hatanaka
1900503595 [ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of
explicitly emitting retainRV or claimRV calls in the IR

This reapplies ed4718eccb12bd42214ca4fb17d196d49561c0c7, which was reverted
because it was causing a miscompile. The bug that was causing the miscompile
has been fixed in 75805dce5ff874676f3559c069fcd6737838f5c0.

Original commit message:

Background:

This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
  which indicates the call is implicitly followed by a marker
  instruction and an implicit retainRV/claimRV call that consumes the
  call result. In addition, it emits a call to
  @llvm.objc.clang.arc.noop.use, which consumes the call result, to
  prevent the middle-end passes from changing the return type of the
  called function. This is currently done only when the target is arm64
  and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  claimRV is attached to the call since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since the ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if retainRV is attached to the call and
  does nothing if claimRV is attached to it.

- SCCP refrains from replacing the return value of a call with a
  constant value if the call has the operand bundle. This ensures the
  call always has at least one user (the call to
  @llvm.objc.clang.arc.noop.use).

- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
  multiple operand bundles of the same kind were being added to a call.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-03-04 11:22:30 -08:00
Daniel Sanders
9fc2be6f28 [mir] Fix confusing MIR when MMO's value is nullptr but offset is non-zero
:: (store 1 + 4, addrspace 1)
->
:: (store 1 into undef + 4, addrspace 1)

An offset without a base isn't terribly useful but it's convenient to update
the offset without checking the value. For example, when breaking apart
stores into smaller units

Differential Revision: https://reviews.llvm.org/D97812
2021-03-04 10:34:30 -08:00
Philip Reames
6af94d22f7 [cgp] Defer lazy domtree usage to last possible point
This is a compile time optimization for d9e93e8e5. Not sure this matters or not, but why not do it just in case.

This does involve querying TLI with a potentially invalid addressing mode for the using instruction, but since we don't actually pass the using instruction to the TLI callback, that should be fine.
2021-03-04 10:19:45 -08:00
Philip Reames
e0cfd45171 [CGP] Lazily compute domtree only when needed during address matching
This is a compile time optimization for d9e93e8e5.  As pointed out in post dommit review on the original review (D96399), there was a moderately large compile time regression with this patch and the eager computation of domtree on matcher construction is the first obvious candidate for why.
2021-03-04 09:32:57 -08:00
Nico Weber
59beb1ef6d Revert "[GlobalISel] Combine zext(trunc x) to x"
This reverts commit 4112299ee761a9b6a309c8ff4a7e75f8c8d8851b.
Seems to depend on 4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4 which
is being reverted.
2021-03-04 10:13:40 -05:00
Nico Weber
4b1015361c Revert "[GlobalISel] Start using vectors in GISelKnownBits"
This reverts commit 4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4.
Breaks check-llvm everywhere, see https://reviews.llvm.org/D96122
2021-03-04 10:13:40 -05:00
Petar Avramovic
4112299ee7 [GlobalISel] Combine zext(trunc x) to x
Combine zext(trunc x) to x when truncated bits are known to be zero.

Differential Revision: https://reviews.llvm.org/D96031
2021-03-04 15:05:23 +01:00
Petar Avramovic
4c8fb7ddd6 [GlobalISel] Start using vectors in GISelKnownBits
For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D96122
2021-03-04 15:05:23 +01:00
Jann Horn
91c9dee3fb [CodeGenPrepare] Eliminate llvm.expect before removing empty blocks
CodeGenPrepare currently first removes empty blocks, then in a loop
performs other optimizations. One of those optimizations is the removal
of call instructions that invoke @llvm.assume, which can create new
empty blocks.

This means that when a branch only contains a call to __builtin_assume(),
the empty branch will survive into MIR, and will then only be
half-removed by MIR-level optimizations (e.g. removing the branch but
leaving the condition intact).

Fix it by eliminating @llvm.expect builtin calls before removing empty
blocks.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D97848
2021-03-04 14:48:26 +01:00
Simon Pilgrim
7d3d9fe8cd [DAG] TargetLowering::BuildUDIV - use APInt as const ref. NFCI.
Fixes clang-tidy warning.
2021-03-04 12:15:08 +00:00
Stephen Tozer
d2000b45d0 Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
This reverts commit d07f106f4a48b6e941266525b6f7177834d7b74e.
2021-03-04 11:59:21 +00:00
gbtozers
d07f106f4a [DebugInfo] Add new instruction and DIExpression operator for variadic debug values
This patch adds a new instruction that can represent variadic debug values,
DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set
of basic code changes in MachineInstr and a few adjacent areas, but does not
correctly handle variadic debug values outside of these areas, nor does it
generate them at any point.

The new instruction is similar to the existing DBG_VALUE instruction, with the
following differences: the operands are in a different order, any number of
values may be used in the instruction following the Variable and Expression
operands (these are referred to in code as “debug operands”) and are indexed
from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a
DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the
expression.

The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a
DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the
index given by the argument onto the Expression stack. For example the
sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at
index 0 onto the expression stack.”

Differential Revision: https://reviews.llvm.org/D82363
2021-03-04 11:45:35 +00:00
Max Kazantsev
9d5af55589 [X86][CodeGenPrepare] Try to reuse IV's incremented value instead of adding the offset, part 2
This patch enables the case where we do not completely eliminate offset.
Supposedly in this case we reduce live range overlap that never harms, but
since there are doubts this is true, this goes as a separate change.

Differential Revision: https://reviews.llvm.org/D96399
Reviewed By: reames
2021-03-04 16:47:43 +07:00
Max Kazantsev
d9e93e8e57 [X86][CodeGenPrepare] Try to reuse IV's incremented value instead of adding the offset, part 1
While optimizing the memory instruction, we sometimes need to add
offset to the value of `IV`. We could avoid doing so if the `IV.next` is
already defined at the point of interest. In this case, we may get two
possible advantages from this:

- If the `IV` step happens to match with the offset, we don't need to add
  the offset at all;
- We reduce overlap of live ranges of `IV` and `IV.next`. They may stop overlapping
  and it will lead to better register allocation. Even if the overlap will preserve,
  we are not introducing a new overlap, so it should be a neutral transform (Disabled
  this patch, will come with follow-up).

Currently I've only added support for IVs that get decremented using `usub`
intrinsic. We could also support `AddInstr`, however there is some weird
interaction with some other transform that may lead to infinite compilation
in this case (seems like same transform is done and undone over and over).
I need to investigate why it happens, but generally we could do that too.

The first part only handles case where this reuse fully elimiates the offset.

Differential Revision: https://reviews.llvm.org/D96399
Reviewed By: reames
2021-03-04 15:22:55 +07:00
Craig Topper
90b7825598 [LegalizeVectorTypes] Remove a tautological compare. 2021-03-03 23:26:00 -08:00