163 Commits

Author SHA1 Message Date
Matt Arsenault
8d0383eb69 CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
2022-07-18 17:23:41 -04:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Nico Weber
a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille
7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Kazu Hirata
3a8c51480f [CodeGen] Use = default (NFC)
Identified with modernize-use-equals-default
2022-02-06 10:54:44 -08:00
serge-sans-paille
ffe8720aa0 Reduce dependencies on llvm/BinaryFormat/Dwarf.h
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.

More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.

This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].

As a consequence of that patch, downstream user may need to manually some extra
files:

llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h

In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.

$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after:   10978519
before:  11245451

Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions

Differential Revision: https://reviews.llvm.org/D118781
2022-02-04 11:44:03 +01:00
John Brawn
94843ea7d7 [AArch64] Make machine combiner patterns preserve MIFlags
This is mainly done so that we don't lose the nofpexcept flag once we
start emitting it.

Differential Revision: https://reviews.llvm.org/D118621
2022-02-03 11:58:59 +00:00
Fangrui Song
a6a07a514b [MachineOutliner] Don't outline functions starting with PATCHABLE_FUNCTION_ENTER/FENTRL_CALL
MachineOutliner may outline a "patchable-function-entry" function whose body has
a TargetOpcode::PATCHABLE_FUNCTION_ENTER MachineInstr. This is incorrect because
the special code sequence must stay unchanged to be used at run-time.
Avoid outlining PATCHABLE_FUNCTION_ENTER. While here, avoid outlining FENTRY_CALL too
(which doesn't reproduce currently) to allow phase ordering flexibility.

Fixes #52635

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D115614
2021-12-13 13:24:29 -08:00
Mircea Trofin
4afae6f7c7 [NFC] Rename MachineFunction::cloneMachineInstrBundle (coding style) 2021-12-08 21:12:54 -08:00
Ties Stuij
f5f28d5b0c [ARM] Implement BTI placement pass for PACBTI-M
This patch implements a new MachineFunction in the ARM backend for
placing BTI instructions. It is similar to the existing AArch64
aarch64-branch-targets pass.

BTI instructions are inserted into basic blocks that:
- Have their address taken
- Are the entry block of a function, if the function has external
  linkage or has its address taken
- Are mentioned in jump tables
- Are exception/cleanup landing pads

Each BTI instructions is placed in the beginning of a BB after the
so-called meta instructions (e.g. exception handler labels).

Each outlining candidate and the outlined function need to be in agreement about
whether BTI placement is enabled or not. If branch target enforcement is
disabled for a function, the outliner should not covertly enable it by emitting
a call to an outlined function, which begins with BTI.

The cost mode of the outliner is adjusted to account for the extra BTI
instructions in the outlined function.

The ARM Constant Islands pass will maintain the count of the jump tables, which
reference a block. A `BTI` instruction is removed from a block only if the
reference count reaches zero.

PAC instructions in entry blocks are replaced with PACBTI instructions (tests
for this case will be added in a later patch because the compiler currently does
not generate PAC instructions).

The ARM Constant Island pass is adjusted to handle BTI
instructions correctly.

Functions with static linkage that don't have their address taken can
still be called indirectly by linker-generated veneers and thus their
entry points need be marked with BTI or PACBTI.

The changes are tested using "LLVM IR -> assembly" tests, jump tables
also have a MIR test. Unfortunately it is not possible add MIR tests
for exception handling and computed gotos because of MIR parser
limitations.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Mikhail Maltsev
- Momchil Velikov
- Ties Stuij

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D112426
2021-12-01 12:54:05 +00:00
Kazu Hirata
259cd6f893 [llvm] Use range-based for loops (NFC) 2021-11-25 22:17:10 -08:00
Kazu Hirata
d243cbf8ea [llvm] Use isa instead of dyn_cast (NFC) 2021-11-14 19:40:46 -08:00
Stanislav Mekhanoshin
08d7eec06e Revert "Allow rematerialization of virtual reg uses"
Reverted due to two distcint performance regression reports.

This reverts commit 92c1fd19abb15bc68b1127a26137a69e033cdb39.
2021-09-24 10:26:11 -07:00
Stanislav Mekhanoshin
92c1fd19ab Allow rematerialization of virtual reg uses
Currently isReallyTriviallyReMaterializableGeneric() implementation
prevents rematerialization on any virtual register use on the grounds
that is not a trivial rematerialization and that we do not want to
extend liveranges.

It appears that LRE logic does not attempt to extend a liverange of
a source register for rematerialization so that is not an issue.
That is checked in the LiveRangeEdit::allUsesAvailableAt().

The only non-trivial aspect of it is accounting for tied-defs which
normally represent a read-modify-write operation and not rematerializable.

The test for a tied-def situation already exists in the
/CodeGen/AMDGPU/remat-vop.mir,
test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve.

The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets
where I more or less understand the asm it seems to reduce spilling
(as expected) or be neutral. However, it needs a review by all targets'
specialists.

Differential Revision: https://reviews.llvm.org/D106408
2021-08-24 11:09:02 -07:00
Petr Hosek
2d4470ab89 Revert "Allow rematerialization of virtual reg uses"
This reverts commit 877572cc193a470f310eec46a7ce793a6cc97c2f which
introduced PR51516.
2021-08-18 00:12:41 -07:00
Stanislav Mekhanoshin
877572cc19 Allow rematerialization of virtual reg uses
Currently isReallyTriviallyReMaterializableGeneric() implementation
prevents rematerialization on any virtual register use on the grounds
that is not a trivial rematerialization and that we do not want to
extend liveranges.

It appears that LRE logic does not attempt to extend a liverange of
a source register for rematerialization so that is not an issue.
That is checked in the LiveRangeEdit::allUsesAvailableAt().

The only non-trivial aspect of it is accounting for tied-defs which
normally represent a read-modify-write operation and not rematerializable.

The test for a tied-def situation already exists in the
/CodeGen/AMDGPU/remat-vop.mir,
test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve.

The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets
where I more or less understand the asm it seems to reduce spilling
(as expected) or be neutral. However, it needs a review by all targets'
specialists.

Differential Revision: https://reviews.llvm.org/D106408
2021-08-16 12:42:42 -07:00
Kazu Hirata
276be84d0a [CodeGen] Remove computeDefOperandLatency (NFC)
The last use was removed on Oct 9, 2016 in commit
5c924d71173afc93aa0f0d115bd445a7496f4294.
2021-08-06 08:26:55 -07:00
Serguei Katkov
0057ec8034 [Statepoint] Factor-out utility function to get non-foldable area of STATEPOINT like instructions. NFC
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99875
2021-04-06 11:44:37 +07:00
Fangrui Song
5d44c92bf8 Change void getNoop(MCInst &NopInst) to MCInst getNop()
Prefer (self-documenting) return values to output parameters (which are
liable to be used).
While here, rename Noop to Nop which is more widely used and improves
consistency with hasEmitNops/setEmitNops/emitNop/etc.
2021-03-15 12:05:34 -07:00
Chen Zheng
4830d458dd [MachineCombiner][NFC] Add MustReduceRegisterPressure goal
add a new goal MustReduceRegisterPressure for machine combiner pass.

PowerPC will use this new goal to do some register pressure related optimization.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D92068
2020-12-14 00:02:42 -05:00
Austin Kerbow
37d907899f [HazardRec] Allow inserting multiple wait-states simultaneously
If a target can encode multiple wait-states into a noop allow emitting such
instructions directly.

Reviewed By: rampitec, dmgreen

Differential Revision: https://reviews.llvm.org/D89753
2020-10-20 17:03:47 -07:00
Denis Antrushin
248a67f144 [Statepoint] Turn assert into check in foldPatchpoint.
Original D81646 had check for tied regs in foldPatchpoint().
Due to unfortunate miscommunication with review comments and
adressing some comments post commit, it turned into assertion.

We had an offline talk and agreed that with current implementation
this path is possible, so I'm changing it back to check.

Note that this is workaround until ussues described in PR46917 are
resolved.
2020-08-28 20:00:23 +07:00
Philip Reames
a96fc4638b Remove deopt and gc transition arguments from gc.statepoint intrinsic
(Forgot to land this a couple of weeks back.)

In a recent series of changes, I've introduced support for using the respective operand bundle kinds on the statepoint. At the moment, code supports either/or, but there's no need to keep the old support around. For the moment, I am simply changing the specification and verifier to require zero length argument sets in the intrinsic.

The intrinsic itself is experimental. Given that, there's no forward serialization needed. The in tree uses and generation have already been updated to use the new operand bundle based forms, the only folks broken by the change will be those with frontends generating statepoints directly and the updates should be easy.

Why not go ahead and just remove the arguments entirely? Well, I plan to. But while working on this I've found that almost all of the arguments to the statepoint can be expressed via operand bundles or attributes. Given that, I'm planning a radical simplification of the arguments and figured I'd do one update not several small ones.

Differential Revision: https://reviews.llvm.org/D80892
2020-08-14 16:07:40 -07:00
Denis Antrushin
d21ce40821 [Statepoints] Operand folding in presense of tied registers.
Implement proper folding of statepoint meta operands (deopt and GC)
when statepoint uses tied registers.
For deopt operands it is just about properly preserving tiedness
in new instruction.
For tied GC operands folding is a little bit more tricky.
We can fold tied GC operands only from InlineSpiller, because it knows
how to properly reload tied def after it was turned into memory operand.
Other users (e.g. peephole) cannot properly fold such operands as they
do not know how (or when) to reload them from memory.
We do this by un-tieing operand we want to fold in InlineSpiller
and allowing to fold only untied operands in foldPatchpoint.
2020-08-05 20:18:28 +07:00
James Y Knight
4b0aa5724f Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.

Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.

Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.

Reviewed By: nickdesaulniers, void

Differential Revision: https://reviews.llvm.org/D79794
2020-07-01 12:51:50 -04:00
Wang, Pengfei
6eb9eae010 [MS] Copy the symbols assigned to the former instruction when memory folding.
The memory folding raplaced the old instruction without copying the symbols assigned. Which will resulted in built fail due to the lost symbols.

Reviewed by craig.topper

Differential Revision: https://reviews.llvm.org/D78471
2020-06-10 15:38:32 +08:00
hsmahesha
0ed2c04636 [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes
Summary:
While clustering mem ops, AMDGPU target needs to consider number of clustered bytes
to decide on max number of mem ops that can be clustered. This patch adds support to pass
number of clustered bytes to target mem ops clustering logic.

Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar

Reviewed By: foad

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80545
2020-06-01 22:52:34 +05:30
Sam McCall
d10c995b4d std::isspace -> llvm::isSpace (where locale should be ignored)
I've left out some cases where I wasn't totally sure this was right or
whether the include was ok (compiler-rt) or idiomatic (flang).
2020-05-02 15:36:04 +02:00
Konstantin Schwarz
1a3e89aa2b [MIR] Add comments to INLINEASM immediate flag MachineOperands
Summary:
The INLINEASM MIR instructions use immediate operands to encode the values of some operands.
The MachineInstr pretty printer function already handles those operands and prints human readable annotations instead of the immediates. This patch adds similar annotations to the output of the MIRPrinter, however uses the new MIROperandComment feature.

Reviewers: SjoerdMeijer, arsenm, efriedma

Reviewed By: arsenm

Subscribers: qcolombet, sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78088
2020-04-16 13:46:14 +02:00
Matt Arsenault
30ebafaa56 CodeGen: Convert some TII hooks to use Register 2020-04-03 14:52:54 -04:00
Guillaume Chatelet
b9810988b2 [Alignment][NFC] Transitionning more getMachineMemOperand call sites
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77127
2020-03-31 11:04:10 +00:00
Vedant Kumar
0368b42295 [entry values] ARM: Add a describeLoadedValue override (PR45025)
As a narrow stopgap for the assertion failure described in PR45025, add
a describeLoadedValue override to ARMBaseInstrInfo and use it to detect
copies in which the forwarding reg is a super/sub reg of the copy
destination. For the moment this is unsupported.

Several follow ups are possible:

1) Handle VORRq. At the moment, we do not, because isCopyInstrImpl
   returns early when !MI.isMoveReg().

2) In the case where forwarding reg is a super-reg of the copy
   destination, we should be able to describe the forwarding reg as a
   subreg within the copy destination. I'm not 100% sure about this, but
   it looks like that's what's done in AArch64InstrInfo.

3) In the case where the forwarding reg is a sub-reg of the copy
   destination, maybe we could describe the forwarding reg using the
   copy destinaion and a DW_OP_LLVM_fragment (I guess this should be
   possible after D75036).

https://bugs.llvm.org/show_bug.cgi?id=45025
rdar://59772698

Differential Revision: https://reviews.llvm.org/D75273
2020-02-28 14:30:40 -08:00
Sanjay Patel
90fd859f51 [x86] use instruction-level fast-math-flags to drive MachineCombiner
The code changes here are hopefully straightforward:

1. Use MachineInstruction flags to decide if FP ops can be reassociated
   (use both "reassoc" and "nsz" to be consistent with IR transforms;
   we probably don't need "nsz", but that's a safer interpretation of
   the FMF).
2. Check that both nodes allow reassociation to change instructions.
   This is a stronger requirement than we've usually implemented in
   IR/DAG, but this is needed to solve the motivating bug (see below),
   and it seems unlikely to impede optimization at this late stage.
3. Intersect/propagate MachineIR flags to enable further reassociation
   in MachineCombiner.

We managed to make MachineCombiner flexible enough that no changes are
needed to that pass itself. So this patch should only affect x86
(assuming no other targets have implemented the hooks using MachineIR
flags yet).

The motivating example in PR43609 is another case of fast-math transforms
interacting badly with special FP ops created during lowering:
https://bugs.llvm.org/show_bug.cgi?id=43609
The special fadd ops used for converting int to FP assume that they will
not be altered, so those are created without FMF.

However, the MachineCombiner pass was being enabled for FP ops using the
global/function-level TargetOption for "UnsafeFPMath". We managed to run
instruction/node-level FMF all the way down to MachineIR sometime in the
last 1-2 years though, so we can do better now.

The test diffs require some explanation:

1. llvm/test/CodeGen/X86/fmf-flags.ll - no target option for unsafe math was
   specified here, so MachineCombiner kicks in where it did not previously;
   to make it behave consistently, we need to specify a CPU schedule model,
   so use the default model, and there are no code diffs.
2. llvm/test/CodeGen/X86/machine-combiner.ll - replace the target option for
   unsafe math with the equivalent IR-level flags, and there are no code diffs;
   we can't remove the NaN/nsz options because those are still used to drive
   x86 fmin/fmax codegen (special SDAG opcodes).
3. llvm/test/CodeGen/X86/pow.ll - similar to #1
4. llvm/test/CodeGen/X86/sqrt-fastmath.ll - similar to #1, but MachineCombiner
   does some reassociation of the estimate sequence ops; presumably these are
   perf wins based on latency/throughput (and we get some reduction of move
   instructions too); I'm not sure how it affects numerical accuracy, but the
   test reflects reality better now because we would expect MachineCombiner to
   be enabled if the IR was generated via something like "-ffast-math" with clang.
5. llvm/test/CodeGen/X86/vec_int_to_fp.ll - this is the test added to model PR43609;
   the fadds are not reassociated now, so we should get the expected results.
6. llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll - similar to #1
7. llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll - similar to #1

Differential Revision: https://reviews.llvm.org/D74851
2020-02-27 15:19:37 -05:00
Djordje Todorovic
016d91ccbd [CallSiteInfo] Handle bundles when updating call site info
This will address the issue: P8198 and P8199 (from D73534).

The methods was not handle bundles properly.

Differential Revision: https://reviews.llvm.org/D74904
2020-02-27 13:57:06 +01:00
Sander de Smalen
8fbc925807 Add OffsetIsScalable to getMemOperandWithOffset
Summary:
Making `Scale` a `TypeSize` in AArch64InstrInfo::getMemOpInfo,
has the effect that all places where this information is used
(notably, TargetInstrInfo::getMemOperandWithOffset) will need
to consider Scale - and derived, Offset - possibly being scalable.

This patch adds a new operand `bool &OffsetIsScalable` to
TargetInstrInfo::getMemOperandWithOffset and fixes up all
the places where this function is used, to consider the
offset possibly being scalable.

In most cases, this means bailing out because the algorithm does not
(or cannot) support scalable offsets in places where it does some
form of alias checking for example.

Reviewers: rovka, efriedma, kristof.beyls

Reviewed By: efriedma

Subscribers: wuzish, kerbowa, MatzeB, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, javed.absar, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72758
2020-02-18 15:53:29 +00:00
Djordje Todorovic
68908993eb [CSInfo] Use isCandidateForCallSiteEntry() when updating the CSInfo
Use the isCandidateForCallSiteEntry().
This should mostly be an NFC, but there are some parts ensuring
the moveCallSiteInfo() and copyCallSiteInfo() operate with call site
entry candidates (both Src and Dest should be the call site entry
candidates).

Differential Revision: https://reviews.llvm.org/D74122
2020-02-10 10:03:14 +01:00
Djordje Todorovic
de90d73e03 [DebugInfo] Avoid the call site param for mem instrs with multiple defs
We currently only handle mem instructions with a single define.
Avoid the call site parameter debug info when we find the case with
multiple defs, rather than throwing an assert.

Differential Revision: https://reviews.llvm.org/D73954
2020-02-05 10:03:14 +01:00
Jay Foad
e0f0d0e55c [MachineScheduler] Allow clustering mem ops with complex addresses
The generic BaseMemOpClusterMutation calls into TargetInstrInfo to
analyze the address of each load/store instruction, and again to decide
whether two instructions should be clustered. Previously this had to
represent each address as a single base operand plus a constant byte
offset. This patch extends it to support any number of base operands.

The old target hook getMemOperandWithOffset is now a convenience
function for callers that are only prepared to handle a single base
operand. It calls the new more general target hook
getMemOperandsWithOffset.

The only requirements for the base operands returned by
getMemOperandsWithOffset are:
- they can be sorted by MemOpInfo::Compare, such that clusterable ops
  get sorted next to each other, and
- shouldClusterMemOps knows what they mean.

One simple follow-on is to enable clustering of AMDGPU FLAT instructions
with both vaddr and saddr (base register + offset register). I've left
a FIXME in the code for this case.

Differential Revision: https://reviews.llvm.org/D71655
2020-01-22 14:28:24 +00:00
David Green
b891490ceb [Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use ScheduleDAGMI. NFC
All the callers of this function will be ScheduleDAGMI from the
MachineScheduler. This allows us to use the extra info available in
ScheduleDAGMI without resorting to awkward casts.
2020-01-15 07:21:44 +00:00
David Green
90555d9253 [Scheduler] Remove superfluous casts. NFC 2020-01-13 16:34:13 +00:00
David Stenberg
6965f835b4 [DebugInfo] Make describeLoadedValue() reg aware
Summary:
Currently the describeLoadedValue() hook is assumed to describe the
value of the instruction's first explicit define. The hook will not be
called for instructions with more than one explicit define.

This commit adds a register parameter to the describeLoadedValue() hook,
and invokes the hook for all registers in the worklist.

This will allow us to for example describe instructions which produce
more than two parameters' values; e.g. Hexagon's various combine
instructions.

This also fixes situations in our downstream target where we may pass
smaller parameters in the high part of a register. If such a parameter's
value is produced by a larger copy instruction, we can't describe the
call site value using the super-register, and we instead need to know
which sub-register that should be used.

This also allows us to handle cases like this:

  $ebx = [...]
  $rdi = MOVSX64rr32 $ebx
  $esi = MOV32rr $edi
  CALL64pcrel32 @call

The hook will first be invoked for the MOV32rr instruction, which will
say that @call's second parameter (passed in $esi) is described by $edi.
As $edi is not preserved it will be added to the worklist. When we get
to the MOVSX64rr32 instruction, we need to describe two values; the
sign-extended value of $ebx -> $rdi for the first parameter, and $ebx ->
$edi for the second parameter, which is now possible.

This commit modifies the dbgcall-site-lea-interpretation.mir test case.
In the test case, the values of some 32-bit parameters were produced
with LEA64r. Perhaps we can in general cases handle such by emitting
expressions that AND out the lower 32-bits, but I have not been able to
land in a case where a LEA64r is used for a 32-bit parameter instead of
LEA64_32 from C code.

I have not found a case where it would be useful to describe parameters
using implicit defines, so in this patch the hook is still only invoked
for explicit defines of forwarding registers.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: djtodoro, vsk

Subscribers: ormris, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D70431
2019-12-09 10:47:49 +01:00
David Stenberg
f3696533f2 Revert "[DebugInfo] Make describeLoadedValue() reg aware"
This reverts commit 3cd93a4efcdeabeb20cb7bec9fbddcb540d337a1.
I'll recommit with a well-formatted arcanist commit message.
2019-12-09 10:45:13 +01:00
David Stenberg
3cd93a4efc [DebugInfo] Make describeLoadedValue() reg aware
Currently the describeLoadedValue() hook is assumed to describe the
value of the instruction's first explicit define. The hook will not be
called for instructions with more than one explicit define.

This commit adds a register parameter to the describeLoadedValue() hook,
and invokes the hook for all registers in the worklist.

This will allow us to for example describe instructions which produce
more than two parameters' values; e.g. Hexagon's various combine
instructions.

This also fixes a case in our downstream target where we may pass
smaller parameters in the high part of a register. If such a parameter's
value is produced by a larger copy instruction, we can't describe the
call site value using the super-register, and we instead need to know
which sub-register that should be used.

This also allows us to handle cases like this:

  $ebx = [...]
  $rdi = MOVSX64rr32 $ebx
  $esi = MOV32rr $edi
  CALL64pcrel32 @call

The hook will first be invoked for the MOV32rr instruction, which will
say that @call's second parameter (passed in $esi) is described by $edi.
As $edi is not preserved it will be added to the worklist. When we get
to the MOVSX64rr32 instruction, we need to describe two values; the
sign-extended value of $ebx -> $rdi for the first parameter, and $ebx ->
$edi for the second parameter, which is now possible.

This commit modifies the dbgcall-site-lea-interpretation.mir test case.
In the test case, the values of some 32-bit parameters were produced
with LEA64r. Perhaps we can in general cases handle such by emitting
expressions that AND out the lower 32-bits, but I have not been able to
land in a case where a LEA64r is used for a 32-bit parameter instead of
LEA64_32 from C code.

I have not found a case where it would be useful to describe parameters
using implicit defines, so in this patch the hook is still only invoked
for explicit defines of forwarding registers.
2019-12-09 10:44:17 +01:00
Vedant Kumar
ba71ca3720 [DebugInfo] Describe size of spilled values in call site params
A call site parameter description of a memory operand needs to
unambiguously convey the size of the operand to prevent incorrect entry
value evaluation.

Thanks for David Stenberg for pointing this issue out!
2019-11-19 12:03:52 -08:00
Vedant Kumar
1ee84e5ab2 [DebugInfo] Allow spill slots in call site parameter descriptions
Allow call site paramter descriptions to reference spill slots. Spill
slots are not visible to high-level LLVM IR, so they can safely be
referenced during entry value evaluation (as they cannot be clobbered by
some other function).

This gives a 5% increase in the number of call site parameter DIEs in an
LTO x86_64 build of the xnu kernel.

This reverts commit eb4c98ca3d2590bad9f6542afbf3a7824d2b53fa (
[DebugInfo] Exclude memory location values as parameter entry values),
effectively reintroducing the portion of D60716 which dealt with memory
locations (authored by Djordje, Nikola, Ananth, and Ivan).

This partially addresses llvm.org/PR43343. However, not all memory
operands forwarded to callees live in spill slots. In the xnu build, it
may be possible to use an escape analysis to increase the number of call
site parameter by another 15% (more details in PR43343).

Differential Revision: https://reviews.llvm.org/D70254
2019-11-14 12:48:51 -08:00
Djordje Todorovic
8d2ccd1ac3 Reland: [TII] Use optional destination and source pair as a return value; NFC
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.

Patch by Nikola Prica

Differential Revision: https://reviews.llvm.org/D69622
2019-11-08 13:00:39 +01:00
Simon Pilgrim
3842b94c4e Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC
This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375
2019-10-31 18:00:29 +00:00
Djordje Todorovic
57ee0435bd [TII] Use optional destination and source pair as a return value; NFC
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.

Patch by Nikola Prica

Differential Revision: https://reviews.llvm.org/D69622
2019-10-31 15:34:49 +01:00
Djordje Todorovic
532815dd5c [ARM][AArch64][DebugInfo] Improve call site instruction interpretation
Extend the describeLoadedValue() with support for target specific ARM and
AArch64 instructions interpretation. The patch provides specialization for
ADD and SUB operations that include a register and an immediate/offset
operand. Some of the instructions can operate with global string addresses
or constant pool indexes but such cases are omitted since we currently lack
flexible support for processing such operands at DWARF production stage.

Patch by Nikola Prica

Differential Revision: https://reviews.llvm.org/D67556
2019-10-30 13:58:14 +01:00
David Stenberg
74a72e6848 [DebugInfo] Stop describing imms in TargetInstrInfo's describeLoadedValue() impl
Summary:
The default implementation of the describeLoadedValue() hook uses the
MoveImm property to determine if an instruction moves an immediate. If
an instruction has that property the function returns the second
operand, assuming that that is the immediate value the instruction
moves. As far as I can tell, the MoveImm property does not imply that
the second operand is the immediate value, nor that any other operand
necessarily holds the immediate value; it just means that the
instruction moves some immediate value.

One example where the second operand is not the immediate is SystemZ's
LZER instruction, which moves a zero immediate implicitly: $f0S = LZER.

That case triggered an out-of-bound assertion when getting the operand.
I have added a test case for that instruction.

Another example is ARM's MVN instruction, which holds the logical
bitwise NOT'd value of the immediate that is moved. For the following
reproducer:

  extern void foo(int);
  int main() { foo(-11); }

an incorrect call site value would be emitted:

  $ clang --target=arm foo.c -O1 -g -Xclang -femit-debug-entry-values \
      -c -o - | ./build/bin/llvm-dwarfdump  - | \
      grep -A2 call_site_parameter

  0x00000058:       DW_TAG_GNU_call_site_parameter
                      DW_AT_location (DW_OP_reg0 R0)
                      DW_AT_GNU_call_site_value (DW_OP_lit10)

Another example is the A2_combineii instruction on Hexagon which moves
two immediates to a super-register: $d0 = A2_combineii 20, 10.

Perhaps these are rare exceptions, and most MoveImm instructions hold
the immediate in the second operand, but in my opinion the default
implementation of the hook should only describe values that it can, by
some contract, guarantee are safe to describe, rather than leaving it up
to the targets to override the exceptions, as that can silently result
in incorrect call site values.

This patch adds X86's relevant move immediate instructions to the
target's hook implementation, so this commit should be a NFC for that
target. We need to do the same for ARM and AArch64.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: vsk

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D69109
2019-10-23 11:41:29 +02:00