//===-- CodeGen.cpp -------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the common initialization routines for the
// CodeGen library.
//
//===----------------------------------------------------------------------===//
#include "llvm/InitializePasses.h"
#include "llvm/PassRegistry.h"
using namespace llvm;
/// initializeCodeGen - Initialize all passes linked into the CodeGen library.
void llvm::initializeCodeGen(PassRegistry &Registry) {
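// AssignmentTrackingAnalysis converts assignment-tracking metadata into a
// mapping from instruction positions to variable location definitions that
// SelectionDAG consumes when assignment tracking is enabled.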
initializeAssignmentTrackingAnalysisPass(Registry);
initializeAtomicExpandLegacyPass(Registry);
initializeBasicBlockPathCloningPass(Registry);
initializeBasicBlockSectionsPass(Registry);
initializeBranchFolderPassPass(Registry);
initializeBranchRelaxationPass(Registry);
initializeBreakFalseDepsPass(Registry);
initializeCallBrPreparePass(Registry);
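// CFGuardLongjmp identifies and emits valid longjmp targets, as required by
// Windows Control Flow Guard (/guard:cf).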
initializeCFGuardLongjmpPass(Registry);
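// CFIFixup inserts compensating CFI instructions (typically
// .cfi_remember_state/.cfi_restore_state) where the call-frame state at the
// start of a block differs from the state at the end of its layout
// predecessor.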
initializeCFIFixupPass(Registry);
initializeCFIInstrInserterPass(Registry);
initializeCheckDebugMachineModulePass(Registry);
initializeCodeGenPrepareLegacyPassPass(Registry);
initializeDeadMachineInstructionElimPass(Registry);
initializeDebugifyMachineModulePass(Registry);
initializeDetectDeadLanesPass(Registry);
initializeDwarfEHPrepareLegacyPassPass(Registry);
initializeEarlyIfConverterLegacyPass(Registry);
initializeEarlyIfPredicatorPass(Registry);
initializeEarlyMachineLICMPass(Registry);
initializeEarlyTailDuplicateLegacyPass(Registry);
initializeExpandLargeDivRemLegacyPassPass(Registry);
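// ExpandLargeFpConvert expands fptoui/fptosi/uitofp/sitofp instructions whose
// bit width exceeds a target threshold into auto-generated functions, for
// targets such as x86_64 that cannot lower very wide FP/integer conversions.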
initializeExpandLargeFpConvertLegacyPassPass(Registry);
initializeExpandMemCmpLegacyPassPass(Registry);
initializeExpandPostRAPass(Registry);
initializeFEntryInserterPass(Registry);
initializeFinalizeISelPass(Registry);
initializeFinalizeMachineBundlesPass(Registry);
initializeFixupStatepointCallerSavedPass(Registry);
initializeFuncletLayoutPass(Registry);
initializeGCMachineCodeAnalysisPass(Registry);
initializeGCModuleInfoPass(Registry);
initializeHardwareLoopsLegacyPass(Registry);
initializeIfConverterPass(Registry);
initializeImplicitNullChecksPass(Registry);
initializeIndirectBrExpandLegacyPassPass(Registry);
initializeInitUndefPass(Registry);
initializeInterleavedLoadCombinePass(Registry);
initializeInterleavedAccessPass(Registry);
initializeJMCInstrumenterPass(Registry);
initializeLiveDebugValuesPass(Registry);
initializeLiveDebugVariablesPass(Registry);
initializeLiveIntervalsWrapperPassPass(Registry);
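// LiveRangeShrink shortens live ranges within a basic block by moving
// instructions closer to the definitions of their operands.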
initializeLiveRangeShrinkPass(Registry);
initializeLiveStacksPass(Registry);
initializeLiveVariablesWrapperPassPass(Registry);
initializeLocalStackSlotPassPass(Registry);
initializeLowerGlobalDtorsLegacyPassPass(Registry);
initializeLowerIntrinsicsPass(Registry);
initializeMIRAddFSDiscriminatorsPass(Registry);
initializeMIRCanonicalizerPass(Registry);
initializeMIRNamerPass(Registry);
initializeMIRProfileLoaderPassPass(Registry);
initializeMachineBlockFrequencyInfoWrapperPassPass(Registry);
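// MachineBlockPlacement lays out basic blocks using the branch probability
// and block frequency analyses, building chains bottom-up to find a
// profitable ordering.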
initializeMachineBlockPlacementPass(Registry);
initializeMachineBlockPlacementStatsPass(Registry);
initializeMachineCFGPrinterPass(Registry);
initializeMachineCSELegacyPass(Registry);
initializeMachineCombinerPass(Registry);
initializeMachineCopyPropagationPass(Registry);
initializeMachineCycleInfoPrinterPassPass(Registry);
initializeMachineCycleInfoWrapperPassPass(Registry);
initializeMachineDominatorTreeWrapperPassPass(Registry);
initializeMachineFunctionPrinterPassPass(Registry);
initializeMachineLateInstrsCleanupPass(Registry);
initializeMachineLICMPass(Registry);
initializeMachineLoopInfoWrapperPassPass(Registry);
initializeMachineModuleInfoWrapperPassPass(Registry);
initializeMachineOptimizationRemarkEmitterPassPass(Registry);
initializeMachineOutlinerPass(Registry);
initializeMachinePipelinerPass(Registry);
initializeMachineSanitizerBinaryMetadataPass(Registry);
initializeModuloScheduleTestPass(Registry);
initializeMachinePostDominatorTreeWrapperPassPass(Registry);
initializeMachineRegionInfoPassPass(Registry);
initializeMachineSchedulerPass(Registry);
initializeMachineSinkingPass(Registry);
initializeMachineUniformityAnalysisPassPass(Registry);
initializeMachineUniformityInfoPrinterPassPass(Registry);
initializeMachineVerifierLegacyPassPass(Registry);
initializeObjCARCContractLegacyPassPass(Registry);
initializeOptimizePHIsLegacyPass(Registry);
initializePEIPass(Registry);
initializePHIEliminationPass(Registry);
initializePatchableFunctionPass(Registry);
initializePeepholeOptimizerPass(Registry);
initializePostMachineSchedulerPass(Registry);
initializePostRAHazardRecognizerPass(Registry);
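// PostRAMachineSinking sinks COPY instructions that are unused in their
// defining block into the single successor where they are live-in, exposing
// further dead-copy elimination and shrink-wrapping opportunities.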
initializePostRAMachineSinkingPass(Registry);
initializePostRASchedulerPass(Registry);
initializePreISelIntrinsicLoweringLegacyPassPass(Registry);
initializeProcessImplicitDefsPass(Registry);
initializeRABasicPass(Registry);
initializeRAGreedyPass(Registry);
initializeRegAllocFastPass(Registry);
initializeRegUsageInfoCollectorPass(Registry);
initializeRegUsageInfoPropagationPass(Registry);
initializeRegisterCoalescerPass(Registry);
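// RemoveLoadsIntoFakeUses is part of the llvm.fake.use / FAKE_USE support,
// which keeps values alive for improved debuggability (-fextend-lifetimes)
// without otherwise affecting the generated code.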
initializeRemoveLoadsIntoFakeUsesPass(Registry);
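// RemoveRedundantDebugValues deletes DBG_VALUEs that become redundant after
// register allocation, e.g. consecutive DBG_VALUEs describing the same
// variable.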
initializeRemoveRedundantDebugValuesPass(Registry);
initializeRenameIndependentSubregsPass(Registry);
initializeSafeStackLegacyPassPass(Registry);
initializeSelectOptimizePass(Registry);
initializeShadowStackGCLoweringPass(Registry);
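// ShrinkWrap computes prologue/epilogue save and restore points that are
// cheaper than the entry and exit blocks; the PrologEpilogInserter then emits
// the frame setup code at those points.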
initializeShrinkWrapPass(Registry);
initializeSjLjEHPreparePass(Registry);
initializeSlotIndexesWrapperPassPass(Registry);
initializeStackColoringLegacyPass(Registry);
initializeStackFrameLayoutAnalysisPassPass(Registry);
initializeStackMapLivenessPass(Registry);
initializeStackProtectorPass(Registry);
initializeStackSlotColoringPass(Registry);
initializeStripDebugMachineModulePass(Registry);
initializeTailDuplicateLegacyPass(Registry);
initializeTargetPassConfigPass(Registry);
initializeTwoAddressInstructionLegacyPassPass(Registry);
initializeTypePromotionLegacyPass(Registry);
initializeUnpackMachineBundlesPass(Registry);
initializeUnreachableBlockElimLegacyPassPass(Registry);
initializeUnreachableMachineBlockElimPass(Registry);
initializeVirtRegMapWrapperLegacyPass(Registry);
initializeVirtRegRewriterPass(Registry);
initializeWasmEHPreparePass(Registry);
initializeWinEHPreparePass(Registry);
initializeXRayInstrumentationPass(Registry);
}
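
// Usage sketch (illustrative only, not part of this file): a tool linking
// libLLVMCodeGen and constructing legacy passes directly would typically
// register them once at startup, for example:
//
//   llvm::PassRegistry &Registry = *llvm::PassRegistry::getPassRegistry();
//   llvm::initializeCore(Registry);
//   llvm::initializeCodeGen(Registry);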