236 Commits

Author SHA1 Message Date
Freddy Ye
89f36dd8f3 [X86] Add ExpandLargeFpConvert Pass and enable for X86
As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem, which expands
‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions
with a bitwidth above a threshold into auto-generated functions. This is
useful for targets like x86_64 that cannot lower fp convertions with more
than 128 bits. The expanded nodes are referring from the IR generated by
`compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`,
and etc.

Corner cases:
1. For fp16: as there is no related builtins added in compliler-rt. So I
mainly utilized the fp32 <-> fp16 lib calls to implement.
2. For fp80: as this pass is soft fp emulation and no fp80 instructions can
help in this problem. I recommend users to deprecate this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/ext at top/end of the function.
3. For bf16: as clang FE currently doesn't support bf16 algorithm operations
(convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for
now.
4. For unsigned FPToI: since both default hardware behaviors and libgcc are
ignoring "returns 0 for negative input" spec. This pass follows this old way
to ignore unsigned FPToI. See this example:
https://gcc.godbolt.org/z/bnv3jqW1M

The end-to-end tests are uploaded at https://reviews.llvm.org/D138261

Reviewed By: LuoYuanke, mgehre-amd

Differential Revision: https://reviews.llvm.org/D137241
2022-12-01 13:47:43 +08:00
Marco Elver
b95646fe70 Revert "Use-after-return sanitizer binary metadata"
This reverts commit d3c851d3fc8b69dda70bf5f999c5b39dc314dd73.

Some bots broke:

- https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8796062278266465473/overview
- https://lab.llvm.org/buildbot/#/builders/124/builds/5759/steps/7/logs/stdio
2022-11-30 23:35:50 +01:00
Dmitry Vyukov
d3c851d3fc Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-30 14:50:22 +01:00
Dmitry Vyukov
0aedf9d714 Revert "Use-after-return sanitizer binary metadata"
This reverts commit e6aea4a5db09c845276ece92737a6aac97794100.

Broke tests:
https://lab.llvm.org/buildbot/#/builders/16/builds/38992
2022-11-30 09:38:56 +01:00
Dmitry Vyukov
e6aea4a5db Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-30 09:14:19 +01:00
Kazu Hirata
dbb1130966 Revert "Use-after-return sanitizer binary metadata"
This reverts commit a1255dc467f7ce57a966efa76bbbb4ee91d9115a.

This patch results in:

  llvm/lib/CodeGen/SanitizerBinaryMetadata.cpp:57:17: error: no member
  named 'size' in 'llvm::MDTuple'
2022-11-29 09:04:00 -08:00
Dmitry Vyukov
a1255dc467 Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-29 17:37:36 +01:00
Kazu Hirata
d6f0ab47a7 [CodeGen] Use std::optional in TargetPassConfig.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:12:11 -08:00
Matthias Gehre
2090e85fee [llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64
This adds the ExpandLargeDivRem to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion)
X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.

Differential Revision: https://reviews.llvm.org/D130076
2022-09-06 15:32:04 +01:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Luo, Yuanke
5cb0979870 [X86][AMX] Split greedy RA for tile register
When we fill the shape to tile configure memory, the shape is gotten
from AMX pseudo instruction. However the register for the shape may be
split or spilled by greedy RA. That cause we fill the shape to config
memory after ldtilecfg is executed, so that the shape configuration
would be wrong.
This patch is to split the tile register allocation from greedy register
allocation, so that after tile registers are allocated the shape
registers are still virtual register. The shape register only may be
redefined or multi-defined by phi elimination pass, two address pass.
That doesn't affect tile register configuration.

Differential Revision: https://reviews.llvm.org/D128584
2022-06-29 10:35:43 +08:00
Fangrui Song
95a134254a Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 01:07:51 -07:00
Rahman Lavaee
08cc058518 Reland "[Propeller] Promote functions with propeller profiles to .text.hot."
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554.

The major change here is using 'addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.

Differential Revision: https://reviews.llvm.org/D126518
2022-05-26 19:53:14 -07:00
Rahman Lavaee
3aa249329f Revert "[Propeller] Promote functions with propeller profiles to .text.hot."
This reverts commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
2022-05-26 18:45:40 -07:00
Rahman Lavaee
4d8d2580c5 [Propeller] Promote functions with propeller profiles to .text.hot.
Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).

This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.

The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`.

Differential Revision: https://reviews.llvm.org/D122930
2022-05-26 16:23:21 -07:00
Sotiris Apostolakis
ca7c307d18 [SelectOpti][1/5] Setup new select-optimize pass
This is the first commit for the cmov-vs-branch optimization pass.
The goal is to develop a new profile-guided and target-independent cost/benefit analysis
for selecting conditional moves over branches when optimizing for performance.

Initially, this new pass is expected to be enabled only for instrumentation-based PGO.

RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D120230
2022-05-19 16:31:10 +00:00
Momchil Velikov
b4ad28da19 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CGA
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints taht LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, or
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-11 13:27:26 +01:00
Muhammad Omair Javaid
0320115c16 Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.

This commit had failing tests with clang crashing across various
AArch64/Linux buildots.

https://lab.llvm.org/buildbot/#/builders/179/builds/3346

Differential Revision: https://reviews.llvm.org/D114545
2022-04-05 13:12:30 +05:00
Momchil Velikov
980c3e6dd2 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CFG
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints that LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, or
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

In order to accommodate backends which do not generate unwind info in
epilogues we compute an additional property "strong no call frame on
entry" which is set for the entry point of the function and for every
block reachable from the entry along a path that does not execute the
prologue. If this property holds, it takes precedence over the "has a
call frame" property.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-04 14:38:22 +01:00
Julian Lettner
64902d335c Reland "Lower @llvm.global_dtors using __cxa_atexit on MachO"
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-23 18:36:55 -07:00
Zequan Wu
581dc3c729 Revert "Lower @llvm.global_dtors using __cxa_atexit on MachO"
This reverts commit 22570bac694396514fff18dec926558951643fa6.
2022-03-23 16:11:54 -07:00
Julian Lettner
22570bac69 Lower @llvm.global_dtors using __cxa_atexit on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-17 10:47:13 -07:00
Shengchen Kan
ac64d0d230 [NFC][CodeGen] Remove redundant if clause in TargetPassConfig::addPass 2022-03-16 22:14:23 +08:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Simon Pilgrim
7262eacd41 Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower @llvm.global_dtors using __cxa_atexit on MachO"
Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
2022-03-15 13:01:35 +00:00
Julian Lettner
9c542a5a4e Lower @llvm.global_dtors using __cxa_atexit on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Nico Weber
a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille
7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Xiang1 Zhang
c31014322c TLS loads opimization (hoist)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120000
2022-03-10 09:29:06 +08:00
Xiang1 Zhang
65588a0776 Revert "TLS loads opimization (hoist)"
Revert for more reviews

This reverts commit 30e612ebdfb0f243eb63d93487790a53c26ae873.
2022-03-02 14:10:11 +08:00
Xiang1 Zhang
30e612ebdf TLS loads opimization (hoist)
Reviewed By: Wang Pheobe, Topper Craig

Differential Revision: https://reviews.llvm.org/D120000
2022-03-02 10:37:24 +08:00
Rong Xu
52d981a4c1 [SampleFDO] Enable FSAFDO loading passes if --enable-fs-discriminator is enabled
FSAFDO profile loader is currently disabled even --enable-fs-discriminator is enabled.
They need to be turned on by options which makes it cumbersome for experiments.

This patch changes the FSAFDO profile loader enabled by default.  Since they are
guarded by EnableFSDiscriminator, they will only be turned on if
--enable-fs-discriminator is enabled. Note that --enable-fs-discriminator is
still disabled by default.

Differential Revision: https://reviews.llvm.org/D119033
2022-02-05 22:37:09 -08:00
Mircea Trofin
e67430cca4 [MLGO] ML Regalloc Eviction Advisor
The bulk of the implementation is common between 'release' mode (==AOT-ed
model) and 'development' mode (for training), the main difference is
that in development mode, we may also log features (for training logs),
inject scoring information (currently after the Virtual Register
Rewriter) and then produce the log file.

This patch also introduces the score injection pass, 'Register
Allocation Pass Scoring', which is trivially just logging the score in
development mode.

Differential Revision: https://reviews.llvm.org/D117147
2022-01-19 11:00:32 -08:00
Kazu Hirata
d09a284dfb [CodeGen] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2021-12-28 00:38:11 -08:00
Jay Foad
012248b0bc Remove the verifyAfter mechanism that was replaced by D111397
Differential Revision: https://reviews.llvm.org/D111872
2021-10-18 10:26:46 +01:00
Jay Foad
b84d9d299e [TargetPassConfig] Remove an obsolete FIXME comment
The "coloring with register" functionality it refers to was removed ten
years ago in svn r144481 "Remove the -color-ss-with-regs option".
2021-10-08 07:34:25 +01:00
Jay Foad
097339b1ca [TargetPassConfig] Enable machine verification after miscellaneous passes
In a couple of places machine verification was disabled for no apparent
reason, probably just because an "addPass(..., false)" line was cut and
pasted from elsewhere.

After this patch the only remaining place where machine verification is
disabled in the generic TargetPassConfig code, is after addPreEmitPass.
2021-10-07 21:24:50 +01:00
Jay Foad
27c57e791a [TwoAddressInstruction] Enable machine verification after this pass
Differential Revision: https://reviews.llvm.org/D111007
2021-10-07 20:04:51 +01:00
Jay Foad
3ff0a5747d [PHIElimination] Enable machine verification after this pass
Differential Revision: https://reviews.llvm.org/D111006
2021-10-07 20:04:51 +01:00
Jay Foad
0bd4365445 [LiveIntervals] Fix verification of early-clobbered segments
Enable verification of live intervals immediately after computing them
(when -early-live-intervals is used) and fix a problem that that
provokes: currently the verifier insists that a segment that ends at an
early-clobber slot must be followed by another segment starting at the
same slot. But before TwoAddressInstruction runs, the equivalent
condition is: a segment that ends at an early-clobber slot must have its
last use tied to an early-clobber def. That condition is harder to check
here, so for now just disable this check until tied operands have been
rewritten.

Differential Revision: https://reviews.llvm.org/D111065
2021-10-05 08:17:56 +01:00
Jay Foad
31c92d515d [MachineLoopInfo] Enable machine verification after this pass
Enabling this does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.

Differential Revision: https://reviews.llvm.org/D110703
2021-10-01 18:15:57 +01:00
Jay Foad
04787239c9 [LiveVariables] Skip verification of kills inside bundles
LiveVariables does not examine the contents of bundles, so
MachineVerifier should not expect it to know about kill flags on
operands of instructions inside a bundle.

With this fix we can enable machine verification after running the
LiveVariables analysis. Doing this does not show any problems in
check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build.

Differential Revision: https://reviews.llvm.org/D110700
2021-10-01 18:15:57 +01:00
Jay Foad
08d41f75d9 [UnreachableMachineBlockElim] Enable machine verification after this pass
Enabling this does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.

Differential Revision: https://reviews.llvm.org/D110697
2021-10-01 18:15:57 +01:00
Jay Foad
2bfe777a45 [ProcessImplicitDefs] Enable machine verification after this pass
Enabling this does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.

Differential Revision: https://reviews.llvm.org/D110695
2021-10-01 18:15:56 +01:00
Jay Foad
fd8e99700d [DetectDeadLanes] Enable machine verification after this pass
Machine verification after DetectDeadLanes has been disabled since the
pass was first added in D18427, but I guess this was just due to copy-
and-paste. Enabling it does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.

Differential Revision: https://reviews.llvm.org/D110689
2021-10-01 18:15:56 +01:00
Jay Foad
27179b39f9 [RemoveRedundantDebugValues] Enable machine verification after this pass
Machine verification after RemoveRedundantDebugValues has been disabled
since the pass was first added in D105279, but I guess this was just due
to copy-and-paste. Enabling it does not show any problems in check-llvm
in an LLVM_ENABLE_EXPENSIVE_CHECKS build.

Differential Revision: https://reviews.llvm.org/D110688
2021-09-29 10:44:35 +01:00
Hongtao Yu
d9b511d8e8 [CSSPGO] Set PseudoProbeInserter as a default pass.
Currenlty PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e, through the linker or llc), where user has always to pass -pseudo-probe-for-profiling explictly. I'm making the pass a default pass that requires no command line arg to trigger, but will be actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110209
2021-09-22 09:09:48 -07:00
Rong Xu
5fdaaf7fd8 [SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
This patch implements Flow Sensitive Sample FDO (FSAFDO) profile
loader. We have two profile loaders for FS profile,
one before RegAlloc and one before BlockPlacement.

To enable it, when -fprofile-sample-use=<profile> is specified,
add "-enable-fs-discriminator=true \
     -disable-ra-fsprofile-loader=false \
     -disable-layout-fsprofile-loader=false"
to turn on the FS profile loaders.

Differential Revision: https://reviews.llvm.org/D107878
2021-08-18 18:37:35 -07:00
Rong Xu
4c5909ba83 [SampleFDO] Add two passes of MIRAddFSDiscriminatorsPass
This patch adds Pass1 of MIRADDFSDiscriminatorsPass before register
allocation, and Pass2 of MIRAddFSDiscriminatorsPass before
Block-Placement. This is still under --enable-fs-discrmininator
option (default false).

This would reduce the turn-around time for FSAFDO transition.

Differential Revision: https://reviews.llvm.org/D104579
2021-08-11 11:11:04 -07:00
Djordje Todorovic
df686842bc [RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs
This new MIR pass removes redundant DBG_VALUEs.

After the register allocator is done, more precisely, after
the Virtual Register Rewriter, we end up having duplicated
DBG_VALUEs, since some virtual registers are being rewritten
into the same physical register as some of existing DBG_VALUEs.
Each DBG_VALUE should indicate (at least before the LiveDebugValues)
variables assignment, but it is being clobbered for function
parameters during the SelectionDAG since it generates new DBG_VALUEs
after COPY instructions, even though the parameter has no assignment.
For example, if we had a DBG_VALUE $regX as an entry debug value
representing the parameter, and a COPY and after the COPY,
DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets
rewritten into $regX, we'd end up having redundant DBG_VALUE.

This breaks the definition of the DBG_VALUE since some analysis passes
might be built on top of that premise..., and this patch tries to fix
the MIR with the respect to that.

This first patch performs bacward scan, by trying to detect a sequence of
consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one
variable but the last one:

For example:

(1) DBG_VALUE $edi, !"var1", ...
(2) DBG_VALUE $esi, !"var2", ...
(3) DBG_VALUE $edi, !"var1", ...
 ...

in this case, we can remove (1).

By combining the forward scan that will be introduced in the next patch
(from this stack), by inspecting the statistics, the RemoveRedundantDebugValues
removes 15032 instructions by using gdb-7.11 as a testbed.

Differential Revision: https://reviews.llvm.org/D105279
2021-07-14 04:29:42 -07:00