Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D89423
This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:
| Address of the function | -> 8 bytes (pointer size)
| Number of basic blocks in this function (>0) | -> ULEB128
The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:
| Offset of the basic block relative to function begin | -> ULEB128
| Binary size of the basic block | -> ULEB128
| BB metadata | -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ]
The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.
For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html
Reviewed By: MaskRay, snehasish
Differential Revision: https://reviews.llvm.org/D85408
This patch changes the functionality of AsmPrinter to name the basic block end labels as LBB_END${i}_${j}, with ${i} being the identifier for the function and ${j} being the identifier for the basic block. The new naming scheme is consistent with how basic block labels are named (.LBB${i}_{j}), and how function end symbol are named (.Lfunc_end${i}) and helps to write stronger tests for the upcoming patch for BB-Info section (as proposed in https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). The end label is used with basicblock-labels (BB-Info section in future) and basicblock-sections to compute the size of basic blocks and basic block sections, respectively. For BB sections, the section containing the entry basic block will not have a BB end label since it already gets the function end-label.
This label is cached for every basic block (CachedEndMCSymbol) like the label for the basic block (CachedMCSymbol).
Differential Revision: https://reviews.llvm.org/D83885
Common up some existing MBB name printing logic into a single place.
Note that basic block dumping now prints the same set of attributes as
the MIRPrinter.
Change-Id: I8f022bbd922e831bc96d63143d7472c03282530b
Differential Revision: https://reviews.llvm.org/D83253
Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.
Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.
Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.
Reviewed By: nickdesaulniers, void
Differential Revision: https://reviews.llvm.org/D79794
Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky propspect, given the
existence of successors that occur mid-block, such as invoke, and
potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in
particular would be problematic, because its successor blocks are not
distinct from "normal" successors, as EHPads are.)
Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.
Unfortunately, the goal of decoupling the behavior of end-of-block
jumps from the successor list has not been fully accomplished in this
patch, as there is currently no other way to determine whether a block
is intended to fall-through, or end as unreachable. Further work is
needed there.
Differential Revision: https://reviews.llvm.org/D79605
CorrectExtraCFGEdges function.
The latter was a workaround for "Various pieces of code" leaving bogus
extra CFG edges in place. Where by "various" it meant only
IfConverter::MergeBlocks, which failed to clear all of the successors
of dead blocks it emptied out. This wouldn't matter a whole lot,
except that the dead blocks remained listed as predecessors of
still-useful blocks, inhibiting optimizations.
This fix slightly changed two thumb tests, because the correct CFG
successors allowed for the "diamond" if-conversion pattern to be
detected, when it could only use "simple" before.
Additionally, the removal of a now-redundant call to analyzeBranch
(with AllowModify=true) in BranchFolder::OptimizeFunction caused a
later check for an empty block in BranchFolder::OptimizeBlock to
fail. Correct this by moving the call to analyzeBranch in
OptimizeBlock higher.
Differential Revision: https://reviews.llvm.org/D79527
Today symbol names generated for machine basic block sections use a
unary encoding to reduce bloat. This is essential when every basic block
in the binary is assigned a symbol however with basic block clusters
(rG05192e585ce175b55f2a26b83b4ed7882785c8e6) when we only need to
generate a few non-temporary symbols we can assign more descriptive
names making them more user friendly. With this change -
Cold cluster section for function foo is named "foo.cold"
Exception cluster section for function foo is named "foo.eh"
Other cluster sections identified by their ids are named "foo.ID"
Using this format works well with existing tools. It will demangle as
expected and works with existing symbolizers, profilers and debuggers
out of the box.
$ c++filt _Z3foov.cold
foo() [clone .cold]
$ c++filt _Z3foov.eh
foo() [clone .eh]
$c++filt _Z3foov.1234
foo() [clone 1234]
Tests for basicblock-sections are updated with some cleanup where
appropriate.
Differential Revision: https://reviews.llvm.org/D79221
Summary:
These helpers are exercised by follow-up commits in this patch series,
which is all about removing CodeGen differences with vs. without debug
info in the AArch64 backend.
Reviewers: fhahn, aprantl, jpaquette, paquette
Subscribers: kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78260
in the same section.
This allows specifying BasicBlock clusters like the following example:
!foo
!!0 1 2
!!4
This places basic blocks 0, 1, and 2 in one section in this order, and
places basic block #4 in a single section of its own.
This is the second patch in a series of patches to enable basic block
sections support.
This patch adds support for:
* Creating direct jumps at the end of basic blocks that have fall
through instructions.
* New pass, bbsections-prepare, that analyzes placement of basic blocks
in sections.
* Actual placing of a basic block in a unique section with special
handling of exception handling blocks.
* Supports placing a subset of basic blocks in a unique section.
* Support for MIR serialization and deserialization with basic block
sections.
Parent patch : D68063
Differential Revision: https://reviews.llvm.org/D73674
Summary:
Terminators in LLVM aren't prohibited from returning values. This means that
the "callbr" instruction, which is used for "asm goto", can support "asm goto
with outputs."
This patch removes all restrictions against "callbr" returning values. The
heavy lifting is done by the code generator. The "INLINEASM_BR" instruction's
a terminator, and the code generator doesn't allow non-terminator instructions
after a terminator. In order to correctly model the feature, we need to copy
outputs from "INLINEASM_BR" into virtual registers. Of course, those copies
aren't terminators.
To get around this issue, we split the block containing the "INLINEASM_BR"
right before the "COPY" instructions. This results in two cheats:
- Any physical registers defined by "INLINEASM_BR" need to be marked as
live-in into the block with the "COPY" instructions. This violates an
assumption that physical registers aren't marked as "live-in" until after
register allocation. But it seems as if the live-in information only
needs to be correct after register allocation. So we're able to get away
with this.
- The indirect branches from the "INLINEASM_BR" are moved to the "COPY"
block. This is to satisfy PHI nodes.
I've been told that MLIR can support this handily, but until we're able to
use it, we'll have to stick with the above.
Reviewers: jyknight, nickdesaulniers, hfinkel, MaskRay, lattner
Reviewed By: nickdesaulniers, MaskRay, lattner
Subscribers: rriddle, qcolombet, jdoerfert, MatzeB, echristo, MaskRay, xbolva00, aaron.ballman, cfe-commits, JonChesterfield, hiraditya, llvm-commits, rnk, craig.topper
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D69868
This is a compile-time optimization for PHIElimination (splitting of critical
edges), which was reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As
discussed there, the way to remedy the slowdowns with huge functions is to
pre-compute the live-in registers for each MBB in an efficient way in
PHIElimination.cpp and then pass that information along to
LiveVariabless::addNewBlock().
In all the huge test programs where this slowdown has been noticable, it has
dissapeared entirely with this patch.
Review: Björn Pettersson, Quentin Colombet.
Differential Revision: https://reviews.llvm.org/D73152
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
These names have been changed from CamelCase to camelCase, but there were
many places (comments mostly) that still used the old names.
This change is NFC.
Summary:
Remove the restrictions that preventing "asm goto" from returning non-void
values. The values returned by "asm goto" are only valid on the "fallthrough"
path.
Reviewers: jyknight, nickdesaulniers, hfinkel
Reviewed By: jyknight, nickdesaulniers
Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69876
analyzePhysReg does not really fit into the iterator and moving it
makes it easier to change the base iterator.
Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D70559
If there are debug instructions before the stopping point,
we need to skip over them before checking for begin in order
to avoid having the debug instructions effect behavior.
Fixes PR43758.
Differential Revision: https://reviews.llvm.org/D69606
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:
- `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
- `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
- `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,
Reviewers: lattner, thegameg, courbet
Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65945
llvm-svn: 371045
Summary:
Found a couple of places in the code where all the PHI nodes
of a MBB is updated, replacing references to one MBB by
reference to another MBB instead.
This patch simply refactors the code to use a common helper
(MachineBasicBlock::replacePhiUsesWith) for such PHI node
updates.
Reviewers: t.p.northover, arsenm, uabelho
Subscribers: wdng, hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66750
llvm-svn: 370463
Summary:
When we print the IR with --print-after/before-*,
SlotIndexes will be printed whenever available (We haven't freed it).
This introduces some noises when we try to compare the IR
among different optimizations.
eg:
-print-before=machine-cp will print SlotIndexes for 1st machine-cp
pass, but NOT for 2nd machine-cp;
-print-after=machine-cp will NOT print SlotIndexes for both
machine-cp passes.
So SlotIndexes in 1st pass introduce noises when differing these IRs.
This patch introduces an option to hide indexes.
Reviewers: stoklund, thegameg, qcolombet
Reviewed By: thegameg
Subscribers: hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66500
llvm-svn: 369650
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
Summary:
This was mostly an experiment to assess the feasibility of completely
eliminating a problematic implicit conversion case in D61321 in advance of
landing that* but it also happens to align with the goal of propagating the
use of Register/MCRegister instead of unsigned so I believe it makes sense
to commit it.
The overall process for eliminating the implicit conversions from
Register/MCRegister -> unsigned was to:
1. Add an explicit conversion to support genuinely required conversions to
unsigned. For example, using them as an index for IndexedMap. Sadly it's
not possible to have an explicit and implicit conversion to the same
type and only deprecate the implicit one so I called the explicit
conversion get().
2. Temporarily annotate the implicit conversion to unsigned with
LLVM_ATTRIBUTE_DEPRECATED to make them visible
3. Eliminate implicit conversions by propagating Register/MCRegister/
explicit-conversions appropriately
4. Remove the deprecation added in 2.
* My conclusion is that it isn't feasible as there's too much code to
update in one go.
Depends on D65678
Reviewers: arsenm
Subscribers: MatzeB, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65685
llvm-svn: 368643
MSVC finds ambiguity where clang doesn't and it looks like it's not going to be an easy fix
Reverting while I figure out how to fix it
This reverts r367916 (git commit aa15ec3c231717826e3c262b5ef9813d2fb5cadb)
This reverts r367920 (git commit 5d14efe279b5db9f4746ff834ab5c70e249d3871)
llvm-svn: 367932
Revision r365061 changed a skip of debug instructions for a skip
of meta instructions. This is not safe, as IMPLICIT_DEF is classed
as a meta instruction.
llvm-svn: 365198
Summary:
Previously no client of ilist traits has needed to know about transfers
of nodes within the same list, so as an optimization, ilist doesn't call
transferNodesFromList in that case. However, now there are clients that
want to use ilist traits to cache instruction ordering information to
optimize dominance queries of instructions in the same basic block.
This change updates the existing ilist traits users to detect in-list
transfers and do nothing in that case.
After this change, we can start caching instruction ordering information
in LLVM IR data structures. There are two main ways to do that:
- by putting an order integer into the Instruction class
- by maintaining order integers in a hash table on BasicBlock
I plan to implement and measure both, but I wanted to commit this change
first to enable other out of tree ilist clients to implement this
optimization as well.
Reviewers: lattner, hfinkel, chandlerc
Subscribers: hiraditya, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D57120
llvm-svn: 351992
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The scan was incorrectly skipping the first instruction, so a register
could appear to be dead when it was actually live. This eventually leads
to a machine verifier failure and miscompile in arm-ldst-opt.
Differential Revision: https://reviews.llvm.org/D54491
llvm-svn: 346821
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.
Differential Revision: https://reviews.llvm.org/D52605
llvm-svn: 343297
Summary:
For the loop that used MCRegAliasIterator this should be NFC.
For the loop that previously used MCSubRegIterator we should
now detect more cases where the register is actually live out that
we previously missed.
Reviewers: MatzeB, arsenm
Reviewed By: MatzeB
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D52410
llvm-svn: 342944
https://reviews.llvm.org/D52127
This patch adds the ability to watch for insertions/deletions of
MachineInstructions similar to MachineRegisterInfo.
llvm-svn: 342696
In computeRegisterLiveness, the max instructions to search
was counting dbg_value instructions, which could potentially
cause an observable codegen change from the presence of debug
info.
llvm-svn: 341028
If there is an unused def, this would previously
report that the register was live. Check for uses
first so that it is reported as dead if never used.
llvm-svn: 341027