This reapplies commit 1911a50fae8a441b445eb835b98950710d28fc88 with a
minor fix in lld/ELF/LTO.cpp which sets Options.BBAddrMap when
`--lto-basic-block-sections=labels` is passed.
This feature is supported via the newer option
`-fbasic-block-address-map`. Using the old option still works by
delegating to the newer option, while printing a deprecation warning.
The dominator tree gained an optimization to use block numbers instead
of a DenseMap to store blocks. Given that machine basic blocks already
have numbers, expose these via appropriate GraphTraits. For debugging,
block number epochs are added to MachineFunction -- this greatly helps
in finding uses of block numbers after RenumberBlocks().
In a few cases where dominator trees are preserved across renumberings,
the dominator tree is updated to use the new numbers.
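As a minimal self-contained sketch of the epoch idea (the names here are illustrative, not the actual MachineFunction members):
```
#include <cassert>

// Sketch: the function bumps an epoch each time blocks are renumbered;
// consumers that cache block numbers also cache the epoch and assert it
// is still current before using the numbers.
struct FunctionSketch {
  unsigned BlockNumberEpoch = 0;
  void renumberBlocks() { ++BlockNumberEpoch; /* ...reassign numbers... */ }
};

struct CachedNumbering {
  unsigned Epoch;
  explicit CachedNumbering(const FunctionSketch &F)
      : Epoch(F.BlockNumberEpoch) {}
  // Fails if renumberBlocks() ran after this cache was built.
  void verify(const FunctionSketch &F) const {
    assert(Epoch == F.BlockNumberEpoch && "stale block numbers");
  }
};
```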
Certain machine functions and machine basic blocks are unsafe to split
on AArch64: basic blocks that contain jump table dispatch or targets
(D157124), and blocks that contain inline asm goto statements or their
targets (D158647), cause issues and have been excluded from Machine
Function Splitting on AArch64.
These issues arise with any transformation pass that places
basic blocks of the same function in different text sections
(MachineFunctionSplitter and BasicBlockSections), so they must be
special-cased in both passes.
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way the basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous, which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.
First, let us review the current encoding, which emits the address of
each function and its number of basic blocks, followed by basic block
entries for each basic block.
| | |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1 | |
| BB entry 2 | |
| ... | |
| BB entry #NumBlocks | |
To make this work for basic block sections, we treat each basic block
section like a function, except that basic block sections of the same
function must be encapsulated in the same structure so we can map all of
them to their single function.
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section as before: the base address of the section, its number of
blocks, and BB entries for its basic blocks. The first section in the BB
address map is always the function entry section.
| | |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1 | |
| ... | |
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges | NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges | |
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
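For illustration, decoding this layout reduces to a nested loop over ranges and entries. Below is a hedged sketch with simplified, hypothetical types; the real parser lives in LLVM's Object library and reads ULEB128-encoded fields:
```
#include <cstdint>
#include <vector>

// Hypothetical pre-parsed view of one function's entry; field names are
// illustrative, not the actual llvm::object API.
struct BBEntry { uint64_t Offset, Size, Metadata; };
struct BBRange { uint64_t BaseAddress; std::vector<BBEntry> Blocks; };

// Resolve each block's absolute address: offsets are relative to the
// begin symbol of the containing BB section (BaseAddress), not the
// function entry.
std::vector<uint64_t> blockAddresses(const std::vector<BBRange> &Ranges) {
  std::vector<uint64_t> Addrs;
  for (const BBRange &R : Ranges)      // NumBBRanges ranges per function
    for (const BBEntry &E : R.Blocks)  // NumBlocks[i] entries per range
      Addrs.push_back(R.BaseAddress + E.Offset);
  return Addrs;
}
```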
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOptions field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compilation
unit, or for none (depending on `TargetOptions::BBAddrMap`).
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.
We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handling into their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
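The resulting control flow is roughly the following (an illustrative sketch with simplified stand-in types, not the exact pass code):
```
// Sketch: sections handling runs first (renumbering blocks and assigning
// sections); address-map handling runs afterwards and only renumbers.
struct Function;
bool handleBBSections(Function &F); // renumbers blocks, assigns sections
bool handleBBAddrMap(Function &F);  // only renumbers blocks

bool runPass(Function &F, bool SectionsRequested, bool AddrMapRequested) {
  bool Changed = false;
  if (SectionsRequested)
    Changed |= handleBBSections(F);
  if (AddrMapRequested)
    Changed |= handleBBAddrMap(F);
  return Changed;
}
```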
- New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
`-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`).
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #75380
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)"
Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #64560
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
28b9126879
introduced the path cloning format in the basic-block-sections profile.
This PR validates and applies path clonings.
A path cloning is valid if all of these conditions hold:
1. All bb ids in the path are mapped to existing blocks.
2. Each two consecutive bb ids in the path have a successor relationship
in the CFG.
3. The path does not include a block with indirect branches, except
possibly as the last block.
Applying a path cloning involves cloning all blocks in the path (except
the first one) and setting up their branches.
Once all clonings are applied, the cluster information is used to guide
block layout in the modified function.
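The validity conditions above can be sketched as follows (self-contained, with a hypothetical CFG model rather than the actual pass data structures):
```
#include <map>
#include <set>
#include <vector>

// Hypothetical CFG model: per-block successor sets and an indirect-branch
// flag, keyed by BB id.
struct BlockInfo {
  std::set<unsigned> Successors;
  bool HasIndirectBranch = false;
};

bool isValidCloningPath(const std::vector<unsigned> &Path,
                        const std::map<unsigned, BlockInfo> &CFG) {
  for (size_t I = 0; I < Path.size(); ++I) {
    auto It = CFG.find(Path[I]);
    if (It == CFG.end()) // 1. every id must map to an existing block
      return false;
    if (I + 1 < Path.size()) {
      // 2. consecutive ids must have a successor relationship in the CFG
      if (!It->second.Successors.count(Path[I + 1]))
        return false;
      // 3. indirect branches are only allowed in the last block
      if (It->second.HasIndirectBranch)
        return false;
    }
  }
  return true;
}
```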
Following up on the prior RFC
(https://lists.llvm.org/pipermail/llvm-dev/2020-September/145357.html),
we can now improve performance beyond our highly-optimized
basic-block-sections binary (e.g., by 2% for clang) by applying path
cloning. Cloning can improve performance by reducing taken branches.
This patch prepares the profile format for applying cloning actions.
The basic block cloning profile format extends the basic block sections
profile in two ways.
1. Specifies the cloning paths with a 'p' specifier. For example, `p 1 4
5` specifies that blocks with BB ids 4 and 5 must be cloned along the
edge 1 --> 4.
2. Each cloned block appears in the cluster info as
`<bb_id>.<clone_id>`, where `clone_id` is the id associated with this
clone.
For example, the following profile specifies one cloned block (2) and
determines its cluster position as well.
```
f foo
p 1 2
c 0 1 2.1 3 2 5
```
This patch keeps backward-compatibility (retains the behavior for old
profile formats). This feature is only introduced for profile version >=
1.
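For illustration, parsing the `p` and `c` lines shown above might look like this (a simplified sketch, not the actual BasicBlockSectionsProfileReader):
```
#include <sstream>
#include <string>
#include <vector>

struct ClusterEntry { unsigned BBID = 0; unsigned CloneID = 0; };

// Parse one profile line: "p 1 2" (cloning path) or "c 0 1 2.1 3 2 5"
// (cluster order, possibly referencing clones as <bb_id>.<clone_id>).
void parseLine(const std::string &Line,
               std::vector<std::vector<unsigned>> &Paths,
               std::vector<std::vector<ClusterEntry>> &Clusters) {
  std::istringstream SS(Line);
  std::string Tok;
  SS >> Tok;
  if (Tok == "p") {
    std::vector<unsigned> Path;
    unsigned ID;
    while (SS >> ID)
      Path.push_back(ID);
    Paths.push_back(Path);
  } else if (Tok == "c") {
    std::vector<ClusterEntry> Cluster;
    while (SS >> Tok) {
      ClusterEntry E;
      size_t Dot = Tok.find('.');
      E.BBID = std::stoul(Tok.substr(0, Dot));
      if (Dot != std::string::npos)
        E.CloneID = std::stoul(Tok.substr(Dot + 1));
      Cluster.push_back(E);
    }
    Clusters.push_back(Cluster);
  }
}
```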
Reverted in 4c8d056f50342d5401f5930ed60e5e48b211c3fb because it broke
buildbot `llvm-clang-x86_64-expensive-checks-debian` due to the AArch64
test generating invalid code. The issue still exists, but it's fixed in
D156767, so the AArch64 test should be added there.
Differential Revision: https://reviews.llvm.org/D158674
If a basic block at the end of a section ends in an unconditional branch
to its fallthrough, BasicBlockSections will duplicate the unconditional
branch. The duplicated branch doesn't break x86, but removing it is a
(slight) size optimization and, more importantly, prevents AArch64
builds from breaking.
Ex:
```
bb1 (bbsections Hot):
jmp bb2
bb2 (bbsections Cold):
/* do work... */
```
After running sortBasicBlocksAndUpdateBranches():
```
bb1 (bbsections Hot):
jmp bb2
jmp bb2
bb2 (bbsections Cold):
/* do work... */
```
Differential Revision: https://reviews.llvm.org/D158674
This patch removes `getBBIDOrNumber`, which was introduced to allow emitting version 1.
Reviewed By: shenhan
Differential Revision: https://reviews.llvm.org/D158299
Refactor BasicBlockSections to use the target-specific noop insertion
hook from TargetInstrInfo instead of building it ourselves. Using the
TII hook is both cleaner and makes it easier to extend BBSections to
non-X86 targets.
Differential Revision: https://reviews.llvm.org/D158303
Basic block sections profiles are ingested based on the function name. However, conflicts may occur when internal linkage functions with the same symbol name are linked into the binary (for instance static functions defined in different modules). Currently, these functions cannot be optimized unless we use `-funique-internal-linkage-names` (D89617) to enforce unique symbol names.
However, we have found that `-funique-internal-linkage-names` does not play well with inline assembly code which refers to the symbol via its symbol name. For example, the Linux kernel does not build with this option.
This patch implements a new feature which allows differentiating profiles based on the debug info filenames associated with each function. When specified, the given path is compared against the debug info filename of the matching function, and the profile is ingested only when the debug info filenames match. Backward-compatibility is guaranteed, as omitting the specifiers from the profile allows matching by function name only. Specifiers can also be included for only a subset of functions.
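The matching policy reduces to the following rule (a hedged sketch with assumed names; the actual reader compares the profile's path specifier with the function's debug info filename):
```
#include <string>

// Hedged sketch: a profile entry applies if it carries no filename
// specifier (backward-compatible match by function name only) or if its
// specifier equals the function's debug info filename.
bool profileApplies(const std::string &ProfileSpecifier,
                    const std::string &DebugInfoFilename) {
  if (ProfileSpecifier.empty())
    return true; // no specifier: match by function name alone
  return ProfileSpecifier == DebugInfoFilename;
}
```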
Reviewed By: shenhan
Differential Revision: https://reviews.llvm.org/D146770
Annotation metadata supports adding singular annotation strings to the annotation block. This patch adds the ability to insert a tuple of strings into the metadata array.
The idea is that each tuple of strings represents a piece of related information. This makes it easier to parse related metadata, since it is contained in one tuple.
For example, in remarks, any pass that implements annotation remarks can have different types of remarks and pass additional information for each.
The original behaviour of annotation remarks is preserved, and we can mix tuple annotations and single annotations for the same instruction.
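A hedged sketch of attaching a tuple of related annotation strings using core Metadata APIs (the exact helper added by this patch may differ; the string contents are examples):
```
#include "llvm/IR/Instruction.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

void annotateWithTuple(Instruction &I) {
  LLVMContext &Ctx = I.getContext();
  // Build an MDTuple of related strings; together they describe one
  // piece of information.
  Metadata *Parts[] = {MDString::get(Ctx, "auto-init"),
                       MDString::get(Ctx, "array")};
  MDNode *Tuple = MDTuple::get(Ctx, Parts);
  // Attach the tuple under the instruction's annotation metadata kind.
  I.setMetadata(LLVMContext::MD_annotation, MDNode::get(Ctx, {Tuple}));
}
```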
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D148328
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
#### Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profile mapping works by taking the virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else.
#### Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
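A minimal sketch of this ID assignment (simplified, self-contained stand-ins for the MachineFunction code):
```
#include <memory>
#include <vector>

// Sketch: each newly created block receives the next ID, which is never
// reused or renumbered afterwards, unlike the volatile MBB number.
struct BasicBlockSketch {
  unsigned BBID; // fixed for the block's lifetime
  explicit BasicBlockSketch(unsigned ID) : BBID(ID) {}
};

struct FunctionSketch {
  unsigned NextBBID = 0;
  std::vector<std::unique_ptr<BasicBlockSketch>> Blocks;

  BasicBlockSketch *createBasicBlock() {
    Blocks.push_back(std::make_unique<BasicBlockSketch>(NextBBID++));
    return Blocks.back().get();
  }
};
```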
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
#### Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
This change adds a nop instruction if a section starts with a landing pad. This change is like [D73739](https://reviews.llvm.org/D73739), which avoids zero-offset landing pads in basic block sections.
Detailed description:
The current machine function splitter can create sections which start with a landing pad. This places the landing pad at offset zero from LPStart.
```
.section .text.split.foo10,"ax",@progbits
foo10.cold: # %lpad
.cfi_startproc
.cfi_personality 3, __gxx_personality_v0
.cfi_lsda 3, .Lexception5
.cfi_def_cfa %rsp, 16
.Ltmp11: <--- This is a Landing pad and also LP Start as it is start of this section
movq %rax, %rdi <--- first instruction is at offset 0 from LPStart
callq _Unwind_Resume@PLT
```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
.uleb128 .Ltmp9-.Lfunc_begin2 # >> Call Site 1 <<
.uleb128 .Ltmp10-.Ltmp9 # Call between .Ltmp9 and .Ltmp10
.uleb128 .Ltmp11-foo10.cold <---This is zero # jumps to .Ltmp11
.byte 3 # On action: 2
.uleb128 .Ltmp10-.Lfunc_begin2 # >> Call Site 2 <<
.uleb128 .Lfunc_end9-.Ltmp10 # Call between .Ltmp10 and .Lfunc_end9
.byte 0 # has no landing pad
.byte 0 # On action: cleanup
.p2align 2
```
The C++ ABI implicitly assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at the start of such sections so that this case is avoided. Output:
```
.section .text.split.foo10,"ax",@progbits
foo10.cold: # %lpad
.cfi_startproc
.cfi_personality 3, __gxx_personality_v0
.cfi_lsda 3, .Lexception5
.cfi_def_cfa %rsp, 16
nop <--- new instruction that is added
.Ltmp11:
movq %rax, %rdi
callq _Unwind_Resume@PLT
```
Reviewed By: modimo, snehasish, rahmanl
Differential Revision: https://reviews.llvm.org/D130133
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. A use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between every two consecutive basic block begin and end labels. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
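For illustration, the aggregation step on the decoding side looks roughly like this (simplified; the real decoder reads ULEB128 fields and handles versioning):
```
#include <cstdint>
#include <vector>

// Rebuild function-relative begin offsets from the delta encoding: each
// stored offset is the gap since the previous block's end label.
std::vector<uint64_t> decodeOffsets(const std::vector<uint64_t> &Deltas,
                                    const std::vector<uint64_t> &Sizes) {
  std::vector<uint64_t> Begins;
  uint64_t PrevEnd = 0; // function begin
  for (size_t I = 0; I < Deltas.size(); ++I) {
    uint64_t Begin = PrevEnd + Deltas[I];
    Begins.push_back(Begin);
    PrevEnd = Begin + Sizes[I];
  }
  return Begins;
}
```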
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
The major change here is using `addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.
Differential Revision: https://reviews.llvm.org/D126518
Today, text section prefixes (none, .unlikely, .hot, and .unknown) are determined based on the PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, with `-Wl,-keep-text-section-prefix=true`, Propeller cannot enforce a global section ordering, as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).
This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.
The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`.
Differential Revision: https://reviews.llvm.org/D122930
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
This reverts commit 029283c1c0d8d06fbf000f5682c56b8595a1101f.
The code in `ELFFile::decodeBBAddrMap` was not changed in the submitted patch.
Differential Revision: https://reviews.llvm.org/D120457
Conceptually, the new encoding emits the offsets and sizes as label differences between every two consecutive basic block begin and end labels. When decoding, the offsets must be aggregated along with basic block sizes to calculate the final relative-to-function offsets of basic blocks.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 25% reduction
in the size of the bb-address-map section (reduction from about 9MB to 7MB).
Reviewed By: tmsriram, jhenderson
Differential Revision: https://reviews.llvm.org/D106421
Prefer (self-documenting) return values to output parameters (which are
liable to be misused).
While here, rename Noop to Nop which is more widely used and improves
consistency with hasEmitNops/setEmitNops/emitNop/etc.
Source Drift happens when the sources are updated after profiling the binary
but before building the final optimized binary. If the source has changed since
the profiles were obtained, optimizing basic blocks might be sub-optimal. This
only applies to BasicBlockSection::List as it creates clusters of basic blocks
using basic block ids. Source drift can invalidate these groupings,
leading to sub-optimal code generation with regard to performance.
PGO source drift for a particular function can be detected using function
metadata added in D95495.
When source drift is detected, basic block clusters are disabled by
default; clustering can be re-enabled with the -mllvm option
bbsections-detect-source-drift=false.
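Conceptually, the guard is the following policy (a sketch with assumed names; the actual check inspects the function metadata added in D95495):
```
// Hypothetical sketch: `DetectSourceDrift` models the
// bbsections-detect-source-drift flag (default true); `HasDriftMetadata`
// models the presence of the drift metadata on the function.
enum class BBSectionsKind { List, None };

BBSectionsKind selectSectionsKind(bool DetectSourceDrift,
                                  bool HasDriftMetadata) {
  if (DetectSourceDrift && HasDriftMetadata)
    return BBSectionsKind::None; // stale profile: skip clustering
  return BBSectionsKind::List;
}
```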
Differential Revision: https://reviews.llvm.org/D95593
After using this for a while, we find that it is generally useful to
have it set to .text.split. by default, removing the need for an
additional -mllvm option.
Differential Revision: https://reviews.llvm.org/D88997
This patch lets the bb_addr_map (renamed to __llvm_bb_addr_map) section use a special section type (SHT_LLVM_BB_ADDR_MAP) instead of SHT_PROGBITS. This would help parsers, dumpers and other tools to use the sh_type ELF field to identify this section rather than relying on string comparison on the section name.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D88199
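For example, a tool can now dispatch on the section type instead of comparing names (a minimal sketch using the constant from llvm/BinaryFormat/ELF.h):
```
#include "llvm/BinaryFormat/ELF.h"

// Identify the BB address map section via sh_type rather than a string
// comparison on the section name.
bool isBBAddrMapSection(uint32_t ShType) {
  return ShType == llvm::ELF::SHT_LLVM_BB_ADDR_MAP;
}
```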
This is part of the Propeller framework to do post link code layout optimizations. Please see the RFC here: https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the detailed RFC doc here: https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf
This patch provides exception support for basic block sections by splitting the call-site table into call-site ranges corresponding to different basic block sections. Still, all landing pads must reside in the same basic block section (which is guaranteed by the core basic block section patch D73674 (ExceptionSection)). Each call-site table will refer to the landing pad fragment by explicitly specifying @LPStart (which is omitted in the normal non-basic-block-section case). All these call-site tables will share their action and type tables.
The C++ ABI implicitly assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. In the case of basic block sections, where one section contains all the landing pads, the landing pad offset relative to LPStart could actually be zero. Thus, we avoid zero-offset landing pads by inserting a **nop** operation as the first non-CFI instruction in the exception section.
**Background on Exception Handling in C++ ABI**
https://github.com/itanium-cxx-abi/cxx-abi/blob/master/exceptions.pdf
The compiler emits an exception table for every function. When an exception is thrown, the stack unwinding library queries the unwind table (which includes the start and end of each function) to locate the exception table for that function.
The exception table includes a call site table for the function, which is used to guide the exception handling runtime to take the appropriate action upon an exception. Each call site record in this table is structured as follows:
| CallSite | --> Position of the call site (relative to the function entry)
| CallSite length | --> Length of the call site.
| Landing Pad | --> Position of the landing pad (relative to the landing pad fragment’s begin label)
| Action record offset | --> Position of the first action record
The call site records partition a function into different pieces and describe what action must be taken for each callsite. The callsite fields are relative to the start of the function (as captured in the unwind table).
The landing pad entry is a reference into the function and corresponds roughly to the catch block of a try/catch statement. When execution resumes at a landing pad, it receives an exception structure and a selector value corresponding to the type of the exception thrown, and executes similarly to a switch-case statement. The landing pad field is relative to the beginning of the procedure fragment which includes all the landing pads (@LPStart). The C++ ABI requires all landing pads to be in the same fragment. Nonetheless, without basic block sections, @LPStart is the same as the function @Start (found in the unwind table) and can be omitted.
The action record offset is an index into the action table which includes information about which exception types are caught.
**C++ Exceptions with Basic Block Sections**
Basic block sections break the contiguity of a function fragment. Therefore, call sites must be specified relative to the beginning of the basic block section. Furthermore, the unwinding library should be able to find the corresponding callsites for each section. To do so, the .cfi_lsda directive for a section must point to the range of call-sites for that section.
This patch introduces a new **CallSiteRange** structure which specifies the range of call-sites which correspond to every section:
```
struct CallSiteRange {
  // Symbol marking the beginning of the procedure fragment.
  MCSymbol *FragmentBeginLabel = nullptr;
  // Symbol marking the end of the procedure fragment.
  MCSymbol *FragmentEndLabel = nullptr;
  // LSDA symbol for this call-site range.
  MCSymbol *ExceptionLabel = nullptr;
  // Index of the first call-site entry in the call-site table which
  // belongs to this range.
  size_t CallSiteBeginIdx = 0;
  // Index just after the last call-site entry in the call-site table which
  // belongs to this range.
  size_t CallSiteEndIdx = 0;
  // Whether this is the call-site range containing all the landing pads.
  bool IsLPRange = false;
};
```
With N basic-block-sections, the call-site table is partitioned into N call-site ranges.
Conceptually, we emit the call-site ranges for sections sequentially in the exception table as if each section has its own exception table. In the example below, two sections result in two call-site ranges (denoted by LSDA1 and LSDA2) placed next to each other. However, their call-sites will refer to records in the shared Action Table. We also emit the header fields (@LPStart and CallSite Table Length) for each call-site range in order to place the call-site ranges in separate LSDAs. We note that with -basic-block-sections, the CallSiteTableLength will not actually represent the length of the call-site table, but rather the reference to the action table. Since the only purpose of this field is to locate the action table, correctness is guaranteed.
Finally, every call-site range has one @LPStart pointer, so the landing pads of each call-site range must all reside in one section (not necessarily the same section across ranges). To make this easier, we decided to place all landing pads of the function in one section (hence the `IsLPRange` field in CallSiteRange).
| @LPStart | ---> Landing pad fragment ( LSDA1 points here)
| CallSite Table Length | ---> Used to find the action table.
| CallSites |
| … |
| … |
| @LPStart | ---> Landing pad fragment ( LSDA2 points here)
| CallSite Table Length |
| CallSites |
| … |
| … |
…
…
| Action Table |
| Types Table |
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D73739