953 Commits

Author SHA1 Message Date
Nathan Sidwell
6ec467297d
[BOLT][NFC] Adjust misleading comment & formatting (#88409)
This originally dealt with tbss, but now handles any bss-like section.
So the comment is inaccurate. Also, the `{}` on the messaging seem
unnecessary.
2024-04-12 08:34:43 -04:00
Maksim Panchenko
43d0891d3b
[BOLT] Fix handling of trailing entries in jump tables (#88444)
If a jump table has entries at the end that are a result of
__builtin_unreachable() targets, BOLT can confuse them with function
pointers. In such a case, we should exclude these targets from the table,
as we risk incorrectly updating the function pointers. It is safe to
exclude them because branching to such targets is undefined
behavior.
2024-04-11 16:11:00 -07:00
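
The trimming described in 43d0891d3b can be illustrated with a minimal sketch. It is not BOLT's implementation: the `trimTrailingEntries` helper, the flat list of addresses, and the assumption that a `__builtin_unreachable()` target shows up as an address outside the function body (for example, the address just past its end) are all hypothetical simplifications.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: a jump table is a list of target addresses, and the
// owning function occupies the half-open range [FuncStart, FuncEnd).
// Assumption: an unreachable target appears as an address outside that range
// (for example FuncEnd, which may also be the start of the next function).
std::vector<uint64_t> trimTrailingEntries(std::vector<uint64_t> Entries,
                                          uint64_t FuncStart,
                                          uint64_t FuncEnd) {
  assert(FuncStart < FuncEnd && "empty function range");
  // Drop entries from the back while they fall outside the function body.
  while (!Entries.empty() &&
         (Entries.back() < FuncStart || Entries.back() >= FuncEnd))
    Entries.pop_back();
  return Entries;
}
```

Only trailing entries are dropped; an out-of-range entry in the middle of the table is left alone in this sketch.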
Amir Ayupov
3997f0eb81
[BOLT] Cover all call sites in writeBATYAML
Call site information setting was conditioned on branch information
presence for a given block. However, a sampled profile may lack
one or the other for a given basic block.

Iterate over branch profiles and call profiles independently to cover
all recorded profile data.

Depends on https://github.com/llvm/llvm-project/pull/87569

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87743
2024-04-11 21:15:04 +02:00
Amir Ayupov
8840992667
[BOLT][BAT] Fix handling of split functions
Move BAT parent function lookup outside `getLocationName`, to the
scope where we retrieve `FuncBranchData` linked with the function.

Previously DataAggregator would store branch profile recorded in the
split fragment in `FuncBranchData` associated with the fragment, and
perform name translation in `getLocationName` for symbol name only.
This works for fdata profile which is printed out as-is, but doesn't
work with BAT YAML profile writer which requires a combined profile.

The issue necessitated `fixupBATProfile` which partially addressed the
issue (reassigned inter-fragment calls back into intra-function
branches). However, `fixupBATProfile` fails to address disjoint
profiles (i.e. doesn't merge `FuncBranchData` for fragments back
into parent). This diff eliminates the need for `fixupBATProfile` by
removing the root cause of the issue.

Test Plan: NFC for existing tests

Reviewers: ayermolo, dcci, rafaelauler, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87569
2024-04-11 21:07:36 +02:00
Nathan Sidwell
5bed6afc21
[BOLT][NFC] Remove unneeded if (#88322)
No need to special-case zero. Section 0 will map to section 0.
2024-04-11 14:44:11 -04:00
Nathan Sidwell
e2d4823959
[BOLT][NFC] Make RepRet X86-specific (#88286)
BOLT's RepRet pass is x86-specific, so there is no need to add it for
non-x86 targets.
2024-04-11 06:35:28 -04:00
Nathan Sidwell
364963a0a3
[BOLT][NFC] Do not assume text section name in more places (#88303)
Fixes a couple more places where ".text" is presumed for the main
code section name.
2024-04-11 06:29:51 -04:00
Nathan Sidwell
4308c7422d
[BOLT][NFC] Refactor relocation arch selection (#87829)
Convert the relocation routines to switch on architecture and have an explicit unreachable default.
2024-04-08 09:01:28 -04:00
Amir Ayupov
e64eede0dc
[BOLT][BAT] Fix encoded NumBasicBlocks
Emit the recorded number of blocks, not the number of basic block
hashes. There might be differences in corner cases (openssl
BN_BLINDING_convert_ex function).

Test Plan:
Updated openssl.test in https://github.com/rafaelauler/bolt-tests/pull/31

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: ayermolo

Pull Request: https://github.com/llvm/llvm-project/pull/87830
2024-04-05 17:54:07 -07:00
Amir Ayupov
0227623915
[BOLT][BAT] Support multi-way split functions
BAT writeMaps encoded the assumption that functions are only split into
two fragments (hot and cold). However, BOLT supports splitting into an
arbitrary number of fragments. Relax that assumption and look up the
primary (hot) fragment explicitly.

Depends on: https://github.com/llvm/llvm-project/pull/86219

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, rafaelauler, maksfb, dcci

Reviewed By: maksfb, dcci

Pull Request: https://github.com/llvm/llvm-project/pull/87123
2024-04-05 17:50:50 -07:00
Amir Ayupov
2d3c827c05
[BOLT] Use BAT for YAML profile call target information
Provide a mechanism to resolve call target information for calls from non-BAT
functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for
future use in BAT-to-BAT calls.

Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: ayermolo, maksfb, rafaelauler, dcci

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/86219
2024-04-05 16:08:59 -07:00
Amir Ayupov
fd38366e45
[BOLT][NFC] Clean includes, add license headers (#87200)
2024-03-31 19:29:45 -07:00
Amir Ayupov
d12e45ad16
[BOLT][NFC] Split out DomTree construction from BF::calculateLoopInfo (#87181)
2024-03-31 06:24:19 -07:00
Amir Ayupov
c0febca3a6
[BOLT][NFC] Refactor BC::createBinaryContext for #81346 (#87172)
2024-03-30 20:43:23 -07:00
Maksim Panchenko
7de82ca369
[BOLT] Don't terminate on trap instruction for Linux kernel (#87021)
Under normal circumstances, we terminate basic blocks on a trap
instruction. However, the Linux kernel may resume execution after hitting
a trap (ud2 on x86). Thus, we introduce a "--terminal-trap" option that
specifies whether the trap instruction should terminate the control flow.
The option is on by default, except in Linux kernel mode, where it is off.
2024-03-29 16:41:15 -07:00
Maksim Panchenko
35e7d458c9
[BOLT] Add rewriting support for Linux kernel __bug_table (#86908)
Update instruction locations in the __bug_table section after new code
is emitted. If an instruction with an associated bug ID was deleted,
overwrite its location with zero.
2024-03-28 10:30:27 -07:00
Amir Ayupov
385e3e26c1
[BOLT] Set EntryDiscriminator in YAML profile for indirect calls
Indirect call handling missed setting an `EntryDiscriminator` while it's
set for direct calls and tail calls.

Improve YAML profile accuracy by unifying the destination setting
between direct and indirect calls into `setCSIDestination` method.

Depends on: https://github.com/llvm/llvm-project/pull/86848

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, maksfb, rafaelauler

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/82128
2024-03-27 16:40:38 -07:00
Amir Ayupov
d8fe2e4bb0
[BOLT] Fix enumeration of secondary entry points
Make them start with 1 instead of 0 (reserved for primary entry point).

Test Plan:
```
bin/llvm-lit -a tools/bolt/test/X86/yaml-secondary-entry-discriminator.s
```

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/86848
2024-03-27 15:23:49 -07:00
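
The numbering scheme from d8fe2e4bb0 can be sketched as follows. The `numberEntryPoints` helper and the use of function-relative offsets as keys are assumptions made for illustration; BOLT's own data structures differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: map each entry-point offset within a function to its
// discriminator. Offset 0 is the primary entry point and always gets 0;
// secondary entry points are numbered 1, 2, ... in increasing offset order.
std::map<uint64_t, unsigned>
numberEntryPoints(const std::vector<uint64_t> &SecondaryOffsets) {
  std::map<uint64_t, unsigned> Ids;
  Ids[0] = 0; // primary entry point
  // std::map iterates in key order, so insert first and number afterwards.
  for (uint64_t Off : SecondaryOffsets) {
    assert(Off != 0 && "offset 0 is the primary entry point");
    Ids.emplace(Off, 0);
  }
  unsigned Next = 0;
  for (auto &KV : Ids)
    KV.second = Next++;
  return Ids;
}
```

Because 0 is reserved, a consumer can treat a discriminator of 0 as "the function's main entry" without a separate flag.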
Amir Ayupov
213eda157a
[BOLT] Add CallSiteInfo entries in YAMLBAT (#76896)
Attach call counters to YAML profile, covering inter-function control
flow.

Depends on: https://github.com/llvm/llvm-project/pull/86218

Test Plan: 
Updated bolt/test/X86/bolt-address-translation-yaml.test
2024-03-25 16:23:21 -07:00
Amir Ayupov
1b763f230a
[BOLT] Add secondary entry points to BAT
Provide secondary entry points for `EntryDiscriminator` call info field
in YAML profile.

Increases BAT section size to:
- large binary: 39655300 bytes (1.03x the original),
- medium binary: 3834328 bytes (0.65x),
- small binary: 924 bytes (0.64x).

Depends on: https://github.com/llvm/llvm-project/pull/76911

Test Plan:
- Updated bolt-address-translation{,-yaml}.test
- Added openssl test: https://github.com/rafaelauler/bolt-tests/pull/30

Reviewers: dcci, rafaelauler, maksfb, ayermolo

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/86218
2024-03-25 15:14:33 -07:00
Amir Ayupov
d7d2f7ca62
[BOLT] Emit intra-function control flow in YAMLBAT
Attach branch counters to YAML profile, covering intra-function control
flow.

Depends on: https://github.com/llvm/llvm-project/pull/86353

Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76911
2024-03-23 19:11:49 -07:00
Amir Ayupov
a91cd53de3
[BOLT][NFC] Refactor BAT metadata data structures
Hide the implementations of `FuncHashes` and `BBHashMap` classes,
getting rid of `at` accessors that could throw an exception.

Test Plan: NFC

Reviewers: ayermolo, maksfb, dcci, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/86353
2024-03-23 16:08:31 -07:00
Maksim Panchenko
51268a57fd
[BOLT] Enable --keep-nops option for Linux kernel by default (#86349)
Preserve nop instructions in the Linux kernel since they could be used
for runtime patching.
2024-03-22 15:29:26 -07:00
Maksim Panchenko
56197d732e
[BOLT] Skip functions with unsupported Linux kernel features (#86345)
Do not overwrite functions with alternative and paravirtual instructions
until proper update support is implemented.
2024-03-22 15:28:54 -07:00
Alexander Yermolovich
f3cfe016c5
[BOLT][DWARF] Add support for cross-cu references for debug-names (#86015)
DW_AT_abstract_origin can be a cross-CU reference as a by-product of
LTO. At the IR level, an address is stored for absolute references,
whereas a DIE is stored for relative references. Added a map to keep
track of cross-CU referenced DIEs to use when we add an Entry.
2024-03-22 13:48:49 -07:00
Alexander Yermolovich
105feb9ac6
[BOLT][DWARF] Fix handling of DW_TAG_label (#86182)
For DWARF5, BOLT was not retrieving the address and was setting an index
instead. Changed so that an address is used, and added a DWARF4 test
because it was missing.
2024-03-22 13:41:27 -07:00
Amir Ayupov
ceba3a38e8
[BOLT] Add number of basic blocks to BAT
YAML profile reader checks the number of basic blocks in regular,
no-stale-matching mode. Add it to BAT.

This increases the size of BAT section to:
- large binary: 39583080 bytes (1.02x of the original),
- medium binary: 3816492 bytes (0.64x),
- small binary: 920 bytes (0.64x, no change due to alignment).

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/86045
2024-03-22 08:46:48 -07:00
Amir Ayupov
b0e23639c5
[BOLT] Add BB index to BAT
Add input basic block index to BAT metadata. This addresses the case
where some basic blocks are eliminated, and output index is not equal
to the input block index. These indices are used in non-stale-matching
mode.

Increases BAT section size to:
- large binary: 39521512 bytes (1.02x original),
- medium binary: 3799988 bytes (0.64x),
- small binary: 920 bytes (0.64x).

Test Plan:
Updated bolt-address-translation{,-yaml}.test

Pull Request: https://github.com/llvm/llvm-project/pull/86044
2024-03-22 08:42:58 -07:00
Amir Ayupov
f66d631bf8
Revert "[BOLT] Add BB index to BAT (#86044)"
This reverts commit 3b3de48fd84b8269d5f45ee0a9dc6b7448368424.
2024-03-22 08:38:40 -07:00
Amir Ayupov
3b3de48fd8
[BOLT] Add BB index to BAT (#86044)
2024-03-22 06:07:17 -07:00
Kazu Hirata
4865dab04c
[BOLT] Fix unused variable warnings
This patch fixes:

  bolt/lib/Rewrite/LinuxKernelRewriter.cpp:1664:20: error: unused
  variable 'TargetAddress' [-Werror,-Wunused-variable]

  bolt/lib/Rewrite/LinuxKernelRewriter.cpp:1666:20: error: unused
  variable 'KeyAddress' [-Werror,-Wunused-variable]
2024-03-21 20:21:12 -07:00
Amir Ayupov
6280681137
[BOLT] Output basic YAML profile in BAT mode
Relax assumptions that YAML output is not supported in BAT mode.
Set up basic infrastructure for emitting YAML for functions not covered
by BAT, such as from `.bolt.org.text` section (code identical to input binary
sans external refs), or non-rewritten functions in non-relocation mode (where
the function stays in the same section but BAT mapping is not emitted).

This diff only produces YAML profile for non-BAT functions (skipped,
non-simple). YAML profile for BAT functions is added in follow-up diffs:
- https://github.com/llvm/llvm-project/pull/76911 emits YAML profile with
  internal control flow information only (branch profile),
- https://github.com/llvm/llvm-project/pull/76896 adds cross-function profile
  (calls profile).

Test Plan: Added bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76910
2024-03-21 14:32:13 -07:00
Maksim Panchenko
6b1cf00400
[BOLT] Add support for Linux kernel static keys jump table (#86090)
Runtime code modification used by static keys is the most ubiquitous
self-modifying feature of the Linux kernel. The idea is to eliminate
the condition check and associated conditional jump on a hot path if
that condition (based on a boolean value of a static key) does not
change often. Whenever the condition changes, the kernel runtime
modifies all code paths associated with that key, flipping the code
between nop and (unconditional) jump.
2024-03-21 14:05:21 -07:00
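
The nop/jump flipping described in 6b1cf00400 can be pictured with a small x86-64 sketch. This is an illustration, not kernel or BOLT code: `PatchSite`, `encodeJump`, and `patchFor` are hypothetical names, and only the 5-byte instruction forms are modeled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using PatchSite = std::array<uint8_t, 5>;

// x86-64 5-byte nop: nopl 0x0(%rax,%rax,1)
constexpr PatchSite Nop5 = {0x0f, 0x1f, 0x44, 0x00, 0x00};

// Encode `jmp rel32` (opcode 0xE9 followed by a little-endian displacement).
PatchSite encodeJump(int32_t Rel32) {
  const uint32_t U = static_cast<uint32_t>(Rel32);
  return {0xe9, static_cast<uint8_t>(U), static_cast<uint8_t>(U >> 8),
          static_cast<uint8_t>(U >> 16), static_cast<uint8_t>(U >> 24)};
}

// Hypothetical stand-in for the runtime patching step: when the key flips,
// every site guarded by it is rewritten to either a nop or a jump.
PatchSite patchFor(bool TakeBranch, int32_t Rel32) {
  return TakeBranch ? encodeJump(Rel32) : Nop5;
}
```

Both forms occupy the same number of bytes, which is what lets a site be rewritten in place. It also means a rewriter that moves code has to keep each site's location and jump target consistent with the table describing it.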
Kazu Hirata
aa7e4ba3ca
[BOLT] Fix an unused variable warning
This patch fixes:

  bolt/lib/Profile/BoltAddressTranslation.cpp:26:12: error: unused
  variable 'HotFuncAddress' [-Werror,-Wunused-variable]
2024-03-20 17:46:02 -07:00
Amir Ayupov
ad00e7e5ed
[BOLT] Write and parse BF/BB hashes in BAT
This increases BAT section size to:
- large binary: 34832976 bytes (0.90x original),
- medium binary: 3586800 bytes (0.60x original),
- small binary: 816 bytes (0.57x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76907
2024-03-20 16:24:21 -07:00
Amir Ayupov
de0abc0983
[BOLT][NFC] Simplify YAMLProfileWriter::convert
Use `getAnnotationWithDefault` instead of testing if the annotation is
set. If the default value is used, and `CSI.Count` is set to zero, the
target is discarded by a check below.

Test Plan: NFC

Reviewers: maksfb, dcci, rafaelauler, ayermolo

Reviewed By: ayermolo

Pull Request: https://github.com/llvm/llvm-project/pull/82129
2024-03-20 14:39:28 -07:00
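
The simplification in de0abc0983 relies on a lookup-with-default pattern, sketched below. The `Annotations` map, `getWithDefault`, and `collectCounts` are hypothetical stand-ins and do not reproduce BOLT's annotation API or its signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an annotation store keyed by name.
using Annotations = std::map<std::string, uint64_t>;

uint64_t getWithDefault(const Annotations &A, const std::string &Name,
                        uint64_t Default = 0) {
  auto It = A.find(Name);
  return It == A.end() ? Default : It->second;
}

// Collect call counts, dropping sites whose count is zero. A missing
// annotation yields the default of zero and is discarded by the same check,
// so no separate "is it set?" test is needed.
std::vector<uint64_t> collectCounts(const std::vector<Annotations> &Sites) {
  std::vector<uint64_t> Counts;
  for (const Annotations &A : Sites) {
    const uint64_t Count = getWithDefault(A, "Count");
    if (Count == 0)
      continue;
    Counts.push_back(Count);
  }
  return Counts;
}
```

The trade-off is that an unset annotation and an explicit zero become indistinguishable, which is acceptable here because both are discarded.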
Amir Ayupov
061b408964
[BOLT][NFC] Expose YAMLProfileWriter::convert function
The function is to be used by YAML profile emission in BAT mode for
BinaryFunctions not covered by BAT tables (same as in original binary).

Test Plan: NFC

Reviewers: rafaelauler, ayermolo, dcci, maksfb

Reviewed By: dcci

Pull Request: https://github.com/llvm/llvm-project/pull/76909
2024-03-20 14:35:47 -07:00
Jonas Devlieghere
41283403f5
[BOLT] Update DWARFRewriter for 32a6e9d66945
2024-03-19 12:45:31 -07:00
Kazu Hirata
8af3f74294
Revert "[BOLT] Update DIEStreamer (#85818)"
This reverts commit e4f9175d23950ecaef32db075ed47dafe3be555c.

  commit 3176c157190c80b4279dec86c4b9b84472d8ccac
  Author: Andres Villegas <andresvi@google.com>
  Date:   Tue Mar 19 10:58:31 2024 -0700

reverted 43a2ec483fe08064b53a6293682e9bab97df61a0.
2024-03-19 12:37:32 -07:00
Kazu Hirata
e4f9175d23
[BOLT] Update DIEStreamer (#85818)
commit 43a2ec483fe08064b53a6293682e9bab97df61a0
  Author: Jonas Devlieghere <jonas@devlieghere.com>
  Date:   Tue Mar 19 08:30:47 2024 -0700

removed parameter Translator from the constructor of DwarfStreamer.
This patch fixes the build by updating the constructor of DIEStreamer
accordingly.
2024-03-19 12:21:18 -07:00
Alexander Yermolovich
4841858862
[BOLT][DWARF] Add support to debug_names for DW_AT_abstract_origin/DW_AT_specification (#85485)
According to the DWARF spec, a DIE that has DW_AT_specification or
DW_AT_abstract_origin can be part of .debug_names if the DIE that
attribute points to has DW_AT_name or DW_AT_linkage_name.
2024-03-18 15:28:01 -07:00
Alexander Yermolovich
a4610c7182
[BOLT][DWARF] Add support for DW_IDX_parent (#85285)
This adds support for DW_IDX_parent. If a DIE has a parent, then
DW_IDX_parent in its Entry will point to the Entry for that parent DIE.
Otherwise, it will have DW_FORM_flag_present in the abbrev, which takes
zero space in the Entry.

This came from

https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151
2024-03-15 13:52:45 -07:00
Amir Ayupov
b431546d41
[BOLT] Check BF state in stale matching (#85339)
Only apply stale matching if the binary function is in CFG state, i.e.
has basic blocks.

Test Plan:
Updated bolt/test/X86/reader-stale-yaml.test
2024-03-15 10:55:53 -07:00
Maksim Panchenko
49b8a99a0f
[BOLT] Add createCondBranch() and createLongUncondBranch() (#85315)
Add MCPlusBuilder interface for creating two new branch types.
2024-03-14 15:28:22 -07:00
Maksim Panchenko
bba790db47
[BOLT] Refactor instruction creation interface. NFCI (#85292)
Refactor MCPlusBuilder's create{Instruction}() functions that used to
return bool. We almost never check the return value as we rely on
llvm_unreachable() to detect unimplemented functionality. There were a
couple of cases that checked the return value, but they would hit the
unreachable condition first (at least in debug builds) before the return
value gets checked.
2024-03-14 13:17:17 -07:00
Maksim Panchenko
59ab86bb2f
[BOLT] Clear operands when creating new instructions. NFCI (#85191)
Reset the operand list whenever we create a new instruction via a
parameter passed by reference. Most functions were already doing this,
but several places were missing the reset. Potentially, if we do not
clear the list, it could lead to invalid instruction operands. But the
existing code is unaffected.
2024-03-14 11:00:08 -07:00
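
The hazard addressed by 59ab86bb2f can be shown with a miniature example. `Inst`, `OpAdd`, and `createAdd` are hypothetical and much simpler than the real MCPlusBuilder interface; the sketch only demonstrates why the reset matters when an output object is reused.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of an instruction object; not llvm::MCInst.
struct Inst {
  unsigned Opcode = 0;
  std::vector<int> Operands;
};

constexpr unsigned OpAdd = 1; // made-up opcode for illustration

// Build `add Dst, Src` into an instruction passed by reference. The caller
// may hand in an object that was used before, so reset the operand list
// first; otherwise stale operands would survive in front of the new ones.
void createAdd(Inst &I, int Dst, int Src) {
  I.Opcode = OpAdd;
  I.Operands.clear();
  I.Operands.push_back(Dst);
  I.Operands.push_back(Src);
}
```

Without the `clear()` call, reusing an `Inst` that already held three operands would produce an instruction with five.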
Maksim Panchenko
fd32e744a5
[BOLT] Add support for Linux kernel PCI fixup section (#84982)
The .pci_fixup section contains a table with entries that allow invoking
a fixup hook whenever a problem is encountered with a PCI device. The
hookup code typically points to the start of a function. As we are not
relocating functions in the kernel (at least not yet), verify this
assumption while reading the table and ignore any functions with fixup
code in the middle.
2024-03-12 15:52:27 -07:00
Alexander Yermolovich
6d4aa9d70e
[BOLT][DWARF] Fix foreign TU index with local TUs (#84594)
The foreign TU list immediately follows the local TU list and they both
use the same index, so that if there are N local TU entries, the index
for the first foreign TU is N.

Changed so that the size of the local TU list is accounted for when
setting the foreign TU index.
2024-03-11 12:20:25 -07:00
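
The index arithmetic from 6d4aa9d70e is small enough to state directly. The `foreignTUIndex` helper below is a hypothetical name used only to make the rule concrete.

```cpp
#include <cassert>
#include <cstdint>

// Local and foreign type units share one index space in a .debug_names name
// index: local TUs occupy [0, NumLocalTUs) and foreign TUs follow them.
// ForeignTUPos is the zero-based position within the foreign TU list.
uint32_t foreignTUIndex(uint32_t NumLocalTUs, uint32_t ForeignTUPos) {
  return NumLocalTUs + ForeignTUPos;
}
```

With three local TUs, the first foreign TU gets index 3. With no local TUs the offset is zero, which may be why the missing adjustment goes unnoticed when only foreign TUs are present.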
Maksim Panchenko
a9b0d7590b
[BOLT] Properly propagate Cursor errors (#84378)
Handle out-of-bounds reading errors correctly in LinuxKernelRewriter.
2024-03-07 15:29:38 -08:00
Maksim Panchenko
143afb405a
[BOLT] Add reading support for Linux kernel .altinstructions section (#84283)
Read .altinstructions and annotate instructions that have alternative
sequences with an "AltInst" annotation. Note that some instructions may
have more than one alternative, in which case they will have multiple
annotations in the form "AltInst", "AltInst2", "AltInst3", etc.
2024-03-07 13:04:02 -08:00
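
The annotation naming scheme from 143afb405a can be sketched as a small helper. `altInstAnnotationName` is a hypothetical function written for illustration; only the resulting names come from the commit message.

```cpp
#include <cassert>
#include <string>

// Name of the N-th alternative's annotation, counting from 1: the first is
// plain "AltInst", later ones carry their number as a suffix.
std::string altInstAnnotationName(unsigned N) {
  assert(N >= 1 && "alternatives are numbered from 1");
  return N == 1 ? std::string("AltInst") : "AltInst" + std::to_string(N);
}
```

Leaving the first name without a suffix means code that only looks for "AltInst" still finds the first alternative.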