41 Commits

Author SHA1 Message Date
Amir Ayupov
dc1da93958
[BOLT][BAT] Add support for three-way split functions (#93760)
In three-way split functions, if only the .warm fragment is present, BAT
incorrectly overwrites the map for the .warm fragment with an empty .cold
fragment.

Test Plan: updated register-fragments-bolt-symbols.s
2024-07-05 15:18:49 -07:00
Kazu Hirata
8901f718ea
Use StringRef::starts_with (NFC) (#94886) 2024-06-09 08:14:51 -07:00
Amir Ayupov
d1d9545ed3
[BOLT][BAT] Add entries for deleted basic blocks
Deleted basic blocks are required for correct mapping of branches
modified by SCTC.

Increases BAT size, bytes:
- large binary: 8622496 -> 8703244.
- small binary (X86/bolt-address-translation.test): 928 -> 940.

Test Plan: updated bb-with-two-tail-calls.s

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/91906
2024-05-23 19:19:07 -07:00
Amir Ayupov
5057463814
[BOLT][NFC] Make BAT methods const (#91823) 2024-05-22 10:39:27 -07:00
Amir Ayupov
7c5c8b2f47
[BOLT][NFC] Move BAT::fetchParentAddress to header (#93061)
Unbreak shared build after
https://github.com/llvm/llvm-project/pull/91683
2024-05-22 09:14:10 -07:00
Kazu Hirata
8927ac8657 [BOLT] Fix a warning
This patch fixes:

  bolt/lib/Profile/BoltAddressTranslation.cpp:380:37: error: operator
  '<<' has lower precedence than '+'; '+' will be evaluated first
  [-Werror,-Wshift-op-parentheses]
2024-04-15 08:56:57 -07:00
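
The precedence rule behind this warning can be shown with a small standalone sketch. The expression at line 380 is not quoted in the log, so the function names and operands below are hypothetical; the sketch only contrasts the two possible groupings of an expression of the form `Base << Shift + 1`.

```cpp
#include <cstdint>

// How the compiler groups `Base << Shift + 1`: '+' is evaluated first,
// so the value is shifted by (Shift + 1).
constexpr uint32_t shiftBySum(uint32_t Base, unsigned Shift) {
  return Base << (Shift + 1);
}

// The other grouping, which a reader might have assumed: shift, then add.
constexpr uint32_t shiftThenAdd(uint32_t Base, unsigned Shift) {
  return (Base << Shift) + 1;
}

// The two groupings give different results for the same inputs.
static_assert(shiftBySum(1, 2) == 8, "1 << (2 + 1)");
static_assert(shiftThenAdd(1, 2) == 5, "(1 << 2) + 1");
```

Writing the parentheses explicitly, whichever grouping is intended, silences -Wshift-op-parentheses.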
Amir Ayupov
b79b6f9cf0
[BOLT] Use offset deduplication for cold fragments
Apply deduplication for uniformity and BAT section size reduction.

Changes BAT section size to:
- large binary: 39541552 bytes (1.02x original),
- medium binary: 3828996 bytes (0.64x),
- small binary: 928 bytes (0.65x).

Test Plan: Updated bolt-address-translation.test

Reviewers: rafaelauler, dcci, ayermolo, JDevlieghere, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87853
2024-04-15 09:50:12 +02:00
Amir Ayupov
3997f0eb81
[BOLT] Cover all call sites in writeBATYAML
Setting call site information was conditioned on the presence of branch
information for a given block. However, a sampled profile may lack
one or the other for a given basic block.

Iterate over branch profiles and call profiles independently to cover
all recorded profile data.

Depends on https://github.com/llvm/llvm-project/pull/87569

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87743
2024-04-11 21:15:04 +02:00
Amir Ayupov
e64eede0dc
[BOLT][BAT] Fix encoded NumBasicBlocks
Emit the recorded number of blocks, not the number of basic block
hashes. There might be differences in corner cases (openssl
BN_BLINDING_convert_ex function).

Test Plan:
Updated openssl.test in https://github.com/rafaelauler/bolt-tests/pull/31

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: ayermolo

Pull Request: https://github.com/llvm/llvm-project/pull/87830
2024-04-05 17:54:07 -07:00
Amir Ayupov
0227623915
[BOLT][BAT] Support multi-way split functions
BAT writeMaps encoded the assumption that functions are only split into
two fragments (hot and cold). However, BOLT supports splitting into
an arbitrary number of fragments. Relax that assumption and look up the
primary (hot) fragment explicitly.

Depends on: https://github.com/llvm/llvm-project/pull/86219

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, rafaelauler, maksfb, dcci

Reviewed By: maksfb, dcci

Pull Request: https://github.com/llvm/llvm-project/pull/87123
2024-04-05 17:50:50 -07:00
Amir Ayupov
2d3c827c05
[BOLT] Use BAT for YAML profile call target information
Provide a mechanism to resolve call target information for calls from non-BAT
functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for
future use in BAT-to-BAT calls.

Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: ayermolo, maksfb, rafaelauler, dcci

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/86219
2024-04-05 16:08:59 -07:00
Amir Ayupov
213eda157a
[BOLT] Add CallSiteInfo entries in YAMLBAT (#76896)
Attach call counters to YAML profile, covering inter-function control
flow.

Depends on: https://github.com/llvm/llvm-project/pull/86218

Test Plan: 
Updated bolt/test/X86/bolt-address-translation-yaml.test
2024-03-25 16:23:21 -07:00
Amir Ayupov
1b763f230a
[BOLT] Add secondary entry points to BAT
Provide secondary entry points for `EntryDiscriminator` call info field
in YAML profile.

Increases BAT section size to:
- large binary: 39655300 bytes (1.03x the original),
- medium binary: 3834328 bytes (0.65x),
- small binary: 924 bytes (0.64x).

Depends on: https://github.com/llvm/llvm-project/pull/76911

Test Plan:
- Updated bolt-address-translation{,-yaml}.test
- Added openssl test: https://github.com/rafaelauler/bolt-tests/pull/30

Reviewers: dcci, rafaelauler, maksfb, ayermolo

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/86218
2024-03-25 15:14:33 -07:00
Amir Ayupov
a91cd53de3
[BOLT][NFC] Refactor BAT metadata data structures
Hide the implementations of `FuncHashes` and `BBHashMap` classes,
getting rid of `at` accessors that could throw an exception.

Test Plan: NFC

Reviewers: ayermolo, maksfb, dcci, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/86353
2024-03-23 16:08:31 -07:00
Amir Ayupov
ceba3a38e8
[BOLT] Add number of basic blocks to BAT
YAML profile reader checks the number of basic blocks in regular,
no-stale-matching mode. Add it to BAT.

This increases the size of BAT section to:
- large binary: 39583080 bytes (1.02x of the original),
- medium binary: 3816492 bytes (0.64x),
- small binary: 920 bytes (0.64x, no change due to alignment).

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/86045
2024-03-22 08:46:48 -07:00
Amir Ayupov
b0e23639c5 [BOLT] Add BB index to BAT
Add input basic block index to BAT metadata. This addresses the case
where some basic blocks are eliminated, and output index is not equal
to the input block index. These indices are used in non-stale-matching
mode.

Increases BAT section size to:
- large binary: 39521512 bytes (1.02x original),
- medium binary: 3799988 bytes (0.64x),
- small binary: 920 bytes (0.64x).

Test Plan:
Updated bolt-address-translation{,-yaml}.test

Pull Request: https://github.com/llvm/llvm-project/pull/86044
2024-03-22 08:42:58 -07:00
Amir Ayupov
f66d631bf8 Revert "[BOLT] Add BB index to BAT (#86044)"
This reverts commit 3b3de48fd84b8269d5f45ee0a9dc6b7448368424.
2024-03-22 08:38:40 -07:00
Amir Ayupov
3b3de48fd8
[BOLT] Add BB index to BAT (#86044) 2024-03-22 06:07:17 -07:00
Kazu Hirata
aa7e4ba3ca [BOLT] Fix an unused variable warning
This patch fixes:

  bolt/lib/Profile/BoltAddressTranslation.cpp:26:12: error: unused
  variable 'HotFuncAddress' [-Werror,-Wunused-variable]
2024-03-20 17:46:02 -07:00
Amir Ayupov
ad00e7e5ed
[BOLT] Write and parse BF/BB hashes in BAT
This increases BAT section size to:
- large binary: 34832976 bytes (0.90x original),
- medium binary: 3586800 bytes (0.60x original),
- small binary: 816 bytes (0.57x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76907
2024-03-20 16:24:21 -07:00
Amir Ayupov
d2c9a19dd8
[BOLT][NFC] Pass BF/BB hashes to BAT
Test Plan: NFC

Reviewers: dcci, rafaelauler, maksfb, ayermolo

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76906
2024-02-15 12:49:43 -08:00
Amir Ayupov
52cf07116b
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are printed to the screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code has
been ported yet. Code from Profiler libs (DataAggregator and friends)
still prints errors directly to the screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
Amir Ayupov
df7d2b2f90
[BOLT] Deduplicate equal offsets in BAT (#76905)
Encode BRANCHENTRY bits as bitmask for deduplicated entries.

Reduces BAT section size:
- large binary: to 11834216 bytes (0.31x original),
- medium binary: to 1565584 bytes (0.26x original),
- small binary: to 336 bytes (0.23x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-25 15:37:47 -08:00
Amir Ayupov
8f1d94aaea
[BOLT] Use continuous output addresses in delta encoding in BAT
Delta-encode output function addresses with respect to the last offset in
the previous function. This reduces the deltas in function start addresses.

Test Plan:
Reduces BAT section size to:
- large binary: 12218860 bytes (0.32x original),
- medium binary: 1606580 bytes (0.27x original),
- small binary: 404 bytes (0.28x original).

Reviewers: rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76904
2024-01-18 13:49:44 -08:00
Amir Ayupov
dcba077146
[BOLT] Embed cold mapping info into function entry in BAT (#76903)
Reduces BAT section size:
- large binary: to 12283500 bytes (0.32x original size),
- medium binary: to 1616020 bytes (0.27x original size),
- small binary: to 404 bytes (0.28x original size).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-12 13:02:32 -08:00
Amir Ayupov
8fb8ad66c9
[BOLT] Delta-encode function start addresses in BAT (#76902)
Further reduce the size of BAT section:
- large binary: to 12716312 bytes (0.33x original),
- medium binary: to 1649472 bytes (0.28x original),
- small binary: to 428 bytes (0.30x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 14:35:37 -08:00
Amir Ayupov
bbe07989d7
[BOLT] Delta-encode offsets in BAT (#76900)
This change further reduces the size of BAT:
- large binary: to 13073904 bytes (0.34x original),
- medium binary: to 1703116 bytes (0.29x original),
- small binary: to 436 bytes (0.30x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 14:29:46 -08:00
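
Delta encoding, as named in the commit above, can be sketched as follows. This is a minimal illustration and not the BAT on-disk format: it assumes the offsets are sorted in non-decreasing order so that every delta is non-negative, and the function names are hypothetical. The likely reason it shrinks the section is that small deltas take fewer bytes than full offsets under a variable-length encoding such as ULEB128.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Replace each value with its difference from the previous value.
// Assumption: Sorted is non-decreasing, so no delta underflows.
inline std::vector<uint32_t>
deltaEncodeSketch(const std::vector<uint32_t> &Sorted) {
  std::vector<uint32_t> Deltas;
  Deltas.reserve(Sorted.size());
  uint32_t Prev = 0;
  for (uint32_t Value : Sorted) {
    assert(Value >= Prev && "input must be sorted");
    Deltas.push_back(Value - Prev);
    Prev = Value;
  }
  return Deltas;
}

// Rebuild the original values by accumulating the deltas.
inline std::vector<uint32_t>
deltaDecodeSketch(const std::vector<uint32_t> &Deltas) {
  std::vector<uint32_t> Values;
  Values.reserve(Deltas.size());
  uint32_t Prev = 0;
  for (uint32_t Delta : Deltas) {
    Prev += Delta;
    Values.push_back(Prev);
  }
  return Values;
}
```

For example, the offsets 0x10, 0x14, 0x20, 0x100 become the deltas 0x10, 0x4, 0xc, 0xe0, and decoding the deltas restores the original offsets.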
Amir Ayupov
565f40d66b [BOLT] Encode BAT using ULEB128 (#76899)
Reduces BAT section size, bytes:
- large binary: 38676872 -> 23262524 (0.60x),
- medium binary (trunk clang): 5938004 -> 3213504 (0.54x),
- small binary (X86/bolt-address-translation.test): 1436 -> 680 (0.47x).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 12:16:30 -08:00
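
For readers unfamiliar with ULEB128, the encoding named in the commit above, here is a minimal standalone sketch. It is not the code from the patch (LLVM ships its own helpers in llvm/Support/LEB128.h); the function names are hypothetical and the decoder assumes well-formed input that fits in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Value to Out as ULEB128: 7 payload bits per byte, lowest bits
// first, with the top bit set on every byte except the last one.
inline void encodeULEB128Sketch(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Decode one ULEB128 value starting at In[Pos] and advance Pos past it.
inline uint64_t decodeULEB128Sketch(const std::vector<uint8_t> &In,
                                    size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < In.size()) {
    assert(Shift < 64 && "value does not fit in 64 bits");
    uint8_t Byte = In[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break; // last byte of this value
    Shift += 7;
  }
  return Value;
}
```

Values below 128 take a single byte, so fields that are usually small cost much less than a fixed 4-byte or 8-byte slot. The value 624485, for instance, encodes to the three bytes 0xE5 0x8E 0x26.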
Job Noorman
23c8d38258 [BOLT] Calculate input to output address map using BOLTLinker
BOLT uses MCAsmLayout to calculate the output values of basic blocks.
This means output values are calculated based on a pre-linking state and
any changes to symbol values during linking will cause incorrect values
to be used.

This issue was first addressed in D154604 by adding all basic block
symbols to the symbol table for the linker to resolve them. However, the
runtime overhead of handling this huge symbol table turned out to be
prohibitively large.

This patch solves the issue in a different way. First, a temporary
section containing [input address, output symbol] pairs is emitted to the
intermediary object file. The linker will resolve all these references
so we end up with a section of [input address, output address] pairs.
This section is then parsed and used to:
- Replace BinaryBasicBlock::OffsetTranslationTable
- Replace BinaryFunction::InputOffsetToAddressMap
- Update BinaryBasicBlock::OutputAddressRange

Note that the reason this is more performant than the previous attempt
is that these symbol references do not cause entries to be added to the
symbol table. Instead, section-relative references are used for the
relocations.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D155604
2023-08-21 10:36:20 +02:00
Amir Ayupov
3d573fdbb4 [BOLT][NFC] Use std::optional in BAT 2022-12-11 22:13:46 -08:00
Kazu Hirata
e324a80fab [BOLT] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 23:12:38 -08:00
Kazu Hirata
1fa870b1bd Use None consistently (NFC)
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.

In the std::optional world, we are not guaranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.

Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:

  using NoneType = std::nullopt_t;
  inline constexpr std::nullopt_t None = std::nullopt;

to ease the migration from llvm::Optional to std::optional.

Differential Revision: https://reviews.llvm.org/D138376
2022-11-20 00:24:40 -08:00
Fabian Parzefall
9b6e7861ae [BOLT] Track fragment info for all split fragments
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132051
2022-08-24 18:07:09 -07:00
Fabian Parzefall
6304e38281 Revert "[BOLT] Track fragment info for all split fragments"
This reverts commit 7e254818e49454a53bd00e3737007025b62d0f79.
2022-08-24 10:51:19 -07:00
Fabian Parzefall
7e254818e4 [BOLT] Track fragment info for all split fragments
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132051
2022-08-24 10:17:17 -07:00
Rafael Auler
fc0ced73dc Add BAT testing framework
This patch refactors BAT to be testable as a library, so we
can have open-source tests on it. This further fixes an issue with
basic blocks that lack a valid input offset, making BAT omit those
when writing translation tables.

Test Plan: new testcases added, new testing tool added (llvm-bat-dump)

Differential Revision: https://reviews.llvm.org/D129382
2022-07-29 14:55:04 -07:00
Fabian Parzefall
8477bc6761 [BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's
layout. It also lays the groundwork for splitting functions into
multiple fragments (as opposed to a strict hot/cold split).

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129518
2022-07-16 17:23:24 -07:00
Amir Ayupov
def464aaae [BOLT][NFC] Fix braces usage in Profile
Summary:
Refactor bolt/*/Profile to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33345741)
2021-12-28 18:29:54 -08:00
Maksim Panchenko
2f09f445b2 [BOLT][NFC] Fix file-description comments
Summary: Fix comments at the start of source files.

(cherry picked from FBD33274597)
2021-12-21 10:21:41 -08:00
Maksim Panchenko
40c2e0fafe [BOLT][NFC] Reformat with clang-format
Summary: Selectively apply clang-format to BOLT code base.

(cherry picked from FBD33119052)
2021-12-14 16:52:51 -08:00
Rafael Auler
a34c753fe7 Rebase: [NFC] Refactor sources to be buildable in shared mode
Summary:
Moves source files into separate components and makes the dependencies
between components explicit, so the LLVM build system knows how to
build BOLT with BUILD_SHARED_LIBS=ON.

Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.

To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.

To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transferred to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).

Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.

(cherry picked from FBD32746834)
2021-10-08 11:47:10 -07:00