Bolt makes use of add_llvm_library and as such ends up exporting its
libraries from LLVMExports.cmake, which is not correct.
Bolt doesn't have its own exports file, and I assume that there is no
desire to have one either -- Bolt libraries are not intended to be
consumed as a cmake module, right?
As such, this PR adds a NO_EXPORT option to simplify exclude these
libraries from the exports file.
Reuse the definition of profile density from llvm-profgen (#92144):
- the density is computed in perf2bolt using raw samples (perf.data or
pre-aggregated data),
- function density is the ratio of dynamically executed function bytes
to the static function size in bytes,
- profile density:
- functions are sorted by density in decreasing order, accumulating
their respective sample counts,
- profile density is the smallest density covering 99% of total sample
count.
In other words, BOLT binary profile density is the minimum amount of
profile information per function (excluding functions in tail 1% sample
count) which is sufficient to optimize the binary well.
The density threshold of 60 was determined through experiments with
large binaries by reducing the sample count and checking resulting
profile density and performance. The threshold is conservative.
perf2bolt would print the warning if the density is below the threshold
and suggest to increase the sampling duration and/or frequency to reach
a given density, e.g.:
```
BOLT-WARNING: BOLT is estimated to optimize better with 2.8x more samples.
```
Test Plan: updated pre-aggregated-perf.test
Reviewers: maksfb, wlei-llvm, rafaelauler, ayermolo, dcci, WenleiHe
Reviewed By: WenleiHe, wlei-llvm
Pull Request: https://github.com/llvm/llvm-project/pull/101094
Pseudo probe function records contain GUIDs assigned by the compiler
using an IR function name. Thus suffixes added later (e.g. `.llvm.`
for internal symbols, `.destroy`/`.resume` for coroutine fragments,
and `.cold`/`.warm` for split fragments) cause GUID mismatch.
Address that by dropping those suffixes using `getCommonName` which is
a parametrized form of `getLTOCommonName`.
Continue from #87196 as author did not have much time, I have taken over
working on this PR. We would like to have this so it'll be easier to
package for Nix.
Can be tested by copying cmake, bolt, third-party, and llvm directories
out into their own directory with this PR applied and then build bolt.
---------
Co-authored-by: pca006132 <john.lck40@gmail.com>
Added flag '--match-profile-with-function-hash' to match functions
based on exact hash. After identical and LTO name matching, more
functions can be recovered for inference with exact hash, in the case
of function renaming with no functional changes. Collisions are
possible in the unlikely case where multiple functions share the same
exact hash. The flag is off by default as it requires the processing of
all binary functions and subsequently is expensive.
Test Plan: added hashing-based-function-matching.test.
Relax assumptions that YAML output is not supported in BAT mode.
Set up basic infrastructure for emitting YAML for functions not covered
by BAT, such as from `.bolt.org.text` section (code identical to input binary
sans external refs), or non-rewritten functions in non-relocation mode (where
the function stays in the same section but BAT mapping is not emitted).
This diff only produces YAML profile for non-BAT functions (skipped,
non-simple). YAML profile for BAT functions is added in follow-up diffs:
- https://github.com/llvm/llvm-project/pull/76911 emits YAML profile with
internal control flow information only (branch profile),
- https://github.com/llvm/llvm-project/pull/76896 adds cross-function profile
(calls profile).
Test Plan: Added bolt/test/X86/bolt-address-translation-yaml.test
Reviewers: ayermolo, dcci, maksfb, rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/76910
Check if program header addresses fall into the kernel space to detect a
Linux kernel binary on x86-64.
Delete opts::LinuxKernelMode and use BinaryContext::IsLinuxKernel
instead.
Use MCAsmBackend::writeNopData() interface to emit NOP instructions on
x86. There are multiple forms of NOP instruction on x86 with different
sizes. Currently, LLVM's assembly/disassembly does not support all forms
correctly which can lead to a breakage of input code semantics, e.g. if
the program relies on NOP instructions for reserving a patch space.
Add "--keep-nops" option to preserve NOP instructions.
Closes https://github.com/llvm/llvm-project/issues/63097
Before merging please make sure the change to
bolt/include/bolt/Passes/StokeInfo.h is correct.
bolt/include/bolt/Passes/StokeInfo.h
```diff
// This Pass solves the two major problems to use the Stoke program without
- // proting its code:
+ // probing its code:
```
I'm still not happy about the awkward wording in this comment.
bolt/include/bolt/Passes/FixRelaxationPass.h
```
$ ed -s bolt/include/bolt/Passes/FixRelaxationPass.h <<<'9,12p'
// This file declares the FixRelaxations class, which locates instructions with
// wrong targets and fixes them. Such problems usually occures when linker
// relaxes (changes) instructions, but doesn't fix relocations types properly
// for them.
$
```
bolt/docs/doxygen.cfg.in
bolt/include/bolt/Core/BinaryContext.h
bolt/include/bolt/Core/BinaryFunction.h
bolt/include/bolt/Core/BinarySection.h
bolt/include/bolt/Core/DebugData.h
bolt/include/bolt/Core/DynoStats.h
bolt/include/bolt/Core/Exceptions.h
bolt/include/bolt/Core/MCPlusBuilder.h
bolt/include/bolt/Core/Relocation.h
bolt/include/bolt/Passes/FixRelaxationPass.h
bolt/include/bolt/Passes/InstrumentationSummary.h
bolt/include/bolt/Passes/ReorderAlgorithm.h
bolt/include/bolt/Passes/StackReachingUses.h
bolt/include/bolt/Passes/StokeInfo.h
bolt/include/bolt/Passes/TailDuplication.h
bolt/include/bolt/Profile/DataAggregator.h
bolt/include/bolt/Profile/DataReader.h
bolt/lib/Core/BinaryContext.cpp
bolt/lib/Core/BinarySection.cpp
bolt/lib/Core/DebugData.cpp
bolt/lib/Core/DynoStats.cpp
bolt/lib/Core/Relocation.cpp
bolt/lib/Passes/Instrumentation.cpp
bolt/lib/Passes/JTFootprintReduction.cpp
bolt/lib/Passes/ReorderData.cpp
bolt/lib/Passes/RetpolineInsertion.cpp
bolt/lib/Passes/ShrinkWrapping.cpp
bolt/lib/Passes/TailDuplication.cpp
bolt/lib/Rewrite/BoltDiff.cpp
bolt/lib/Rewrite/DWARFRewriter.cpp
bolt/lib/Rewrite/RewriteInstance.cpp
bolt/lib/Utils/CommandLineOpts.cpp
bolt/runtime/instr.cpp
bolt/test/AArch64/got-ld64-relaxation.test
bolt/test/AArch64/unmarked-data.test
bolt/test/X86/Inputs/dwarf5-cu-no-debug-addr-helper.s
bolt/test/X86/Inputs/linenumber.cpp
bolt/test/X86/double-jump.test
bolt/test/X86/dwarf5-call-pc-function-null-check.test
bolt/test/X86/dwarf5-split-dwarf4-monolithic.test
bolt/test/X86/dynrelocs.s
bolt/test/X86/fallthrough-to-noop.test
bolt/test/X86/tail-duplication-cache.s
bolt/test/runtime/X86/instrumentation-ind-calls.s
The dependence is needed since Utils includes VCSRevision.h, and other
LLVM components that include this header also have the llvm_vcsrevision_h
dependency.
Fixes#60460.
Reviewed By: #bolt, ayermolo
Differential Revision: https://reviews.llvm.org/D143101
Reuse getLTOCommonName in components other than Profile (to be used in Core)
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D142259
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
Differential Revision: https://reviews.llvm.org/D138376
This does *not* link with libLLVM, but with static archives instead. Not
super-great, but at least the build works, which is probably better than
failing.
Related to #57551
Differential Revision: https://reviews.llvm.org/D134434
In perf2bolt and `-aggregate-only` BOLT mode, the output profile file is written
in fdata format by default. Provide a knob `-profile-format=[fdata,yaml]` to
control the format.
Note that `-w` option still dumps in YAML format.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D133995
Prefer using these accessors to access the special sub-commands
corresponding to the top-level (no subcommand) and all sub-commands.
This is a preparatory step towards removing the use of ManagedStatic:
with a subsequent change, these global instances will be moved to
be regular function-scope statics.
It is split up to give downstream projects a (albeit short) window in
which they can switch to using the accessors in a forward-compatible
way.
Differential Revision: https://reviews.llvm.org/D129118
Add the option to run -equalize-bb-counts before shrink
wrapping to avoid unnecessarily optimizing some CFGs where profile is
inaccurate but we can prove two blocks have the same frequency.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126113
Handle the case where LLVM_REVISION is undefined (due to LLVM_APPEND_VC_REV=OFF
or otherwise) by setting "<unknown>" value as before D123549.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D123852
The cold text section alignment is set using the maximum alignment value
passed to the emitCodeAlignment. In order to calculate tentetive layout
right we will set the minimum alignment of such sections to the maximum
possible function alignment explicitly.
Differential Revision: https://reviews.llvm.org/D121392
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb, Amir
Differential Revision: https://reviews.llvm.org/D118271
Summary:
Refactor remaining bolt sources to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33345885)
Summary:
Make BOLT build in VisualStudio compiler and run without
crashing on a simple test. Other tests are not running.
(cherry picked from FBD32378736)
Summary:
Moves source files into separate components, and make explicit
component dependency on each other, so LLVM build system knows how to
build BOLT in BUILD_SHARED_LIBS=ON.
Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.
To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.
To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transfered to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).
Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.
(cherry picked from FBD32746834)