2448 Commits

Author SHA1 Message Date
Paschalis Mpeis
427725508b
[BOLT] Add getter for optional relocations (#133085)
Minor refactoring on comments.
2025-03-28 14:07:51 +00:00
Maksim Panchenko
96e5ee23a7
[BOLT][AArch64] Add partial support for lite mode (#133014)
In lite mode, we only emit code for a subset of functions while
preserving the original code in .bolt.org.text. This requires updating
code references in non-emitted functions to ensure that:

* Non-optimized versions of the optimized code never execute.
* Function pointer comparison semantics is preserved.

On x86-64, we can update code references in-place using "pending
relocations" added in scanExternalRefs(). However, on AArch64, this is
not always possible due to address range limitations and linker address
"relaxation".

There are two types of code-to-code references: control transfer (e.g.,
calls and branches) and function pointer materialization.
AArch64-specific control transfer instructions are covered by #116964.

For function pointer materialization, simply changing the immediate
field of an instruction is not always sufficient. In some cases, we need
to modify a pair of instructions, such as undoing linker relaxation and
converting NOP+ADR into an ADRP+ADD sequence.

To achieve this, we use the instruction patch mechanism instead of
pending relocations. Instruction patches are emitted via the regular MC
layer, just like regular functions. However, they have a fixed address
and do not have an associated symbol table entry. This allows us to make
more complex changes to the code, ensuring that function pointers are
correctly updated. This mechanism should also be portable to RISC-V and
other architectures.

To summarize, for AArch64, we extend the scanExternalRefs() process to
undo linker relaxation and use instruction patches to partially
overwrite unoptimized code.
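
The address arithmetic behind replacing NOP+ADR with ADRP+ADD can be sketched as follows. This is a minimal illustration, not BOLT code: `AdrpAddImm`, `fitsInAdr` and `splitForAdrpAdd` are hypothetical names, and only the architectural ranges are assumed (a signed 21-bit byte offset for ADR, a signed 21-bit count of 4 KiB pages for ADRP).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the two immediates an ADRP+ADD pair needs.
struct AdrpAddImm {
  int64_t PageDelta; // ADRP immediate, counted in 4 KiB pages
  uint32_t Lo12;     // ADD immediate: low 12 bits of the target address
};

// ADR encodes a signed 21-bit byte offset, so it reaches +/-1 MiB.
inline bool fitsInAdr(uint64_t InstAddr, uint64_t Target) {
  const int64_t Delta = static_cast<int64_t>(Target - InstAddr);
  const int64_t Limit = int64_t(1) << 20;
  return Delta >= -Limit && Delta < Limit;
}

// Split Target into the page delta for ADRP and the page offset for ADD.
inline AdrpAddImm splitForAdrpAdd(uint64_t AdrpAddr, uint64_t Target) {
  const int64_t PageDelta = static_cast<int64_t>(Target >> 12) -
                            static_cast<int64_t>(AdrpAddr >> 12);
  const int64_t Limit = int64_t(1) << 20; // signed 21-bit page count
  assert(PageDelta >= -Limit && PageDelta < Limit && "out of ADRP range");
  return {PageDelta, static_cast<uint32_t>(Target & 0xFFF)};
}
```

When the new location of a function is farther than 1 MiB from the referencing instruction, `fitsInAdr` fails and both instructions of the pair have to be rewritten, which is why changing a single immediate is not enough.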
2025-03-27 21:33:25 -07:00
Ash Dobrescu
a308d421aa
Remove -no-pie case from indirect-goto-relocs.test (#133067)
This test was added in PR:
https://github.com/llvm/llvm-project/pull/120267. The -no-pie case in
the above mentioned test needs to be removed as subsequent changes have
caused it to fail.
2025-03-26 11:11:55 +00:00
Anatoly Trosinenko
b6b40e9ac9
[BOLT] Gadget scanner: reformulate the state for data-flow analysis (#131898)
In preparation for implementing support for detection of non-protected
call instructions, refine the definition of state which is computed for
each register by data-flow analysis.

Explicitly marking the registers which are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in
`f_nonx30_ret_non_auted` test case.
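
A toy model of such per-register state, assuming a plain "trusted" bit per register and intersection at control-flow joins, might look like the sketch below. The names (`TrustedSet`, `entryState`, `merge`, `afterWrite`, `isReturnProtected`) and the choice of x30 as the only register trusted at entry are assumptions made for illustration; the actual state in the scanner may be richer.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

constexpr std::size_t NumRegs = 32; // assumption: x0..x30 plus a spare slot
using TrustedSet = std::bitset<NumRegs>;

// State at function entry: only explicitly listed registers are trusted.
inline TrustedSet entryState(std::initializer_list<unsigned> TrustedAtEntry) {
  TrustedSet S;
  for (unsigned Reg : TrustedAtEntry) {
    assert(Reg < NumRegs);
    S.set(Reg);
  }
  return S;
}

// At a join point a register is trusted only if it is trusted on all paths.
inline TrustedSet merge(const TrustedSet &A, const TrustedSet &B) {
  return A & B;
}

// A write makes the register trusted only if the value was authenticated.
inline TrustedSet afterWrite(TrustedSet S, unsigned Reg, bool Authenticated) {
  assert(Reg < NumRegs);
  S.set(Reg, Authenticated);
  return S;
}

// A return through RetReg is considered protected if RetReg is trusted.
inline bool isReturnProtected(const TrustedSet &S, unsigned RetReg) {
  return S.test(RetReg);
}
```

With an explicit entry state, a return through a register that was never marked trusted (such as `ret x1`) is reported even if nothing in the function ever wrote to it.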
2025-03-25 21:45:02 +03:00
Kazu Hirata
993311799b [BOLT] Fix a warning
This patch fixes:

  bolt/lib/Passes/PAuthGadgetScanner.cpp:438:18: error: unused
  variable 'BC' [-Werror,-Wunused-variable]
2025-03-21 11:08:27 -07:00
Anatoly Trosinenko
72d1058af0
[BOLT] Gadget scanner: refactor analysis of RET instructions (#131897)
In preparation for implementing detection of more gadget kinds,
refactor checking for non-protected return instructions.
2025-03-21 19:54:57 +03:00
Paschalis Mpeis
6bbd45dec7
[NFC][BOLT] Refactor ForcePatch option (#127812)
Move force-patch flag to CommandLineOpts and add details on
PatchEntries.
2025-03-21 15:55:09 +00:00
Anatoly Trosinenko
03557169e0
[BOLT] Gadget scanner: streamline issue reporting (#131896)
In preparation for adding more gadget kinds to detect, streamline
issue reporting.

Rename classes representing issue reports. In particular, rename
`Annotation` base class to `Report`, as it has nothing to do with
"annotations" in `MCPlus` terms anymore. Remove references to "return
instructions" from variable names and report messages, use generic
terms instead. Rename NonPacProtectedRetAnalysis to PAuthGadgetScanner.

Remove `GeneralDiagnostic` as a separate class, make `GenericReport`
(former `GenDiag`) store `std::string Text` directly. Remove unused
`operator=` and `operator==` methods, as `Report`s are created on the
heap and referenced via `shared_ptr`s.

Introduce `GadgetKind` class - currently, it only wraps a `const char *`
description to display to the user. This description is intended to be
a per-gadget-kind constant (or a few hard-coded constants), so no need
to store it to `std::string` field in each report instance. To handle
both free-form `GenericReport`s and statically-allocated messages
without unnecessary overhead, move printing of the report header to the
base class (and take the message argument as a `StringRef`).
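
The resulting shape can be sketched roughly as below. This is a simplified stand-in rather than the code from the patch: `std::string_view` replaces `StringRef`, the `"report: "` prefix and the `GadgetReport` and `render` names are invented for the example, and only the header-printing path is shown.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>
#include <utility>

// Wraps a per-kind constant description; nothing is copied per report.
class GadgetKind {
  const char *Description;

public:
  explicit GadgetKind(const char *Desc) : Description(Desc) {
    assert(Desc && "a gadget kind needs a description");
  }
  std::string_view getDescription() const { return Description; }
};

class Report {
public:
  virtual ~Report() = default;
  virtual void print(std::ostream &OS) const = 0;

protected:
  // Shared header printing; the message is taken as a non-owning view.
  void printHeader(std::ostream &OS, std::string_view Msg) const {
    OS << "report: " << Msg << '\n';
  }
};

// Report for a known gadget kind: refers to a statically allocated message.
class GadgetReport : public Report {
  const GadgetKind &Kind;

public:
  explicit GadgetReport(const GadgetKind &K) : Kind(K) {}
  void print(std::ostream &OS) const override {
    printHeader(OS, Kind.getDescription());
  }
};

// Free-form report: owns its text directly.
class GenericReport : public Report {
  std::string Text;

public:
  explicit GenericReport(std::string T) : Text(std::move(T)) {}
  void print(std::ostream &OS) const override { printHeader(OS, Text); }
};

// Convenience helper for the example: render a report into a string.
inline std::string render(const Report &R) {
  std::ostringstream OS;
  R.print(OS);
  return OS.str();
}
```

Because the header printer accepts a view, both the constant description and the owned `std::string` reach it without an extra copy.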
2025-03-21 11:19:53 +03:00
Fangrui Song
42a8813757 [RISCV] Rename VariantKind to Specifier
Follow the X86 and Mips renaming.

> "Relocation modifier" suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation.
> "Relocation specifier" is clear, aligns with Arm and IBM AIX's documentation, and fits the assembler's role seamlessly.

In addition, rename *MCExpr::getKind, which confusingly shadows the base class getKind.
2025-03-20 22:25:57 -07:00
Paschalis Mpeis
5f6d9b45e9
[BOLT] Make Relocations a class and add optional field (#131638)
This patch converts `Relocations` from a struct to a class, and
introduces the `Optional` field. Patch #116964 will use it.

Some optimizations, like `scanExternalRefs`, create relocations that
patch the old code. Under certain circumstances these may be skipped
without correctness implications.
2025-03-20 17:16:14 +00:00
Kazu Hirata
10624e67c3 [BOLT] Fix warnings
bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp:62:13: error: unused
  function 'traceInst' [-Werror,-Wunused-function]

  bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp:68:13: error: unused
  function 'traceReg' [-Werror,-Wunused-function]

  bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp:80:13: error: unused
  function 'traceRegMask' [-Werror,-Wunused-function]
2025-03-20 10:12:46 -07:00
Anatoly Trosinenko
482b95217e
[BOLT] Gadget scanner: factor out utility code (#131895)
Factor out the code for mapping from physical registers to consecutive
array indexes.

Introduce helper functions to print instructions and registers to
prevent mixing of analysis logic and implementation details of debug
output.

Remove the debug printing from `Gadget::generateReport`, as it doesn't
seem to add important information to what was already printed in the
report itself.
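
A mapping from physical registers to consecutive array indexes can be sketched as a small lookup table. The class below is a hypothetical illustration (the name `TrackedRegisters` and its interface are assumptions, not the API added by the patch); it shows how per-register state can live in a dense array even when the tracked register numbers are sparse.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

class TrackedRegisters {
  std::vector<unsigned> Registers; // dense index -> register number
  std::vector<int> RegToIndex;     // register number -> dense index, or -1

public:
  // Regs lists the registers to track; NumRegs bounds the register numbers.
  TrackedRegisters(const std::vector<unsigned> &Regs, unsigned NumRegs)
      : Registers(Regs), RegToIndex(NumRegs, -1) {
    for (std::size_t I = 0; I < Regs.size(); ++I) {
      assert(Regs[I] < NumRegs && "register number out of range");
      RegToIndex[Regs[I]] = static_cast<int>(I);
    }
  }

  std::size_t size() const { return Registers.size(); }

  // Returns the dense index of Reg, or nullopt if Reg is not tracked.
  std::optional<std::size_t> indexOf(unsigned Reg) const {
    if (Reg >= RegToIndex.size() || RegToIndex[Reg] < 0)
      return std::nullopt;
    return static_cast<std::size_t>(RegToIndex[Reg]);
  }

  unsigned registerAt(std::size_t Index) const {
    assert(Index < Registers.size());
    return Registers[Index];
  }
};
```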
2025-03-20 19:35:31 +03:00
Ash Dobrescu
3bba268013
[BOLT] Support computed goto and allow map addrs inside functions (#120267)
Create entry points for addresses referenced by dynamic relocations and
allow getNewFunctionOrDataAddress to map addresses inside functions.
Adding the addresses referenced by dynamic relocations as entry points
fixes an issue where BOLT fails on code that uses computed gotos.
This also fixes a mapping issue with the bugfix from this PR:
https://github.com/llvm/llvm-project/pull/117766.
2025-03-19 14:55:59 +00:00
Maksim Panchenko
70bf5e514b
[BOLT][AArch64] Symbolize ADRP after relaxation (#131414)
When the linker relaxes a GOT load, it changes an ADRP+LDR instruction pair
into ADRP+ADD. It is relatively straightforward to detect and symbolize
the second instruction in the disassembler. However, it is not always
possible to properly symbolize the ADRP instruction without looking at
the second instruction. Hence, we have the FixRelaxationPass that adjusts
the operand of ADRP by looking at the corresponding ADD.

This PR tries to properly symbolize ADRP earlier in the pipeline, i.e.
in AArch64MCSymbolizer. This change makes it easier to adjust the
instruction once we add AArch64 support in `scanExternalRefs()`.
Additionally, we get a benefit of looking at proper operands while
observing the function state prior to running FixRelaxationPass.

To disambiguate the operand of ADRP that has a GOT relocation against
it, we look at the contents/value of the operand. If it contains an
address of a page that is valid for GOT, we assume that the operand
wasn't modified by the linker and leave it up to FixRelaxationPass to do
a proper adjustment. If the page referenced by ADRP cannot point to GOT,
then it's an indication that the linker has modified the operand and we
substitute the operand with a non-GOT reference to the symbol.
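
The page test described above can be sketched as follows, assuming the GOT is known as a half-open address range and the ADRP immediate has already been decoded as a signed page count. `AddressRange`, `adrpTargetPage`, `pageOverlaps` and `mayStillReferenceGot` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

struct AddressRange {
  uint64_t Start; // inclusive
  uint64_t End;   // exclusive
};

// ADRP yields the 4 KiB page of the instruction plus a signed page count.
inline uint64_t adrpTargetPage(uint64_t InstAddr, int64_t PageImm) {
  const uint64_t InstPage = InstAddr & ~uint64_t(0xFFF);
  return InstPage + static_cast<uint64_t>(PageImm * 4096);
}

// True if the 4 KiB page starting at Page shares any byte with R.
inline bool pageOverlaps(uint64_t Page, const AddressRange &R) {
  assert(R.Start < R.End && "empty range");
  return Page < R.End && R.Start < Page + 0x1000;
}

// If the page can hold a GOT entry, assume the linker left ADRP untouched;
// otherwise the operand was rewritten and needs a non-GOT reference.
inline bool mayStillReferenceGot(uint64_t InstAddr, int64_t PageImm,
                                 const AddressRange &Got) {
  return pageOverlaps(adrpTargetPage(InstAddr, PageImm), Got);
}
```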
2025-03-18 14:31:31 -07:00
Kazu Hirata
c72f7958b0 [BOLT] Fix the build
This is a follow-up for:

  commit 3c4b9317916ccd2e18c30b1540589518a4c7c88a
  Author: Fangrui Song <i@maskray.me>
  Date:   Mon Mar 17 20:05:28 2025 -0700
2025-03-17 20:18:34 -07:00
Anatoly Trosinenko
4f2ee07454
[BOLT][AArch64] Do not crash on authenticated branch instructions (#129898)
When an indirect branch instruction is decoded, analyzeIndirectBranch
method is asked if this is a well-known code pattern. On AArch64, the
only special pattern which is detected is Jump Table, emitted as a
branch to the sum of a constant base address and a variable offset.
Therefore, if `Inst.getOpcode()` is one of `AArch64::BRA*`, `Inst` cannot
belong to such a Jump Table pattern, so return early.
2025-03-17 12:00:05 +03:00
Kazu Hirata
4b1b629d60 [BOLT] Fix a warning
This patch fixes:

  bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp:128:20: error:
  unused variable 'SymbolPageAddr' [-Werror,-Wunused-variable]
2025-03-14 19:20:03 -07:00
Maksim Panchenko
bac21719a8
[BOLT] Pass unfiltered relocations to disassembler. NFCI (#131202)
Instead of filtering and modifying relocations in readRelocations(),
preserve the relocation info and use it in the symbolizing disassembler.
This change mostly affects AArch64, where we need to look at original
linker relocations in order to properly symbolize instruction operands.
2025-03-14 18:44:33 -07:00
Paschalis Mpeis
2f9d94981c
[BOLT] Change Relocation Type to 32-bit NFCI (#130792)
2025-03-14 18:15:59 +00:00
Kazu Hirata
03614b9a8a
[BOLT] Workaround failures (#131245)
These tests have been failing since:

  commit 1cfca53b9f2eadbf864b85995ec7f819d7f29b5e
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Wed Mar 12 16:20:13 2025 -0700

This patch works around the failures by removing some FileCheck
directives.  Hopefully, BOLT folks can chime in and commit a proper
fix.
2025-03-13 20:55:43 -07:00
Nikita Popov
f137c3d592
[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940)
This avoids doing a Triple -> std::string -> Triple round trip in lots
of places, now that the Module stores a Triple.
2025-03-12 17:35:09 +01:00
Maksim Panchenko
a28daa7c1a
[BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI (#129980)
We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were a result of linker "relaxation" optimization. However,
these relocations will be useful for reversing the linker optimization.
Keep the relocations and ignore them while symbolizing ADR instruction
operands.
2025-03-05 23:06:01 -08:00
chrisPyr
038fff3f24
[NFC][BOLT] Make file-local cl::opt global variables static (#126472)
#125983
2025-03-05 22:11:05 -08:00
Yevhen Babiichuk (DustDFG)
36cd60144b
[BOLT] Remove unexisting targets from bolt dockerfile (#122321)
`perf2bolt` and `llvm-boltdiff` are no longer separate targets; they are
symlinks to `llvm-bolt` created when `llvm-bolt` itself is installed. When
you try to build them, ninja reports that no such targets exist.
2025-03-05 09:23:06 +00:00
Eric Wang
fcb65ad2a2
[BOLT] Fix kernel version check for THP in hugify (#129380)
BOLT --hugify does not work in kernel 6.x.

Co-authored-by: rfwang07 <wangrufeng5@huawei.com>
2025-03-04 20:38:41 -08:00
Maksim Panchenko
b971d4d7c8
[BOLT][AArch64] Add symbolizer for AArch64 disassembler. NFCI (#127969)
Add AArch64MCSymbolizer that symbolizes `MCInst` operands during
disassembly. The symbolization was previously done in
`BinaryFunction::disassemble()`, but it is also required by
`scanExternalRefs()` for "lite" mode functionality. Hence, similar to
x86, I've implemented the symbolizer interface that uses
`BinaryFunction` relocations to properly create instruction operands. I
expect the result of the disassembly to be identical after the change.

AArch64 disassembler was not calling `tryAddingSymbolicOperand()` for
`MOV` instructions. Fix that. Additionally, the disassembler marks `ldr`
instructions as branches by setting `IsBranch` parameter to true. Ignore
the parameter and rely on `MCPlusBuilder` interface instead.

I've modified the `--check-encoding` flag to check symbolization of operands
of instructions that have relocations against them.
2025-03-03 12:44:28 -08:00
Maksim Panchenko
6a161cbfd4
[BOLT] Remove BinaryFunction::IsPatched. NFC (#129461)
BinaryFunction::IsPatched is no longer used.
2025-03-02 23:40:02 -08:00
Fangrui Song
74638f1634 [test] Replace .data.rel.ro with .section .data.rel.ro,"aw"
to avoid using the extension unsupported by gas.
2025-03-01 20:55:17 -08:00
Maksim Panchenko
5a11912ece
[BOLT] Refactor interface for creating instruction patches. NFCI (#129404)
Add BinaryContext::createInstructionPatch() interface for patching parts
of the original binary with new instruction sequences. Refactor
PatchEntries pass to use the new interface.
2025-03-01 19:20:17 -08:00
Maksim Panchenko
8910e41c86
[BOLT][AArch64] Refactor ADR to ADRP+ADD conversion pass. NFCI (#129399)
In preparation for using the new interface in more places, refactor the
ADR conversion pass.
2025-03-01 14:10:59 -08:00
Maksim Panchenko
074c2c6713
[BOLT] Refactor MCInst target symbol lookup. NFCI (#129131)
In analyzeInstructionForFuncReference(), use MCPlusBuilder interface
while scanning symbolic operands of MCInst. Should be NFC on x86, but
will make the function work on other architectures. Note that it's
currently unused on non-x86 as its functionality is exclusive to safe
ICF that runs on x86 only.
2025-02-28 17:57:54 -08:00
ShatianWang
7e33bebe7c
[BOLT] Report flow conservation scores (#127954)
Add two additional profile quality stats for CG (call graph) and CFG
(control flow graph) flow conservations besides the CFG discontinuity
stats introduced in #109683. The two new stats quantify how different
"in-flow" is from "out-flow" in the following cases where they should be
equal. The smaller the reported stats, the better the flow conservations
are.

CG flow conservation: for each function that is not a program entry, the
number of times the function is called according to CG ("in-flow")
should be equal to the number of times the transition from an entry
basic block of the function to another basic block within the function
is recorded ("out-flow").

CFG flow conservation: for each basic block that is not a function entry
or exit, the number of times the transition into this basic block from
another basic block within the function is recorded ("in-flow") should
be equal to the number of times the transition from this basic block to
another basic block within the function is recorded ("out-flow").

Use `-v=1` for more detailed bucketed stats, and use `-v=2` to dump
functions / basic blocks with bad flow conservations.
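
As a rough illustration of the idea, the sketch below measures how far in-flow and out-flow diverge for a basic block. The normalization used here, `|in - out| / max(in, out)`, is an assumption chosen for the example and is not necessarily the formula BOLT reports; `BlockFlow`, `flowGap` and `worstCfgGap` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct BlockFlow {
  uint64_t InFlow;    // recorded transitions into the block
  uint64_t OutFlow;   // recorded transitions out of the block
  bool IsEntryOrExit; // entry and exit blocks are exempt from the check
};

// 0.0 means perfectly conserved flow; 1.0 means one side is entirely missing.
inline double flowGap(uint64_t In, uint64_t Out) {
  const uint64_t Max = std::max(In, Out);
  if (Max == 0)
    return 0.0;
  const uint64_t Diff = In > Out ? In - Out : Out - In;
  return static_cast<double>(Diff) / static_cast<double>(Max);
}

// Largest gap over all blocks that are neither function entries nor exits.
inline double worstCfgGap(const std::vector<BlockFlow> &Blocks) {
  double Worst = 0.0;
  for (const BlockFlow &B : Blocks) {
    if (B.IsEntryOrExit)
      continue;
    const double Gap = flowGap(B.InFlow, B.OutFlow);
    assert(Gap >= 0.0 && Gap <= 1.0);
    Worst = std::max(Worst, Gap);
  }
  return Worst;
}
```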
2025-02-28 11:06:52 -05:00
YongKang Zhu
5401c675eb
[BOLT][instr] Avoid WX segment (#128982)
A BOLT-instrumented binary today has a segment that is readable (R),
writable (W) and executable (X), which the Android system refuses to load
because of its WX attribute. The RWX segment arises because BOLT links in two
steps: first everything in the updated or rewritten input binary, then the
runtime library. Each link lays out sections in the order RX, then RO, then
RW. As a result, the RW section `.bolt.instr.counters` can end up surrounded
by a number of RO and RX sections, and the new text segment, formed to cover
all RX sections, also includes the RW section in the middle, hence the RWX
segment. One way to fix this is to separate the RW `.bolt.instr.counters`
section into its own segment by (a) assigning regular page-aligned starting
addresses to `.bolt.instr.counters` and the section that follows it, and
(b) creating two extra program headers accordingly.
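
The page-alignment step can be sketched as below. This is an illustration under assumptions, not the layout code from the patch: `alignUp`, `CountersPlacement` and `placeCounters` are hypothetical names, and the page size is passed in rather than queried from the target.

```cpp
#include <cassert>
#include <cstdint>

inline bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Round Addr up to the next multiple of Align (Align must be a power of two).
inline uint64_t alignUp(uint64_t Addr, uint64_t Align) {
  assert(isPowerOfTwo(Align) && "alignment must be a power of two");
  return (Addr + Align - 1) & ~(Align - 1);
}

struct CountersPlacement {
  uint64_t CountersStart;    // page-aligned start of the RW counters section
  uint64_t NextSectionStart; // page-aligned start of the section after it
};

// Give the counters section whole pages so it can sit in its own RW segment.
inline CountersPlacement placeCounters(uint64_t PrevSectionEnd,
                                       uint64_t CountersSize,
                                       uint64_t PageSize) {
  const uint64_t Start = alignUp(PrevSectionEnd, PageSize);
  const uint64_t Next = alignUp(Start + CountersSize, PageSize);
  return {Start, Next};
}
```

Since no page is shared between the counters and their neighbours, the RW permissions of the counters segment never spill over into executable code.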
2025-02-27 16:13:57 -08:00
Amir Ayupov
f567524399
[BOLT] Fix doTrace in BAT mode (#128546)
When processing BOLTed binaries with BAT section, we used to
indiscriminately use `BAT->getFallthroughsInTrace` to record
fall-throughs, even if the function is not covered by BAT.

Fix that by using non-BAT CFG-based `getFallthroughsInTrace` if the
function is not in BAT.

Test Plan: updated bolt-address-translation-yaml.test
2025-02-25 10:56:13 -08:00
Amir Ayupov
3968ebd00d
[BOLT] Keep multi-entry functions simple in aggregation mode (#128253)
BOLT used to mark multi-entry functions non-simple in non-relocation
mode with the reasoning that we can't move them due to potentially
undetected references. However, in aggregation mode it doesn't apply as
BOLT doesn't perform optimizations.

Relax this constraint in case of an aggregation job.

Test Plan: added entry-point-fallthru.s
2025-02-25 10:53:45 -08:00
Kristof Beyls
6c61c55756
[BOLT] pacret-scanner: fix regression test failure (#128576)
... which is caused by a seemingly recent change in BOLT's basic block
calculation, where function calls seem to be ending basic blocks? I
don't have a pointer to the commit that caused this change. I'll be
looking for that later. For now, I'm trying to get the regression tests
passing again.
2025-02-24 21:08:43 +00:00
Kristof Beyls
55c76ea391
[BOLT] pacret-scanner: fix regression tests... (#128565)
by making the regex to match basic block names more general. See failing
test case that was reported on some system in comment
https://github.com/llvm/llvm-project/pull/122304#issuecomment-2679460678

These test cases were introduced in PR #122304, commit
850b49297615a613ac83adca2c9cf823a4b8ef95 .
2025-02-24 20:24:12 +00:00
Christian Sigg
0770afb88e [bolt] Remove unnecessary include.
... which introduced a testing dependency in
850b492976
2025-02-24 09:05:40 +01:00
Kristof Beyls
850b492976
[BOLT][binary-analysis] Add initial pac-ret gadget scanner (#122304)
This adds an initial pac-ret gadget scanner to the
llvm-bolt-binary-analysis-tool.

The scanner is taken from the prototype that was published last year at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype,
and has been discussed in RFC

https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and in the EuroLLVM 2024 keynote "Does LLVM implement security
hardenings correctly? A BOLT-based static analyzer to the rescue?"
[Video](https://youtu.be/Sn_Fxa0tdpY)
[Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf)

In the spirit of incremental development, this PR aims to add a minimal
implementation that is "fully working" on its own, but has major
limitations, as described in the bolt/docs/BinaryAnalysis.md
documentation in this proposed commit. These and other limitations will
be fixed in follow-on PRs, mostly based on code already existing in the
prototype branch. I hope incrementally upstreaming will make it easier
to review the code.

Note that I believe that this could also form the basis of a scanner to
analyze correct implementation of PAuthABI.
2025-02-24 07:26:28 +00:00
Amir Ayupov
209252f3d5
[BOLT] Introduce skip-inline flag (#128135)
Introduce exclusion list for inlining, allowing more fine-grained
control than using skip-funcs.

Test Plan: added skip-inline.s
2025-02-21 09:10:53 -08:00
YongKang Zhu
9fa77c1854
[BOLT][Linker][NFC] Remove lookupSymbol() in favor of lookupSymbolInfo() (#128070)
Sometimes we need to know the size of a symbol in addition to its address,
so start using the existing `BOLTLinker::lookupSymbolInfo()` (which returns
the symbol address and size) and remove `BOLTLinker::lookupSymbol()` (which
only returns the symbol address). Both return values are wrapped in
`std::optional<>` and have to be checked, which makes the difference
between the two even smaller.
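
A toy version of this interface shows why a separate address-only lookup adds little. `ToyLinker`, `SymbolInfo` and `addressOf` below are stand-ins written for this example and do not reproduce the actual `BOLTLinker` declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

struct SymbolInfo {
  uint64_t Address;
  uint64_t Size;
};

class ToyLinker {
  std::unordered_map<std::string, SymbolInfo> Symbols;

public:
  void define(const std::string &Name, SymbolInfo Info) {
    assert(!Name.empty() && "symbols need a name");
    Symbols[Name] = Info;
  }

  // Returns address and size, or nullopt if the symbol is unknown.
  std::optional<SymbolInfo> lookupSymbolInfo(const std::string &Name) const {
    auto It = Symbols.find(Name);
    if (It == Symbols.end())
      return std::nullopt;
    return It->second;
  }
};

// Address-only callers read one field of the same checked result.
inline std::optional<uint64_t> addressOf(const ToyLinker &L,
                                         const std::string &Name) {
  if (std::optional<SymbolInfo> Info = L.lookupSymbolInfo(Name))
    return Info->Address;
  return std::nullopt;
}
```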
2025-02-20 17:14:33 -08:00
Maksim Panchenko
0ba391a85f
[BOLT] Improve constant island disassembly (#127971)
* Add label that identifies constant island.
* Support cases where the island is located after the function.
2025-02-20 11:16:01 -08:00
YongKang Zhu
19bad2ac4a
[BOLT][NFC] Fix an incorrect address used in a BOLT-INFO message (#127902)
2025-02-19 16:57:18 -08:00
Nikita Popov
e235fcb582
[BOLT] Only link and initialize supported targets (#127509)
Bolt currently links and initializes all LLVM targets. This
substantially increases the binary size, and link time if LTO is used.

Instead, only link the targets specified by BOLT_TARGETS_TO_BUILD. We
also have to only initialize those targets, so generate a
TargetConfig.def file with the necessary information. The way the
initialization is done mirrors what llvm-exegesis does.

This reduces llvm-bolt size from 137MB to 78MB for me.
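
The general pattern of driving initialization from a generated list of targets can be sketched with an X-macro. The list below is hard-coded and every name in it (`TOY_ENABLED_TARGETS`, `initialize...Target`, `initializeEnabledTargets`) is made up for the example; the actual contents and format of TargetConfig.def are not shown in this commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a generated .def file: one X(...) entry per enabled target.
#define TOY_ENABLED_TARGETS(X) X(AArch64) X(X86)

// Expand the list once to define a per-target initializer.
#define TOY_DEFINE_INIT(Name)                                      \
  inline void initialize##Name##Target(                            \
      std::vector<std::string> &Log) {                             \
    Log.push_back(#Name);                                          \
  }
TOY_ENABLED_TARGETS(TOY_DEFINE_INIT)
#undef TOY_DEFINE_INIT

// Expand it again to count the entries at compile time.
#define TOY_COUNT(Name) +1
constexpr std::size_t NumEnabledTargets = 0 TOY_ENABLED_TARGETS(TOY_COUNT);
#undef TOY_COUNT

// Expand it a third time to call only the initializers that exist.
inline void initializeEnabledTargets(std::vector<std::string> &Log) {
  const std::size_t Before = Log.size();
#define TOY_CALL_INIT(Name) initialize##Name##Target(Log);
  TOY_ENABLED_TARGETS(TOY_CALL_INIT)
#undef TOY_CALL_INIT
  assert(Log.size() - Before == NumEnabledTargets);
}
```

A target that is left out of the list is never referenced, so the linker has no reason to pull its code into the binary.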
2025-02-18 09:17:51 +01:00
Amir Ayupov
61acfb07e8
[BOLT] Add pre-aggregated trace support (#127125)
Traces are triplets of branch source, target, and fall-through end (next
branch).

Traces simplify differentiation of fall-throughs into local- and
external-origin, which improves performance over a profile with
undifferentiated fall-throughs by eliminating profile discontinuity in
call-to-continuation fall-throughs. This makes it possible to avoid
converting a return profile into a call-to-continuation profile, which may
introduce statistical biases.

The existing format makes provisions for local- (F) and external- (f)
origin fall-throughs, but the profile producer needs to know function
boundaries. BOLT has that information readily available, so providing
the origin branch of a fall-through is a functional replacement of the
fall-through kind (f or F). This also has an effect of combining
branches and fall-throughs into a single record.

As traces subsume other pre-aggregated profile kinds, BOLT may drop
support for them soon. Users of pre-aggregated profile format are
advised to migrate to the trace format.
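
How one trace record carries both a branch and a fall-through, and how the origin kind can be derived once function boundaries are known, can be sketched as follows. The struct layout, the `Count` field and the function names are assumptions for illustration; the on-disk pre-aggregated format is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// A trace: branch source, branch target, and the end of the fall-through.
struct Trace {
  uint64_t BranchSrc;
  uint64_t BranchDst;
  uint64_t FallthroughEnd;
  uint64_t Count; // assumption: how many times the trace was observed
};

struct Branch {
  uint64_t From, To, Count;
};

struct Fallthrough {
  uint64_t Start, End, Count;
};

struct FunctionRange {
  uint64_t Start; // inclusive
  uint64_t End;   // exclusive
  bool contains(uint64_t Addr) const { return Addr >= Start && Addr < End; }
};

// One trace expands into a taken branch plus the fall-through that follows.
inline std::pair<Branch, Fallthrough> splitTrace(const Trace &T) {
  assert(T.BranchDst <= T.FallthroughEnd && "fall-through runs forward");
  return {Branch{T.BranchSrc, T.BranchDst, T.Count},
          Fallthrough{T.BranchDst, T.FallthroughEnd, T.Count}};
}

// The fall-through has a local origin if the branch that started it lies in
// the same function as the fall-through itself.
inline bool hasLocalOrigin(const Trace &T, const FunctionRange &F) {
  return F.contains(T.BranchDst) && F.contains(T.BranchSrc);
}
```

The consumer, which knows the function boundaries, can recover the local (F) or external (f) kind from the source address, so the profile producer no longer needs to classify fall-throughs itself.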

Test Plan: Updated callcont-fallthru.s
2025-02-13 15:14:56 -08:00
Paschalis Mpeis
385af283cd
[BOLT] Prevent addRelocation from adding pending relocs (#123635)
`addPendingRelocation` is the only way to add a pending
relocation. Can no longer use `addRelocation` for this.

Update the only user (`BinaryContextTester`).
2025-02-12 15:24:11 +00:00
Nikita Popov
0abe058d7f
[BOLT] Use getMainExecutable() (#126698)
Use LLVM's getMainExecutable() helper instead of rolling our own. This
will result in standard behavior across platforms, such as making sure
that symlinks are always resolved.
2025-02-12 09:44:26 +01:00
YongKang Zhu
1e0a489671
[BOLT] Resolve symlink for library lookup (#126386)
2025-02-08 14:02:46 -08:00
Amir Ayupov
b884be8640
[BOLT] Exit with error code on missing DWO CU (#125976)
If BOLT fails to locate DWO CU when using split DWARF, this signifies an
issue with the input (missing .dwo) rather than an internal assertion.
2025-02-06 10:01:12 -08:00
Maksim Panchenko
3115278c4e [BOLT] Fixup for commit 137c378/#125961
2025-02-06 00:26:20 -08:00