Adds a function to create a LinkGraph of absolute symbols, and a
callback in dynamic library search generators to enable using it to
expose its symbols to the platform/orc runtime. This allows e.g. using
__orc_rt_run_program to run a precompiled function that was found via
dlsym. Ideally we would use this in llvm-jitlink's own search generator,
but it will require more work to align with the Process/Platform
JITDylib split, so not handled here.
As part of this change we need to handle LinkGraphs that only have
absolute symbols.
This could happen, for example, when instrumenting an AArch64 binary on
an x86 host because the instrumentation library is always built for the
host.
Note that this check will probably need to be refined in the future as
merely having the same architecture does not guarantee objects can be
linked. For example, on RISC-V, the float ABI of all objects should
match.
In large code model, the address of GOT is calculated by the
static linker via R_X86_GOTPC64 reloc applied against a MOVABSQ
instruction. In the final binary, it can be disassembled as a regular
immediate, but because such immediate is the result of PC-relative
pointer arithmetic, we need to parse this relocation and update this
calculation whenever we move code, otherwise we break the code trying
to read GOT.
A test case showing how GOT is accessed was provided.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D158911
Calls on RISC-V are typically compiled to `auipc`/`jalr` pairs to allow
a maximum target range (32-bit pc-relative). In order to optimize calls
to near targets, linker relaxation may replace those pairs with, for
example, single `jal` instructions.
To allow BOLT to freely reassign function addresses in relaxed binaries,
this patch proposes the following approach:
- Expand all relaxed calls back to `auipc`/`jalr`;
- Rely on JITLink to relax those back to shorter forms where possible.
This is implemented by detecting all possible call instructions and
replacing them with `PseudoCALL` (or `PseudoTAIL`) instructions. The
RISC-V backend then expands those and adds the necessary relocations for
relaxation.
Since BOLT generally ignores pseudo instruction, this patch makes
`MCPlusBuilder::isPseudo` virtual so that `RISCVMCPlusBuilder` can
override it to exclude `PseudoCALL` and `PseudoTAIL`.
To ensure JITLink knows about the correct section addresses while
relaxing, reassignment of addresses has been moved to a post-allocation
pass. Note that this is probably the time it had to be done in the
first place since in `notifyResolved` (where it was done before), all
symbols are supposed to be resolved already.
Depends on D159082
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D159089
BOLT uses `MCAsmLayout` to calculate the output values of functions and
basic blocks. This means output values are calculated based on a
pre-linking state and any changes to symbol values during linking will
cause incorrect values to be used.
This issue can be triggered by enabling linker relaxation on RISC-V.
Since linker relaxation can remove instructions, symbol values may
change. This causes, among other things, the symbol table created by
BOLT in the output executable to be incorrect.
This patch solves this issue by using `BOLTLinker` to get symbol values
instead of `MCAsmLayout`. This way, output values are calculated based
on a post-linking state. To make sure the linker can update all
necessary symbols, this patch also makes sure all these symbols are not
marked as temporary so that they end-up in the object file's symbol
table.
Note that this patch only deals with symbols of binary functions
(`BinaryFunction::updateOutputValues`). The technique described above
turned out to be too expensive for basic block symbols so those are
handled differently in D155604.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D154604
RuntimeDyld has been deprecated in favor of JITLink. [1] This patch
replaces all uses of RuntimeDyld in BOLT with JITLink.
Care has been taken to minimize the impact on the code structure in
order to ease the inspection of this (rather large) changeset. Since
BOLT relied on the RuntimeDyld API in multiple places, this wasn't
always possible though and I'll explain the changes in code structure
first.
Design note: BOLT uses a JIT linker to perform what essentially is
static linking. No linked code is ever executed; the result of linking
is simply written back to an executable file. For this reason, I
restricted myself to the use of the core JITLink library and avoided ORC
as much as possible.
RuntimeDyld contains methods for loading objects (loadObject) and symbol
lookup (getSymbol). Since JITLink doesn't provide a class with a similar
interface, the BOLTLinker abstract class was added to implement it. It
was added to Core since both the Rewrite and RuntimeLibs libraries make
use of it. Wherever a RuntimeDyld object was used before, it was
replaced with a BOLTLinker object.
There is one major difference between the RuntimeDyld and BOLTLinker
interfaces: in JITLink, section allocation and the application of fixups
(relocation) happens in a single call (jitlink::link). That is, there is
no separate method like finalizeWithMemoryManagerLocking in RuntimeDyld.
BOLT used to remap sections between allocating (loadObject) and linking
them (finalizeWithMemoryManagerLocking). This doesn't work anymore with
JITLink. Instead, BOLTLinker::loadObject accepts a callback that is
called before fixups are applied which is used to remap sections.
The actual implementation of the BOLTLinker interface lives in the
JITLinkLinker class in the Rewrite library. It's the only part of the
BOLT code that should directly interact with the JITLink API.
For loading object, JITLinkLinker first creates a LinkGraph
(jitlink::createLinkGraphFromObject) and then links it (jitlink::link).
For the latter, it uses a custom JITLinkContext with the following
properties:
- Use BOLT's ExecutableFileMemoryManager. This one was updated to
implement the JITLinkMemoryManager interface. Since BOLT never
executes code, its finalization step is a no-op.
- Pass config: don't use the default target passes since they modify
DWARF sections in a way that seems incompatible with BOLT. Also run a
custom pre-prune pass that makes sure sections without symbols are not
pruned by JITLink.
- Implement symbol lookup. This used to be implemented by
BOLTSymbolResolver.
- Call the section mapper callback before the final linking step.
- Copy symbol values when the LinkGraph is resolved. Symbols are stored
inside JITLinkLinker to ensure that later objects (i.e.,
instrumentation libraries) can find them. This functionality used to
be provided by RuntimeDyld but I did not find a way to use JITLink
directly for this.
Some more minor points of interest:
- BinarySection::SectionID: JITLink doesn't have something equivalent to
RuntimeDyld's Section IDs. Instead, sections can only be referred to
by name. Hence, SectionID was updated to a string.
- There seem to be no tests for Mach-O. I've tested a small hello-world
style binary but not more than that.
- On Mach-O, JITLink "normalizes" section names to include the segment
name. I had to parse the section name back from this manually which
feels slightly hacky.
[1] https://reviews.llvm.org/D145686#4222642
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D147544