On ARM64EC, the `IMAGE_SCN_GPREL` flag is repurposed to indicate that
section code is x86_64. This enables embedding x86_64 code within
ARM64EC object files. MSVC uses this for export thunks in .exp files.
Originally, the intent behind symtab was to represent the symbol table
seen in the PE header (without applying ARM64X relocations). However, in
most cases outside of `writeHeader()`, the code references either both
symbol tables or only the EC one, for example, `mainSymtab` in
`linkerMain()` maps to `hybridSymtab` on ARM64X.
MSVC's link.exe allows pure ARM64EC images to include native ARM64
files. This patch prepares LLD to support the same, which will require
`hybridSymtab` to be available even for ARM64EC. At that point,
`writeHeader()` will need to use the EC symbol table, and the original
reasoning for keeping it in `hybridSymtab` no longer applies.
Given this, it seems cleaner to treat the EC symbol table as the “main”
one, assigning it to `symtab`, and use `hybridSymtab` for the native
symbol table instead. Since `writeHeader()` will need to be conditional
anyway, this change simplifies the rest of the code by allowing other
parts to consistently treat `ctx.symtab` as the main symbol table.
As a further simplification, this also allows us to eliminate `symtabEC`
and use `symtab` directly; I’ll submit that as a separate PR.
The map file now uses the EC symbol table for printed entry points and
exports, matching MSVC behavior.
On Cygwin at least, GCC is not passing any -m flag to the linker, so
fall back to the default target triple to determine if we need to apply
i386-specific behaviors.
Fixes#134558
This is a ~port of https://reviews.llvm.org/D117284. Like in that
change, archives without indices are treated as a collection of lazy
object files (as in `--start-lib/--end-lib`)
Porting the ELF follow-up to convert *all* archives to the lazy object
code path (https://reviews.llvm.org/D119074) is a natural next step, but
we would need to ensure the assertions about memory use hold for Mach-O.
NB: without an index, we can't do the part of the `-ObjC` scan where we
check for Objective-C symbols directly. We *can* still check for
`__obcj` sections so I wonder how much of a problem this actually is,
since I'm not sure how the "symbols but no sections" case can appear in
the wild.
This lock is unnecessary because we can add the relocations to
shards and let them be sorted later.
Reviewers: smithp35, fmayer, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/135123
Following from the discussion in #132224, this seems like the best
approach to deal with a mix of XO and RX output sections in the same
binary. This change will also simplify the implementation of the
PURECODE section flag for AArch64.
To control this behaviour, the `--[no-]xosegment` flag is added to LLD
(similarly to `--[no-]rosegment`), which determines whether to allow
merging XO and RX sections in the same segment. The default value is
`--no-xosegment`, which is a breaking change compared to the previous
behaviour.
Release notes are also added, since this will be a breaking change.
This reverts commit 6a1bdd9 and re-instate behavior that matches what
MSVC link.exe does, that is, error out when trying to dllimport a symbol
from a static library.
A hint is now displayed in stdout, mentioning that we should rather dllimport the symbol
from a import library.
Fixes https://github.com/llvm/llvm-project/issues/131807
Original code sequence:
* pcalau12i $a0, %ie_pc_hi20(sym)
* ld.d $a0, $a0, %ie_pc_lo12(sym)
The code sequence converted is as follows:
* lu12i.w $a0, %le_hi20(sym) # le_hi20 != 0, otherwise NOP
* ori $a0, src, %le_lo12(sym) # le_hi20 != 0, src = $a0,
# otherwise, src = $zero
TODO: When relaxation is enabled, redundant NOP can be removed. This
will be implemented in a future patch.
Note: In the normal or medium code model, original code sequence with
relocations allow interleaving, because converted code sequence
calculates the absolute offset. However, in extreme code model, to
identify the current code model, the first four instructions with
relocations must appear consecutively.
In `OutputSegment.cpp`, we need to ensure a specific order for certain
sections. The current sorting logic incorrectly prioritizes code
sections over explicitly defined section orders. This is problematic
because the `__objc_stubs` section is both a code section *and* has a
specific ordering requirement. The current logic would incorrectly
prioritize its code section status, causing it to be sorted *before* the
`__stubs` section. This incorrect ordering breaks the branch extension
algorithm, ultimately leading to linker failures due to relocation
errors.
We also modify the `lld/test/MachO/arm64-objc-stubs.s` test to ensure
correct section ordering.
This allows NOCROSSREFS to be specified in OVERLAY linker script
descriptions. This is a particularly useful part of the OVERLAY syntax,
since it's very rarely possible for one overlay section to sensibly
reference another.
Closes#128790
This permits an AArch64AbsLongThunk to be used in an environment where
unaligned accesses are disabled.
The AArch64AbsLongThunk does a load of an 8-byte address. When unaligned
accesses are disabled this address must be 8-byte aligned.
The vast majority of AArch64 systems will have unaligned accesses
enabled in userspace. However, after a reset, before the MMU has been
enabled, all memory accesses are to "device" memory, which requires
aligned accesses. In systems with multi-stage boot loaders a thunk may
be required to a later stage before the MMU has been enabled.
As we only want to increase the alignment when the ldr is used we delay
the increase in thunk alignment until we know we are going to write an
ldr. We also need to account for the ThunkSection alignment increase
when this happens.
In some of the test updates, particularly those with shared CHECK lines
with position independent thunks it was easier to ensure that the thunks
started at an 8-byte aligned address in all cases.
This allows the contents of OVERLAYs to be attributed to memory regions.
This is the only clean way to overlap VMAs in linker scripts that choose
to primarily use memory regions to lay out addresses.
This also simplifies OVERLAY expansion to better match GNU LD.
Expressions for the first section's LMA and VMA are not generated if the
user did not provide them. This allows the LMA/VMA offset to be
preserved across multiple overlays in the same region, as with regular
sections.
Closes#129816
clang -fexperimental-relative-c++-abi-vtables might generate `@plt` and
`@gotpcrel` specifiers in data directives. The syntax is not used in
humand-written assembly code, and is not supported by GNU assembler.
Note: the `@plt` in `.word foo@plt` is different from
the legacy `call func@plt` (where `@plt` is simply ignored).
The `@plt` syntax was selected was simply due to a quirk of AsmParser:
the syntax was supported by all targets until I updated it
to be an opt-in feature in a0671758eb6e52a758bd1b096a9b421eec60204c
RISC-V favors the `%specifier(expr)` syntax following MIPS and Sparc,
and we should follow this convention.
This PR adds support for `.word %pltpcrel(foo+offset)` and
`.word %gotpcrel(foo)`, and drops `@plt` and `@gotpcrel`.
* MCValue::SymA can no longer have a SymbolVariant. Add an assert
similar to that of AArch64ELFObjectWriter.cpp before
https://reviews.llvm.org/D81446 (see my analysis at
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers
if intrigued)
* `jump foo@plt, x31` now has a different diagnostic.
Pull Request: https://github.com/llvm/llvm-project/pull/132569
The value of an absolute relocation, like R_RISCV_HI20 or R_PPC64_LO16,
with a symbol index of 0, the resulting value should be treated as
absolute and permitted in both -pie and -shared links.
This change also resolves an absolute relocation referencing an
undefined symbol in statically-linked executables.
PPC64 has unfortunate exceptions:
* R_PPC64_TOCBASE uses symbol index 0 but it should be treated as
referencing the linker-defined .TOC.
* R_PPC64_PCREL_OPT (https://reviews.llvm.org/D84360) could no longer
rely on `isAbsoluteValue` return false.
When using the linker flags `--icf=safe_thunks` and `--keep-icf-stabs`
together, an issue arises with the STABS debugging entries in the linked
output. The problem affects STABS entries for functions that are folded
via ICF using thunks.
For instance, if `func1` is merged into `func2` through a thunk, the
STABS entry for `func1` incorrectly points to the object file of
`func2`. This is incorrect behavior—each function’s STABS entry should
consistently point to its own original object file (e.g., the STABS
entry for `func1` should reference `func1`’s object file). This issue
causes `dsymutil` to not be able to retrieve the debug information for
the problematic function.
This patch corrects this behavior - making it so that STABS entries
always point to the correct object file.
This implements arm, armeb, thumb, thumbeb PLT entries parsing support
in ELF for llvm-objdump.
Implementation is similar to AArch64MCInstrAnalysis::findPltEntries. PLT
entry signatures are based on LLD code for PLT generation
(ARM::writePlt).
llvm-objdump tests are produced from lld/test/ELF/arm-plt-reloc.s,
lld/test/ELF/armv8-thumb-plt-reloc.s.
This prints a stack of reasons that symbols that match the given glob(s)
survived GC. It has no effect unless section GC occurs.
This implementation does not require -ffunction-sections or
-fdata-sections to produce readable results, althought it does tend to
work better (as does GC).
Details about the semantics:
- Some chain of liveness reasons is reported; it isn't specified which
chain.
- A symbol or section may be live:
- Intrisically (e.g., entry point)
- Because needed by a live symbol or section
- (Symbols only) Because part of a section live for another reason
- (Sections only) Because they contain a live symbol
- Both global and local symbols (`STB_LOCAL`) are supported.
- References to symbol + offset are considered to point to:
- If the referenced symbol is a section (`STT_SECTION`):
- If a sized symbol encloses the referenced offset, the enclosing
symbol.
- Otherwise, the section itself, generically.
- Otherwise, the referenced symbol.
Set the default processor version to v68 when the user does not specify
one in the command line. This includes changes in the LLVM backed and
linker (lld). Since lld normally sets the version based on inputs, this
change will only affect cases when there are no inputs.
Fixes#127558
"libmsvcrt-os" was added to the list of excluded libs in binutils in
9d9c67b06c1bf4c4550e3de0eb575c2bfbe96df9 in 2017.
"libucrt" was added in c4a8df19ba0a82aa8dea88d9f72ed9e63cb1fa84 in 2022.
"libucrtapp" isn't in the binutils exclusion list yet, but a patch for
adding it has been submitted. Since
0d403d5dd13ce22c07418058f3b640708992890c in mingw-w64 (in 2020), there's
such a third variant of the UCRT import library available.
Since 18df3e8323dcf9fdfec56b5f12c04a9c723a0931 in 2025, "libpthread" and
"libwinpthread" are also excluded.
When a library is specified with both `-l` and `-reexport_libraries`,
lld will emit two load commands for it, in contrast with ld64.
In an upcoming version of macOS, this fails dyld validation; see
https://crbug.com/404905688
---------
Co-authored-by: Mark Rowe <markrowe@chromium.org>>
If we have an absolute address whose high bits are known to be a sign
extend of the low 12 bits, we can avoid emitting the LUI entirely. This
is implemented in an analogous manner to the gp relative relocations -
defining an internal usage relocation type.
Since 12 bits (really 11 since the high bit must be zero in user code)
is less than one page, all of these offsets fit in the null page. As
such, the only application of these is likely to be undefined weak
symbols except for embedded use cases. I'm mostly posting this for
completeness sake.
The `--disable_verify` flag is implemented for ELF and is used to
disable LLVM module verification.
93afd8f9ac/lld/ELF/Options.td (L661)
This allows us to quickly suppress verification errors.
When GCS was introduced to LLD, the gcs-report option allowed for a user
to gain information relating to if their relocatable objects supported
the feature. For an executable or shared-library to support GCS, all
relocatable objects must declare that they support GCS.
The gcs-report checks were only done on relocatable object files,
however for a program to enable GCS, the executable and all shared
libraries that it loads must enable GCS. gcs-report-dynamic enables
checks to be performed on all shared objects loaded by LLD, and in cases
where GCS is not supported, a warning or error will be emitted.
It should be noted that only shared files directly passed to LLD are
checked for GCS support. Files that are noted in the `DT_NEEDED` tags
are assumed to have had their GCS support checked when they were
created.
The behaviour of the -zgcs-dynamic-report option matches that of GNU ld.
The behaviour is as follows unless the user explicitly sets the value:
* -zgcs-report=warning or -zgcs-report=error implies
-zgcs-report-dynamic=warning.
This approach avoids inheriting an error level if the user wishes to
continue building a module without rebuilding all the shared libraries.
The same approach was taken for the GNU ld linker, so behaviour is
identical across the toolchains.
This implementation matches the error message and command line interface
used within the GNU ld Linker. See here:
724a8341f6
To support this option being introduced, two other changes are included
as part of this PR. The first converts the -zgcs-report option to
utilise an Enum, opposed to StringRef values. This enables easier
tracking of the value the user defines when inheriting the value for the
gas-report-dynamic option. The second is to parse the Dynamic Objects
program headers to locate the GNU Attribute flag that shows GCS is
supported. This is needed so, when using the gcs-report-dynamic option,
LLD can correctly determine if a dynamic object supports GCS.
---------
Co-authored-by: Fangrui Song <i@maskray.me>
On ARM64X, symbol names alone are ambiguous as they may refer to either
a native or an EC symbol. Append '(EC symbol)' or '(native symbol)' in
diagnostic messages to distinguish them.
Refactor to add some early return logic to `applySafeThunksToRange` so
that we completely skip irrelevant ranges.
Also add a check for `isCodeSection` to ensure we only apply branch
thunks to code section (they don't make sense for anything else).
Currently this isn't an issue since there are no `keepUnique` non-code
sections - but this is not a hard restriction and may be implemented in
the future, so we should be able to handle (i.e. avoid) this scenario.
When attempting to add KEEP within an OVERLAY description, which the
Linux kernel would like to do for ARCH=arm to avoid dropping the
.vectors sections with '--gc-sections' [1], ld.lld errors with:
ld.lld: error: ./arch/arm/kernel/vmlinux.lds:37: section pattern is expected
>>> __vectors_lma = .; OVERLAY 0xffff0000 : AT(__vectors_lma) { .vectors { KEEP(*(.vectors)) } ...
>>> ^
readOverlaySectionDescription() does not handle all input section
description keywords, despite GNU ld's documentation stating that "The
section definitions within the OVERLAY construct are identical to those
within the general SECTIONS construct, except that no addresses and no
memory regions may be defined for sections within an OVERLAY."
Reuse the existing parsing in readInputSectionDescription(), which
handles KEEP, allowing the Linux kernel's use case to work properly.
[1]: https://lore.kernel.org/20250221125520.14035-1-ceggers@arri.de/
This prevents useless spills to the same memory region from causing
spilling to take too many passes to converge.
Handling this at spilling time allows us to relax the generation of
spill sections; specifically, multiple spills can now be generated per
output section. This should be fairly benign for performance, and it
would eventually allow linker scripts to express things like holes or
minimum addresses for parts of output sections. The linker could then
spill within an output section whenever address constraints are
violated.
In local-exec form, the code sequence is converted as follows:
```
From:
lu12i.w $rd, %le_hi20_r(sym)
R_LARCH_TLS_LE_HI20_R, R_LARCH_RELAX
add.w/d $rd, $rd, $tp, %le_add_r(sym)
R_LARCH_TLS_LE_ADD_R, R_LARCH_RELAX
addi/ld/st.w/d $rd, $rd, %le_lo12_r(sym)
R_LARCH_TLS_LE_LO12_R, R_LARCH_RELAX
To:
addi/ld/st.w/d $rd, $tp, %le_lo12_r(sym)
R_LARCH_TLS_LE_LO12_R
```
In global-dynamic or local-dynamic, the code sequence is converted as
follows:
```
From:
pcalau12i $a0, %ld_pc_hi20(sym) | %gd_pc_hi20(sym)
R_LARCH_TLS_GD_PC_HI20 | R_LARCH_TLS_LD_PC_HI20, R_LARCH_RELAX
addi.w/d $a0, $a0, %got_pc_lo12(sym) | %got_pc_lo12(sym)
R_LARCH_GOT_PC_LO12, R_LARCH_RELAX
To:
pcaddi $a0, %got_pc_lo12(sym) | %got_pc_lo12(sym)
R_LARCH_TLS_GD_PCREL20_S2 | R_LARCH_TLS_LD_PCREL20_S2
```
Note: For initial-exec form, since it involves the conversion from IE to
LE, we will implement it in a future patch.
Instructions with relocation `R_LARCH_CALL36` may be relax as follows:
```
From:
pcaddu18i $dest, %call36(foo)
R_LARCH_CALL36, R_LARCH_RELAX
jirl $r, $dest, 0
To:
b/bl foo # bl if r=$ra, b if r=$zero
R_LARCH_B26
```
This patch fixes the buildbots failuer of lld tests.
Changes: Modify test files: from `sym@plt` to `%plt(sym)`.