When encountering an instruction like `if (p0) r0 = add(r0,##bar@GOT)`,
lld would fail with:
```
ld.lld: error: unrecognized instruction for 16_X type: 0x7400C000
```
This issue was encountered while building libreadline with clang 19.1.0.
Fixes: #111876
(cherry picked from commit 77aa8257acbd773c0c430cd962da1bcfbd5ee94b)
Similar to commit 686cff17cc310884e48ae963bf7507f96950cc90 for SHT_REL (#57693).
CREL hasn't been tested with ICF before.
And avoid a pitfall that eqClass[0] might interfere with ICF.
(cherry picked from commit e82f0838ae88ad69515ebec234765e3e2607bebf)
Empty archives are apparently routine in linux kernel builds, so instead
of asserting, we should handle this case with a sane default value.
(cherry picked from commit d1ba432533aafc52fc59158350af937a8b6b9538)
Since we don't generate relocations for those, it doesn't make sense to
assert them here; fallout of
https://github.com/llvm/llvm-project/pull/106722.
(cherry picked from commit a3816b5a573dbf57ba3082a919ca2de6b47257e9)
A crash was happening when both ObjC Category Merging and Relative
method lists were enabled.
ObjC Category Merging creates new data sections and adds them by calling
`addInputSection`. `addInputSection` uses the symbols within the added
section to determine which container to actually add the section to.
The issue is that ObjC Category merging is calling `addInputSection`
before actually adding the relevant symbols to the added section. This
causes `addInputSection` to add the `InputSection` to the wrong
container, eventually resulting in a crash.
To fix this, we ensure that ObjC Category Merging calls
`addInputSection` only after the symbols have been added to the
`InputSection`.
(cherry picked from commit 0df91893efc752a76c7bbe6b063d66c8a2fa0d55)
GNU ld silently accepts the -rpath option for Windows targets, as a
no-op.
This has led to some build systems (and users) passing this option
while building for Windows/MinGW, even if Windows doesn't have any
concept like rpath.
Older versions of Conan did include -rpath in the pkg-config files it
generated, see e.g.
17c58f0c61/conans/client/generators/pkg_config.py (L104-L114)
and
17c58f0c61/conans/client/build/compiler_flags.py (L26-L34)
- and see https://github.com/mstorsjo/llvm-mingw/issues/300 for user
reports about this issue.
Recognize the option in LLD for MinGW targets, to improve drop-in
compatibility compared to GNU ld, but produce a warning to alert users
that the option really has no effect for these targets.
(cherry picked from commit 69f76c782b554a004078af6909c19a11e3846415)
Follow-up to #98115. For EhInputSection, RelocationScanner::scan calls
sortRels, which doesn't support the CREL iterator. We should set
supportsCrel to false to ensure that the initial_location fields in
.eh_frame FDEs are relocated.
(cherry picked from commit a821fee312d15941174827a70cb534c2f2fe1177)
Previously, we selected the Thumb2 PLT sequences if any input object is
marked as not supporting the ARM ISA, which then causes assertion
failures when calls from ARM code in other objects are seen. I think the
intention here was to only use Thumb PLTs when the target does not have
the ARM ISA available, signalled by no objects being marked as having it
available. To do that we need to track which ISAs we have seen as we
parse the build attributes, and defer the decision about PLTs until all
input objects have been parsed.
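The deferred decision can be sketched in Python as follows. This is only an illustration: the `InputObject` record and its `has_arm_isa`/`has_thumb_isa` fields are hypothetical stand-ins for the parsed build attributes, not lld's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class InputObject:
    # Hypothetical summary of one object's build attributes.
    name: str
    has_arm_isa: bool
    has_thumb_isa: bool


def use_thumb_plts(objects):
    """Decide on Thumb PLTs only after every input has been inspected."""
    seen_arm = False
    seen_thumb = False
    for obj in objects:
        seen_arm = seen_arm or obj.has_arm_isa
        seen_thumb = seen_thumb or obj.has_thumb_isa
    # Thumb PLTs are only appropriate when no object reports the ARM ISA.
    return seen_thumb and not seen_arm


# One Thumb2-only object next to an ARM-capable one: keep ARM PLTs.
mixed = [
    InputObject("memcpy.o", has_arm_isa=False, has_thumb_isa=True),
    InputObject("main.o", has_arm_isa=True, has_thumb_isa=True),
]
# Every object is Thumb-only: Thumb PLTs are the only option.
thumb_only = [InputObject("memcpy.o", has_arm_isa=False, has_thumb_isa=True)]
```

Deciding per object, as the old code effectively did, would pick Thumb PLTs for `mixed` as soon as `memcpy.o` was seen.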
This bug was triggered by real code in picolibc, which has some
versions of string.h functions built with Thumb2-only build attributes,
so that they are compatible with v7-A, v7-R and v7-M.
Fixes #99008.
(cherry picked from commit a1c6467bd90905d52cf8f6162b60907f8e98a704)
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends is not supported.
---
Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.
(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, which compiles to a lot of code.)
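For readers unfamiliar with the encoding, a minimal hand-written ULEB128 decoder looks roughly like the Python sketch below. It illustrates the general technique only: it is not the lld decoder, it does no error checking (truncated input raises `IndexError`), and the function name is made up for this example.

```python
def decode_uleb128(data, pos=0):
    """Return (value, next_pos) for the ULEB128 number starting at data[pos]."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # the low 7 bits carry the payload
        if (byte & 0x80) == 0:           # high bit clear marks the last byte
            return value, pos
        shift += 7


value, next_pos = decode_uleb128(bytes([0xE5, 0x8E, 0x26]))
```

Returning the next position lets a caller walk a stream of consecutive fields without copying the buffer.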
A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases memory usage, the conversion is only
performed when absolutely necessary.
---
Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).
* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.
SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
unsupported yet.
Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600
Pull Request: https://github.com/llvm/llvm-project/pull/98115
(cherry picked from commit 0af07c078798b7c427e2981377781b5cc555a568)
Relocs is to simplify CREL support (#98115) while invokeOnRelocs
simplifies some relsOrRelas call sites that will use the CREL iterator.
(cherry picked from commit 6efc3774bd8c5fcb105cda73ec27c05ef850dc19)
Similar to #99836 for AArch64.
Non-unique names save .strtab space and match GNU assembler.
Pull Request: https://github.com/llvm/llvm-project/pull/99906
(cherry picked from commit 298a9223a57c50cb0d24b82687ad1bc2f7a022e6)
This reverts commit f55b79f59a77b4be586d649e9ced9f8667265011.
The known issues with chained fixups have been addressed by #98913,
#98305, #97156 and #95171.
Compared to the original commit, support for xrOS (which postdates
chained fixups' introduction) was added and an unnecessary test change
was removed.
----------
Original commit message:
Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:
93d74eafc3/src/ld/Options.cpp (L5458-L5549)
Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32
Add `createLocalSymbol` to create a local, non-temporary symbol.
Different from `createRenamableSymbol`, the `Used` bit is ignored,
therefore multiple local symbols might share the same name.
Utilizing `createLocalSymbol` in AArch64 allows for efficient mapping
symbol creation with non-unique names, saving .strtab space.
The behavior matches GNU assembler.
Pull Request: https://github.com/llvm/llvm-project/pull/99836
This fixes multiple regression test failures in lld's COFF linker.
The failures became reliably reproducible after the needed initialization
of the isUsedinRegularObject bit was added to the Symbol's ctor, because
the replaceSymbol API did not handle that bit properly while creating a
specific symbol to insert into the symbol table.
Now, when replaceSymbol creates the specific symbol, it explicitly sets
the value of isUsedinRegularObject on the newly created symbol around the
ctor call, which resolves the regression failures.
In https://reviews.llvm.org/D115416, it was decided that an explicit
thread pool should be used instead of the simpler fork-join model of the
`parallelFor*` family of functions. Since then, more parallelism has
been added to LLD, but these changes always used the latter strategy,
similarly to other ports of LLD.
This meant that we ended up spawning twice the requested number of
threads; one set for the `llvm/Support/Parallel.h` executor, and one for
the thread pool.
Since that decision, 3b4d800911 has landed, which allows us to
explicitly enqueue jobs on the executor pool of the parallel algorithms,
which should be enough to achieve sharded output writing and
parallelized input file parsing. The only work left that should be done
*concurrently* with other linking steps is the construction of the map
file; this commit proposes explicitly spawning a dedicated worker
thread for it.
Support `preinit_array . (TYPE=SHT_PREINIT_ARRAY) : { QUAD(16) }`
Follow-up to https://reviews.llvm.org/D118840
peek2() could be eliminated by a future change.
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
We were already not deleting category names for Swift classes as those
names can be reused by other parts. However, I have come across a corner
case where this also happens for ObjC categories - so we can't delete
category names for them either. TODO remains to optimize this behavior
for both ObjC and Swift.
When `cd %t` is used, it's conventional to move it above and omit `-o /dev/null`.
We don't check the string before `warning:` since (a) the string is not
very useful and (b) downstream might customize `ctx->e.logName`
(argv[0]).
count 0 is better than `--allow-empty`. In addition, without `2>&1` the
previous test was effective.
Normally, AArch64 ELF objects use the SHT_RELA type of relocation
section, with addends stored in each relocation. But some legacy AArch64
object producers still use SHT_REL in some situations, storing the
addend in the initial value of the data item or instruction immediate
field that the relocation will modify. LLD was mishandling relocations
of this type in multiple ways.
Firstly, many of the cases in the `getImplicitAddend` switch statement
were apparently based on a misunderstanding. The relocation types that
operate on instructions should be expecting to find an instruction of
the appropriate type, and should extract its immediate field. But many
of them were instead behaving as if they expected to find a raw 64-, 32-
or 16-bit value, and wanted to extract the right range of bits. For
example, the relocation for R_AARCH64_ADD_ABS_LO12_NC read a 16-bit word
and extracted its bottom 12 bits, presumably on the thinking that the
relocation writes the low 12 bits of the value it computes. But the
input addend for SHT_REL purposes occupies the immediate field of an
AArch64 ADD instruction, which meant it should have been reading a
32-bit AArch64 instruction encoding, and extracting bits 10-21 where the
immediate field lives. Worse, the R_AARCH64_MOVW_UABS_G2 relocation was
reading 64 bits from the input section, and since it's only relocating a
32-bit instruction, the second half of those bits would have been
completely unrelated!
Adding to that confusion, most of the values being read were first
sign-extended, and _then_ had a range of bits extracted, which doesn't
make much sense. They should have first extracted some bits from the
instruction encoding, and then sign-extended that 12-, 19-, or 21-bit
result (or whatever else) to a full 64-bit value.
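The difference can be illustrated with a short Python sketch. The helper names and the sample encoding (`add x0, x0, #0xfff`) are chosen for this example only and are not taken from the lld sources.

```python
def extract_bits(word, lo, hi):
    """Return bits lo..hi (inclusive) of word as an unsigned value."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


# Encoding of `add x0, x0, #0xfff`: the 12-bit immediate occupies bits 10-21.
ADD_INSN = 0x913FFC00

# Old approach: read a 16-bit word and keep its bottom 12 bits.
old_addend = (ADD_INSN & 0xFFFF) & 0xFFF

# Read the whole 32-bit instruction and extract the immediate field.
new_addend = extract_bits(ADD_INSN, 10, 21)

# For signed fields, extract first and sign-extend afterwards.
all_ones_19 = sign_extend(0x7FFFF, 19)
```

Here `old_addend` picks up unrelated encoding bits, while `new_addend` recovers the immediate that was actually stored in the instruction.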
Secondly, after the relocated value was computed, in most cases it was
being written into the target instruction field via a bitwise OR
operation. This meant that if the instruction field didn't initially
contain all zeroes, the wrong result would end up in it. That's not even
a 100% reliable strategy for SHT_RELA, which in some situations is used
for its repeatability (in the sense that applying the relocation twice
should cause the second answer to overwrite the first, so you can
relocate an image in advance to its most likely address, and then do it
again at load time if that turns out not to be available). But for
SHT_REL, when you expect nonzero immediate fields in normal use, it
couldn't possibly work. You could see the effect of this in the existing
test, which had a lot of FFFFFF in the expected output which there
wasn't any plausible justification for.
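A small Python sketch of the two write strategies, using the 12-bit immediate field of an ADD instruction as the example. The function names and the sample values are illustrative assumptions, not lld code.

```python
IMM12_SHIFT = 10
IMM12_MASK = 0xFFF << IMM12_SHIFT


def write_imm12_or(insn, value):
    """Old behaviour: OR the new value into the field."""
    return insn | ((value & 0xFFF) << IMM12_SHIFT)


def write_imm12_overwrite(insn, value):
    """Clear the field first, then insert the new value."""
    cleared = insn & ~IMM12_MASK & 0xFFFFFFFF
    return cleared | ((value & 0xFFF) << IMM12_SHIFT)


def read_imm12(insn):
    """Return the current contents of the immediate field."""
    return (insn >> IMM12_SHIFT) & 0xFFF


# `add x0, x0, #0xfff`: the field already holds a nonzero implicit addend.
insn = 0x913FFC00
via_or = write_imm12_or(insn, 0x123)
via_overwrite = write_imm12_overwrite(insn, 0x123)
```

With the OR, the stale addend bits survive and the field still reads `0xFFF`; with the overwrite, the field holds exactly the relocated value, and applying the relocation a second time simply replaces it.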
Finally, one relocation type was actually missing: there was no support
for R_AARCH64_ADR_PREL_LO21 at all.
So I've rewritten most of the cases in `getImplicitAddend`; replaced the
bitwise ORs with overwrites; and replaced the previous test with a much
more thorough one, obtained by writing an input assembly file with
explicitly specified relocations on instructions that also have
carefully selected immediate fields, and then doing some yaml2obj
seddery to turn the RELA relocation section into a REL one.
Previously, we only saved those members of thin archives into a repro
file that were actually used during linking. However, -ObjC handling
requires us to inspect all members, even those that don't end up being
loaded.
We weren't handling missing members correctly and crashed with an
"unhandled `Error`" failure in LLVM_ENABLE_ABI_BREAKING_CHECKS builds.
To fix this, we now eagerly load all object files and warn when
encountering missing members (in the instances where it wasn't a hard
error before). To avoid having to patch out the checks when dealing
with older repro files, the `--no-warn-thin-archive-missing-members`
flag is added as an escape hatch.
Starting with Xcode 16 (dyld-1122), Apple's binary utilities, e.g.
`dyld_info` (but not dyld itself), will refuse to load binaries built
against the macOS 15 SDK or newer that contain the same `LC_RPATH`
entry multiple times:
https://github.com/apple-oss-distributions/dyld/blob/rel/dyld-1122/mach_o/Policy.cpp#L246-L249
`ld-prime` deduplicates entries (regardless of the deployment target),
so we now do the same. We also match `ld-prime`'s and `ld64`'s behavior by
warning on duplicate `-rpath` arguments. This can be disabled by the
LLD-specific `--no-warn-duplicate-rpath` flag.
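As a rough illustration of the behavior (not the actual implementation), order-preserving deduplication with an optional warning can be sketched in Python like this. The function name, the `warn_duplicates` parameter, and the warning text are all made up for the example.

```python
def dedup_rpaths(rpaths, warn_duplicates=True):
    """Return (unique_rpaths, warnings), keeping first occurrences in order."""
    seen = set()
    unique = []
    warnings = []
    for path in rpaths:
        if path in seen:
            if warn_duplicates:
                # Hypothetical message; the real diagnostic may differ.
                warnings.append(f"duplicate -rpath '{path}' ignored")
            continue
        seen.add(path)
        unique.append(path)
    return unique, warnings


unique, warnings = dedup_rpaths(["@loader_path/lib", "/usr/lib", "@loader_path/lib"])
```

Passing `warn_duplicates=False` models the effect of `--no-warn-duplicate-rpath`: the entries are still deduplicated, only the diagnostic is suppressed.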
Implement the two commands described by
https://sourceware.org/binutils/docs/ld/Miscellaneous-Commands.html
After `outputSections` is available, check each output section described
by at least one `NOCROSSREFS`/`NOCROSSREFS_TO` command. For each checked
output section, scan relocations from its input sections.
This step is slow, therefore utilize `parallelForEach(isd->sections, ...)`.
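The check itself can be modeled with a small Python sketch, following the semantics in the GNU ld documentation: `NOCROSSREFS` forbids references among the listed sections, and `NOCROSSREFS_TO` forbids references to the first listed section from the others. The function and its input representation (pairs of output section names) are simplifications invented for this example and say nothing about how lld stores relocations.

```python
def find_crossref_violations(references, nocrossrefs=(), nocrossrefs_to=()):
    """references: iterable of (from_section, to_section) name pairs."""
    violations = []
    for src, dst in references:
        if src == dst:
            continue  # a section may always refer to itself
        for group in nocrossrefs:
            # Any reference between two members of the group is an error.
            if src in group and dst in group:
                violations.append((src, dst))
        for group in nocrossrefs_to:
            # Only references *to* the first section are errors.
            to_section, from_sections = group[0], group[1:]
            if dst == to_section and src in from_sections:
                violations.append((src, dst))
    return violations


refs = [(".text", ".data"), (".data", ".text"), (".text", ".rodata")]
both_ways = find_crossref_violations(refs, nocrossrefs=[(".text", ".data")])
one_way = find_crossref_violations(refs, nocrossrefs_to=[(".data", ".text")])
```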
To support non SHF_ALLOC sections, `InputSectionBase::relocations`
(empty) cannot be used. In addition, we may explore eliminating this
member to speed up relocation scanning.
Some parse code is adapted from #95714.
Close #41825
Pull Request: https://github.com/llvm/llvm-project/pull/98773
This matches ELF (#97480). clang cc1 -emit-llvm and -emit-llvm-bc for
ThinLTO backend compilation also uses `PreCodeGenModuleHook`.
While here, replace deprecated %T with %t.
Pull Request: https://github.com/llvm/llvm-project/pull/98589
This section contains metadata that's only relevant for Identical Code
Folding at link time, so we should not include it in the output.
We still treat it like a regular section during input file parsing (e.g.
create a `ConcatInputSection` for it), as we want its relocations to be
parsed. But it should not be passed to `addInputSection`, as that's what
assigns it to an `OutputSection` and adds it to the `inputSections`
vector which specifies the inputs to dead-stripping and relocation
scanning.
This fixes a "__DATA,__llvm_addrsig, offset 0: fixups overlap" error
when using `--icf=safe` alongside `-fixup_chains`. This occurs because
all `__llvm_addrsig` sections are 8 bytes large, and the relocations
which signify functions whose addresses are taken are all at offset 0.
This makes the fix in 5fa24ac2 ("Category Merger: add support for
addrsig references") obsolete, as we no longer try to resolve symbols
referenced in `__llvm_addrsig` when writing the output file. When we do
iterate its relocations in `markAddrSigSymbols`, we do not try to
resolve their addresses.
Normally, when doing renamed imports, we do this by providing a
weak alias, towards another regular import, for the symbol we
want to actually import. In a def file, this looks like this:
regularfunc
renamedfunc == regularfunc
However, if we want to link against a function in a DLL, where we
(intentionally) don't provide a regular import for that symbol
with the name in its DLL, doing the renamed import with a weak
alias doesn't work, as there's no symbol that the weak alias can
point towards.
We can't make up such an import either, as we may intentionally
not want to provide a regular import for that name.
This situation can either be resolved by using the "long" import
library format (as e.g. produced by GNU dlltool), or by using the
new short import library name type "export as".
This patch implements it by using the "export as" name type.
When producing a renamed import, defer emitting it until all regular
imports have been produced. If the renamed import refers to a
symbol that does exist as a regular import entry, produce a
weak alias, just as before. (This implementation also avoids needing
to know whether the symbol that the alias points towards actually
is prefixed or not, too.)
If the renamed import points at a symbol that isn't otherwise
available (or is available as a renamed symbol itself), generate
an "export as" import entry.
This name type is new, and is normally used in ARM64EC import
libraries, but can also be used for other architectures.
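The decision between a weak alias and an "export as" entry can be sketched in Python as below. The tuple-based representation and the `plan_imports` name are hypothetical and only model the ordering and classification described in this message.

```python
def plan_imports(entries):
    """entries: list of (name, target); target is None for a regular import."""
    regular = {name for name, target in entries if target is None}
    plan = [(name, "regular", None) for name, target in entries if target is None]
    # Renamed imports are handled last, once all regular imports are known.
    for name, target in entries:
        if target is None:
            continue
        # A target that is itself only a renamed import is not in `regular`,
        # so it also falls through to "export_as".
        kind = "weak_alias" if target in regular else "export_as"
        plan.append((name, kind, target))
    return plan


entries = [
    ("renamedfunc", "regularfunc"),
    ("regularfunc", None),
    ("other", "hidden"),
]
plan = plan_imports(entries)
```

In this example `renamedfunc` becomes a weak alias because `regularfunc` has a regular import, while `other` needs an "export as" entry since `hidden` is never imported under its own name.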
We don't currently have a great way to detect the architecture of shared
object files under wasm. The current method involves checking if the
imported or exported memory is 64-bit. However some shared libraries
don't use linear memory at all.
See https://github.com/llvm/llvm-project/issues/98778
This patch improves GNU ld compatibility.
Close #87891: Support `OUTPUT_FORMAT(binary)`, which is like
--oformat=binary. --oformat=binary takes precedence over an ELF
`OUTPUT_FORMAT`.
In addition, if more than one OUTPUT_FORMAT command is specified, only
check the first one.
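The precedence rules can be summarized with a tiny Python model. The function name, its parameters, and the use of `None` for "no explicit format" are assumptions made for this sketch, not lld's option handling.

```python
def resolve_output_format(oformat_option, script_output_formats):
    """Pick the effective output format from the CLI and the linker script.

    oformat_option: value of --oformat, or None if not given.
    script_output_formats: OUTPUT_FORMAT arguments in script order.
    """
    # --oformat=binary wins over whatever the linker script says.
    if oformat_option == "binary":
        return "binary"
    # Only the first OUTPUT_FORMAT command is considered.
    if script_output_formats:
        return script_output_formats[0]
    return None  # fall back to the default (not modeled here)


from_cli = resolve_output_format("binary", ["elf64-x86-64"])
from_script = resolve_output_format(None, ["binary", "elf64-x86-64"])
```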
Pull Request: https://github.com/llvm/llvm-project/pull/98837
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevents internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.
This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but do not want the overhead of
uncalled functions. (This trivially adds about a second to the link time.)
This PR adds a number of thus-far missing extended mnemonics to the
assembler and disassembler for SystemZ.
The following mnemonics have been added and are supported for the
assembler and disassembler:
- `NOP(R)?`
- `LFI`
- `RISBG(N)?Z`
The following mnemonics have been added and are supported for the
assembler only:
- `JC(TH)?`
- `LLG(F|H)I`
- `NOT(G)?R`
Previously we would ignore all undefined symbols when using
`-shared` or `-pie`. All undefined symbols would be treated as imports
regardless of whether those symbols were defined in any shared library.
With this change we now track symbols in shared libraries and report
undefined symbols in the main program by default.
The old behavior is still available via the
`--unresolved-symbols=import-dynamic` command line flag.
The rationale for allowing this type of breaking change is that `-pie`
and `-shared` are both still experimental and will warn as such, unless
`--experimental-pic` is passed.
As part of this change the linker now models shared library symbols
via new SharedFunctionSymbol and SharedDataSymbol types.
I've also added a new `--no-shlib-sigcheck` option that bypasses the
checking of function signatures in shared libraries. This is
specifically required by emscripten in the case where the imports/exports
of shared libraries have been modified via JS type legalization (this
is only needed when targeting old JS engines where bigint is not yet
available).
See https://github.com/emscripten-core/emscripten/issues/18198
Parsing the new input file's symbols might invalidate LTO codegen, but
the semantics of deplibs require them to be parsed. Accordingly, report
an error unless the file had already been added to the link.
Fixes #56070
- Add support for `.openbsd.mutable`
(rebaser's note) adapted from:
bd249b5664
New auto-coalescing sections removed
In the linkers, collect objects in section "openbsd.mutable" and place
them into a page-aligned region in the bss, with the right markers for
kernel/ld.so to identify the region and skip making it immutable. While
here, fix readelf/objdump versions to show all of this. ok miod kettenis
- Add support for `.openbsd.syscalls`
(rebaser's note) adapted from:
42a61acefa
Collect .openbsd.syscalls sections into a new PT_OPENBSD_SYSCALLS
segment. This will be used soon to pin system calls to designated call
sites.
ok deraadt@
- Scope OpenBSD special section handling under that ELFOSABI
As a preexisting comment in `ELF/Writer.cpp` says:
> section names shouldn't be significant in ELF in spirit.
so scoping OSABI-specific magic name hacks to just the OSABI in
question limits the degree to which we deviate from that "spirit" for
all other OSABIs.
OpenBSD in particular is very fast moving, having added a number of
special sections, etc. in recent years. It is unclear how possible /
reasonable it is for upstream to implement all these features in any
event, but scoping like this at least mitigates the fallout for other
OSABI systems which wish to be more slow-moving.
Co-authored-by: deraadt <deraadt@openbsd.org>