This is a ~port of https://reviews.llvm.org/D117284. Like in that
change, archives without indices are treated as a collection of lazy
object files (as in `--start-lib/--end-lib`)
Porting the ELF follow-up to convert *all* archives to the lazy object
code path (https://reviews.llvm.org/D119074) is a natural next step, but
we would need to ensure the assertions about memory use hold for Mach-O.
NB: without an index, we can't do the part of the `-ObjC` scan where we
check for Objective-C symbols directly. We *can* still check for
`__obcj` sections so I wonder how much of a problem this actually is,
since I'm not sure how the "symbols but no sections" case can appear in
the wild.
The `--disable_verify` flag is implemented for ELF and is used to
disable LLVM module verification.
93afd8f9ac/lld/ELF/Options.td (L661)
This allows us to quickly suppress verification errors.
The symbol string table does not have deduplication.
Here we add code to deduplicate the symbol string table.
This has a rather large size impact (20-30%) on unstripped binaries
(typically debug binaries) but no size impact on stripped
binaries(typically release binaries).
We enable deduplication by default and add a flag to disable it
(`-no-deduplicate-symbol-strings`).
Add new CLI options for feature parity with ELF w.r.t pass plugins.
Most of the changes are ported directly from
0c86198b27.
With this change, it is now possible to load and run external pass
plugins during the LTO phase.
ld64.lld would previously allow you to link against dylibs linked with
`-allowable_client`, even if the client's name does not match any
allowed client.
This change fixes that. See #114146 for related discussion.
The test binary `liballowable_client.dylib` was created on macOS with:
echo | clang -xc - -dynamiclib -mmacosx-version-min=10.11 -arch x86_64
-Wl,-allowable_client,allowed -o lib/liballowable_client.dylib
For COFF and ELF that are mostly free of global states, lld::errs() and
lld::outs() should not be used. This migration change allows us to
remove lld::errs, which uses the global errorHandler().
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Strip calls to raw_string_ostream::str(), to avoid excess layer of indirection.
Currently, our `safe` ICF mode only merges non-address-significant code,
leaving duplicate address-significant functions in the output. This
patch introduces `safe_thunks` ICF mode, which keeps a single master
copy of each function and replaces address-significant duplicates with
thunks that branch to the master copy.
Currently `--icf=safe_thunks` is only supported for `arm64`
architectures.
**Perf stats for a large binary:**
| ICF Option | Total Size | __text Size | __unwind_info | % total |
|-------------------|------------|-------------|---------------------|---------------------------|
| `--icf=none` | 91.738 MB | 55.220 MB | 1.424 MB | 0% |
| `--icf=safe` | 85.042 MB | 49.572 MB | 1.168 MB | 7.30% |
| `--icf=safe_thunks` | 84.650 MB | 49.219 MB | 1.143 MB | 7.72% |
| `--icf=all` | 82.060 MB | 48.726 MB | 1.111 MB | 10.55% |
So overall we can expect a `~0.45%` binary size reduction for a typical
large binary compared to the `--icf=safe` option.
**Runtime:**
Linking the above binary took ~10 seconds. Comparing the link
performance of --icf=safe_thunks vs --icf=safe, a ~2% slowdown was
observed.
...including for catalyst.
The usecase for this is to put certain security-critical variables into
a special segment/section that's mapped as read-only most of the time,
and that temporary gets remapped as writeable when these variables are
written to be the program. This protects against them being written to
by heap spraying attacks. This special section should be mapped as
read-only at program start, so using
`-segprot MY_PROTECTED_MEMORY_THINGER rw r`
to mark that segment as rw maxprot and r initprot is exactly what we
want.
lld has so far rejected mismatching initprot and maxprot.
ld64 doesn't reject this, but silently writes initprot into both fields
(!) It looks like this might not be fully intentional, see
https://crbug.com/41495919#comment5 and
http://crbug.com/41495919#comment8.
In any case, when postprocessing ld64's output to have different values
for initprot and maxprot, the dynamic loader seems to do the right thing
(see also the previous two links).
The same technique also works on Windows, using both link.exe and
lld-link.exe using `/SECTION:myprotsect,R`.
So, since this is useful, allow it when targeting macOS, and make it do
what you'd expect.
Since loader support for this on iOS is less clear, keep disallowing it
there for now.
See the PR for the program I used to check that this seems to work. (I
only checked on arm64 macOS 14.5 so far; will run this on many more
systems on bots once this is merged and rolled in.)
This patch makes `-objc_relative_method_lists` default on MacOS
10.16+/iOS 14+. Manual override still work if command line argument is
provided.
To test this change, many explict arguments are removed from the test
files. Some explict `-objc_no_objc_relative_method_lists` are also added
for tests that don't support this yet.
This commit tries to revive #101360, which exposes a bug that breaks CI.
#104081 has fixed that bug.
This patch makes `objc_relative_method_lists` default on MacOS 11+ / iOS
14+. Manual override still work if command line argument is provided.
To test this change, many explicit arguments are removed from the test
files. Some explicit `no_objc_relative...` are also added for tests that
don't support this yet.
Signed-off-by: Peter Rong <PeterRong@meta.com>
Add the lld flags `--irpgo-profile-sort=<profile>` and
`--compression-sort={function,data,both}` to order functions to improve
startup time, and functions or data to improve compressed size,
respectively.
We use Balanced Partitioning to determine the best section order using
traces from IRPGO profiles (see
https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
for details) to improve startup time and using hashes of section
contents to improve compressed size.
In our recent LLVM talk (https://www.youtube.com/watch?v=yd4pbSTjwuA),
we showed that this can reduce page faults during startup by 40% on a
large iOS app and we can reduce compressed size by 0.8-3%.
More details can be found in https://dl.acm.org/doi/10.1145/3660635
---------
Co-authored-by: Vincent Lee <thevinster@users.noreply.github.com>
This reverts commit f55b79f59a77b4be586d649e9ced9f8667265011.
The known issues with chained fixups have been addressed by #98913,
#98305, #97156 and #95171.
Compared to the original commit, support for xrOS (which postdates
chained fixups' introduction) was added and an unnecessary test change
was removed.
----------
Original commit message:
Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:
93d74eafc3/src/ld/Options.cpp (L5458-L5549)
Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32
Previously, we only saved those members of thin archives into a repro
file that were actually used during linking. However, -ObjC handling
requires us to inspect all members, even those that don't end up being
loaded.
We weren't handling missing members correctly and crashed with an
"unhandled `Error`" failure in LLVM_ENABLE_ABI_BREAKING_CHECKS builds.
To fix this, we now eagerly load all object files and warn when
encountering missing members (in the instances where it wasn't a hard
error before). To avoid having to patch out the checks when dealing
with older repro files, the `--no-warn-thin-archive-missing-members`
flag is added as an escape hatch.
Starting with Xcode 16 (dyld-1122), Apple's binary utilities, e.g.
`dyld_info` (but not dyld itself), will refuse to load binaries built
against the macOS 15 SDK or newer that contain the same `LC_RPATH`
entry multiple times:
https://github.com/apple-oss-distributions/dyld/blob/rel/dyld-1122/mach_o/Policy.cpp#L246-L249
`ld-prime` deduplicates entries (regardless of the deployment target),
we now do the same. We also match `ld-prime`'s and `ld64`'s behavior by
warning on duplicate `-rpath` arguments. This can be disabled by the
LLD-specific `--no-warn-duplicate-rpath` flag.
This section contains metadata that's only relevant for Identical Code
Folding at link time, we should not include it in the output.
We still treat it like a regular section during input file parsing (e.g.
create a `ConcatInputSection` for it), as we want its relocations to be
parsed. But it should not be passed to `addInputSection`, as that's what
assigns it to an `OutputSection` and adds it to the `inputSections`
vector which specifies the inputs to dead-stripping and relocation
scanning.
This fixes a "__DATA,__llvm_addrsig, offset 0: fixups overlap" error
when using `--icf=safe` alongside `-fixup_chains`. This occurs because
all `__llvm_addrsig` sections are 8 bytes large, and the relocations
which signify functions whose addresses are taken are all at offset 0.
This makes the fix in 5fa24ac2 ("Category Merger: add support for
addrsig references") obsolete, as we no longer try to resolve symbols
referenced in `__llvm_addrsig` when writing the output file. When we do
iterate its relocations in `markAddrSigSymbols`, we do not try to
resolve their addresses.
`-no_objc_category_merging` flag was behaving like
`-objc_category_merging` - i.e. acting to enable category merging.
This is because we were using `hasArg` instead of `hasFlag` to test for
it. Fix this and add test to ensure it behaves as expected.
When `-fixup_chains`/`-init_offsets` is used, a different section,
`__init_offsets` is synthesized from `__mod_init_func`. If there are any
symbols defined inside `__mod_init_func`, they are added to the symbol
table unconditionally while processing the input files. Later, when
querying these symbols' addresses (when constructing the symtab or
exports trie), we crash with a null deref, as there is no output section
assigned to them.
Just making the symbols point to `__init_offsets` is a bad idea, as the
new section stores 32-bit integers instead of 64-bit pointers; accessing
the symbols would not do what the programmer intended. We should
entirely omit them from the output. This is what ld64 and ld-prime do.
This patch uses the same mechanism as dead-stripping to mark these
symbols as not needed in the output. There might be nicer fixes than the
workaround, this is discussed in #97155.
Fixes https://github.com/llvm/llvm-project/pull/79894#issuecomment-1944092892Fixes#94716
The -all_load flag is intended to force the linker to load all lazy members, but doesn't do so if the archive is specified with --start-lib, --end-lib flags. The `-all_load` flag is global, that is it can be placed anywhere in the linker invocation, and it affects the load behavior of all conventional archives listed. Unlike ELF's --whole-archive, the user need not necessarily have access to the entire linker invocation to reasonably make use of the flag. The user can supply `-all_load` to a build system without inspecting the rest of the linker invocation.
To make the behavior of `--start-lib` style archives consistent with regular archives, this patch makes it so that -all_load also applies in this case.
This change adds the `--keep-icf-stabs` which, when specified, preserves
symbols that were folded by ICF in the binary's stabs entries.
This allows `dsymutil` to process debug information for the folded
symbols.
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
276 under llvm-project/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
The MachO format supports relative offsets for ObjC method lists. This
support is present already in ld64. With this change we implement this
support in lld also.
Relative method lists can be identified by a specific flag (0x80000000)
in the method list header. When this flag is present, the method list
will contain 32-bit relative offsets to the current Program Counter
(PC), instead of absolute pointers.
Additionally, when relative method lists are used, the offset to the
selector name will now be relative and point to the selector reference
(selref) instead of the name itself.
In a previous PR: https://github.com/llvm/llvm-project/pull/83878, the
intent was to make no functional changes, just refactor out the code for
reuse.
However, by creating `ObjCSelRefsSection` as a `SyntheticSection` - this
slightly changed the functionality of the application as the
`SyntheticSection` constructor registers the `SyntheticSection` as a
functional one - with an associated `SyntheticInputSection`.
With this change we remove this unintended consequence by making the
code not use a `SyntheticSection` as base, but just by having it be a
static helper.
Before this change, after `InputSection` objects are created, they need
to be added to the appropriate container for tracking.
The logic for selecting the appropriate container lives in `Driver.cpp`
/ `gatherInputSections`, where the `InputSection` is added to the
matching container depending on the input config and the type of
`InputSection`.
Also, multiple other locations also insert directly into `inputSections`
array - assuming that that is the appropriate container for the
`InputSection`'s they create. Currently this is the correct assumption,
however an upcoming feature will change this.
For an upcoming feature (relative method lists), we need to route
`InputSection`'s either to `inputSections` array or to a synthetic
section, depending on weather the relative method list optimization is
enabled or not.
We can achieve the above either by duplicating some of the logic or
refactoring the routing and `InputSection`'s and reusing that.
The refactoring & code sharing approach seems the correct way to go - as
such this diff performs the refactoring while not introducing any
functional changes. Later on we can just call `addInputSection` and not
have to worry about routing logic.
---------
This change adds a flag to lld to enable category merging for MachoO +
ObjC.
It adds the '-objc_category_merging' flag for enabling this option and
uses the existing '-no_objc_category_merging' flag for disabling it.
In ld64, this optimization is enabled by default, but in lld, for now,
we require explicitly passing the '-objc_category_merging' flag in order
to enable it.
Behavior: if in the same link unit, multiple categories are extending
the same class, then they get merged into a single category.
Ex: Cat1(method1+method2,protocol1) + Cat2(method3+method4,protocol2,
property1) = Cat1_2(method1+method2+method3+method4,
protocol1+protocol2, property1)
Notes on implementation decisions made in this diff:
There is a possibility to further improve the current implementation by
directly merging the category data into the base class (if the base
class is present in the link unit) - this improvement may be done as a
follow-up. This improved functionality is already present in ld64.
We do the merging on the raw inputSections - after dead-stripping
(categories can't be dead stripped anyway).
The changes are mostly self-contained to ObjC.cpp, except for adding a
new flag (linkerOptimizeReason) to ConcatInputSection and StringPiece to
mark that this data has been optimized away. Another way to do it would
have been to just mark the pieces as not 'live' but this would cause the
old symbols to show up in the linker map as being dead-stripped - even
if dead-stripping is disabled. This flag allows us to match the ld64
behavior.
Note: This is a re-land of
https://github.com/llvm/llvm-project/pull/82928 after fixing using
already freed memory in `generatedSectionData`. This issue was detected
by ASAN build.
---------
Co-authored-by: Alex B <alexborcan@meta.com>
This change adds a flag to lld to enable category merging for MachoO +
ObjC.
It adds the '-objc_category_merging' flag for enabling this option and
uses the existing '-no_objc_category_merging' flag for disabling it.
In ld64, this optimization is enabled by default, but in lld, for now,
we require explicitly passing the '-objc_category_merging' flag in order
to enable it.
Behavior: if in the same link unit, multiple categories are extending
the same class, then they get merged into a single category.
Ex: `Cat1(method1+method2,protocol1) + Cat2(method3+method4,protocol2,
property1) = Cat1_2(method1+method2+method3+method4,
protocol1+protocol2, property1)`
Notes on implementation decisions made in this diff:
1. There is a possibility to further improve the current implementation
by directly merging the category data into the base class (if the base
class is present in the link unit) - this improvement may be done as a
follow-up. This improved functionality is already present in ld64.
2. We do the merging on the raw inputSections - after dead-stripping
(categories can't be dead stripped anyway).
3. The changes are mostly self-contained to ObjC.cpp, except for adding
a new flag (linkerOptimizeReason) to ConcatInputSection and StringPiece
to mark that this data has been optimized away. Another way to do it
would have been to just mark the pieces as not 'live' but this would
cause the old symbols to show up in the linker map as being
dead-stripped - even if dead-stripping is disabled. This flag allows us
to match the ld64 behavior.
---------
Co-authored-by: Alex B <alexborcan@meta.com>
Move symbol names from directly under `objc` scope to
`objc::symbol_names`.
Ex: `objc::klass` -> `objc::symbol_names::klass`
Co-authored-by: Alex B <alexborcan@meta.com>
This caused links to fail with:
lld/MachO/Symbols.cpp:97:
virtual uint64_t lld::macho::Defined::getVA() const:
Assertion `target->usesThunks()' failed.
or crash when asserts are disabled. See comment on
https://github.com/llvm/llvm-project/pull/79894
> Enable chained fixups in lld when all platform and version criteria are
> met. This is an attempt at simplifying the logic used in ld 907:
>
> 93d74eafc3/src/ld/Options.cpp (L5458-L5549)
>
> Some changes were made to simplify the logic:
> - only enable chained fixups for macOS from 13.0 to avoid the arch check
> - only enable chained fixups for iphonesimulator from 16.0 to avoid the
> arch check
> - don't enable chained fixups for not specifically listed platforms
> - don't enable chained fixups for arm64_32
This reverts commit 775c2856fb32868f357a3ce3f2b4139541e12578.
Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:
93d74eafc3/src/ld/Options.cpp (L5458-L5549)
Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32
This patch implements `-objc_stubs_small` targeting arm64, aiming to
align with ld64's behavior.
1. `-objc_stubs_fast`: As previously implemented, this always uses the
Global Offset Table (GOT) to invoke `objc_msgSend`. The alignment of the
objc stub is 32 bytes.
2. `-objc_stubs_small`: This behavior depends on whether `objc_msgSend`
is defined. If it is, it directly jumps to `objc_msgSend`. If not, it
creates another stub to indirectly jump to `objc_msgSend`, minimizing
the size. The alignment of the objc stub in this case is 4 bytes.
Find object files in library search path just like Apple's linker, this
makes building with some older MacOS SDKs easier since clang runs with
`-lcrt1.10.6.o`