35 Commits

Author SHA1 Message Date
alx32
2a3a79ce4c
[lld-macho][NFC] Preserve original symbol isec, unwindEntry and size (#88357)
Currently, when moving symbols from one `InputSection` to another (like
in ICF) we directly update the symbol's `isec`, `unwindEntry` and
`size`. By doing this we lose the original information. This information
will be needed in a future change. Since when moving symbols we always
set the symbol's `wasCoalesced` and `isec-> replacement`, we can just
use this info to conditionally get the information we need at access
time.
2024-04-18 11:42:22 -07:00
alx32
742a82a729
[lld-macho] Implement support for ObjC relative method lists (#86231)
The MachO format supports relative offsets for ObjC method lists. This
support is present already in ld64. With this change we implement this
support in lld also.

Relative method lists can be identified by a specific flag (0x80000000)
in the method list header. When this flag is present, the method list
will contain 32-bit relative offsets to the current Program Counter
(PC), instead of absolute pointers.
Additionally, when relative method lists are used, the offset to the
selector name will now be relative and point to the selector reference
(selref) instead of the name itself.
2024-03-27 14:34:27 -07:00
Jez Ng
dd78e7334f [lld-macho] Don't include zero-size private label symbols in map file
This is also what ld64 does. This will make it easier to compare their
respective map files.

Reviewed By: #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D145654
2023-03-11 01:40:14 -05:00
Jez Ng
5b21395cc2 [lld-macho] Print archive names in linker map
If a symbol is pulled in from an archive, we should include the archive
name in the map file output. This is what ld64 does.

Note that we aren't using `toString(InputFile*)` here because it
includes the install name for dylibs in its output, and ld64's map file
does not contain those.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D145623
2023-03-11 01:40:14 -05:00
Jez Ng
aa288fd984 [lld-macho] Emit map file entries for more synthetic sections
We now handle the GOT, TLV, and stubs/lazy pointer sections.

Reviewed By: #lld-macho, thevinster, thakis

Differential Revision: https://reviews.llvm.org/D139762
2022-12-21 17:28:18 -05:00
Jez Ng
41f90e970e Reland "[lld-macho] Emit map file entry for compact unwind info"
This reverts commit ac3096e1dd77a2687797d38976d5f8c93f7353e5.

The buildbot failure from the earlier patch set has been fixed by 7c7e39db7a.

Differential Revision: https://reviews.llvm.org/D137369
2022-12-06 16:41:47 -05:00
David Spickett
7c7e39db7a [lld-macho] Fix map file test on 32 bit hosts
The test added in https://reviews.llvm.org/D137368 has been failing
on our 32 bit arm bots:
https://lab.llvm.org/buildbot/#/builders/178/builds/3460

You get this for the strings:
<<dead>> 0x883255000000003 [ 10] literal string: Hello, it's me
Instead of the expected:
<<dead>>  0x0000000F  [  3] literal string: Hello, it's me

This is because unlike symbols whose size is a uint64_t, strings
use a StringRef whose size is size_t. size_t changes size between
32 and 64 bit platforms.

This fixes the test by using %z to print the size of the strings,
this works for 32 and 64 bit.
2022-12-06 10:45:07 +00:00
Jez Ng
7ca32bd402 Reland "[lld-macho] Overhaul map file code"
This reverts commit 38d6202a425462ce5923d038bc54532115a80a1f.

Differential Revision: https://reviews.llvm.org/D137368
2022-12-05 16:57:35 -05:00
Muhammad Omair Javaid
38d6202a42 Revert "[lld-macho] Overhaul map file code"
This reverts commit 213dbdbef0bad835abca0753f9e59b17dc2bcde2.
This patch series breaks lld:map-file.s on arm v7 linux buildbots.
e.g https://lab.llvm.org/buildbot/#/builders/178/builds/3190
2022-11-17 12:13:13 +04:00
Muhammad Omair Javaid
ac3096e1dd Revert "[lld-macho] Emit map file entry for compact unwind info"
This reverts commit 7f0779967f0690482c2cef70fc49e1381d32af1e.
This patch series breaks lld:map-file.s on arm v7 linux buildbots.
e.g https://lab.llvm.org/buildbot/#/builders/178/builds/3190
2022-11-17 12:13:13 +04:00
Jez Ng
7f0779967f [lld-macho] Emit map file entry for compact unwind info
Just like ld64 does.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D137369
2022-11-08 16:33:28 -05:00
Jez Ng
213dbdbef0 [lld-macho] Overhaul map file code
The previous map file code left out was modeled after LLD-ELF's
implementation. However, ld64's map file differs quite a bit from
LLD-ELF's. I've revamped our map file implementation so it is better
able to emit ld64-style map files.

Notable differences:
* ld64 doesn't demangle symbols in map files, regardless of whether
  `-demangle` is passed. So we don't have to bother with
  `getSymbolStrings()`.
* ld64 doesn't emit symbols in cstring sections; it emits just the
  literal values. Moreover, it emits these literal values regardless of
  whether they are labeled with a symbol.
* ld64 emits map file entries for things that are not strictly symbols,
  such as unwind info, GOT entries, etc. That isn't handled in this
  diff, but this redesign makes them easy to implement.

Additionally, the previous implementation sorted the symbols so as to
emit them in address order. This was slow and unnecessary -- the symbols
can already be traversed in address order by walking the list of
OutputSections. This brings significant speedups. Here's the numbers
from the chromium_framework_less_dwarf benchmark on my Mac Pro, with the
`-map` argument added to the response file:

             base            diff           difference (95% CI)
  sys_time   2.922 ± 0.059   2.950 ± 0.085  [  -0.7% ..   +2.5%]
  user_time  11.464 ± 0.191  8.290 ± 0.123  [ -28.7% ..  -26.7%]
  wall_time  11.235 ± 0.175  9.184 ± 0.169  [ -19.3% ..  -17.2%]
  samples    16              23

(It's worth noting that map files are written in parallel with the
output binary, but they often took longer to write than the binary
itself.)

Finally, I did further cleanups to the map-file.s test -- there was no
real need to have a custom-named section. There were also alt_entry
symbol declarations that had no corresponding definition. Either way,
neither custom-named sections nor alt_entry symbols trigger special code
paths in our map file implementation.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D137368
2022-11-08 16:33:22 -05:00
Jez Ng
39917b5e01 [lld-macho] Don't sort map file entries by name
ld64 emits them in address order but not in alphabetical order. This
sorting is particularly expensive for dead-stripped symbols (which don't
need to be sorted at all, unlike live symbols that need to be sorted by
address).

Timings for chromium_framework_less_dwarf (with the `-map` flag added to
the response file) on my 16-core Mac Pro:

             base           diff           difference (95% CI)
  sys_time   1.997 ± 0.038  2.004 ± 0.028  [  -0.6% ..   +1.3%]
  user_time  8.698 ± 0.085  8.167 ± 0.070  [  -6.6% ..   -5.6%]
  wall_time  7.965 ± 0.114  7.715 ± 0.347  [  -5.1% ..   -1.2%]
  samples    25             23

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D136536
2022-10-25 16:29:13 -04:00
Jez Ng
b945733026 [lld-macho] Map file should map symbols to their original bitcode file
... instead of mapping them to the intermediate object file.
This matches ld64.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D136380
2022-10-21 22:49:02 -04:00
Jez Ng
da374d180d [lld-macho][nfc] Update map file sample output in comment
Include symbol sizes (present after {D135883}) as well as an example of
a dead-stripped symbol.
2022-10-21 22:49:02 -04:00
Vy Nguyen
fc7a71890d [lld-macho][nfc] Clean up includes
- remove unused/duplicate includes
- reformatting/whitespaces

Differential Revision: https://reviews.llvm.org/D136266
2022-10-19 13:56:24 -04:00
Jez Ng
bdd0cec569 [lld-macho] Include symbol sizes in mapfile
This matches ld64's behavior.

Additionally, I edited the "Dead Stripped Symbols" header to omit "Address" --
this also matches ld64.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D135883
2022-10-13 16:44:29 -04:00
Jez Ng
3925ea4172 [lld-macho][nfci] Don't include null terminator in StringRefs
So @keith observed
[here](https://reviews.llvm.org/D128108#inline-1263900) that the
StringRefs we were returning from `CStringInputSection::getStringRef()`
included the null terminator in their total length, but regular
StringRefs do not. Let's fix that so these StringRefs are less confusing
to use.

Reviewed By: #lld-macho, keith, Roger

Differential Revision: https://reviews.llvm.org/D133728
2022-09-13 21:23:48 -04:00
Nico Weber
b99da9d255 [lld/mac] Use C++17 structured bindings
No behavior change.

Differential Revision: https://reviews.llvm.org/D131355
2022-08-08 07:21:28 -04:00
Nico Weber
7effcbda49 Rename parallelForEachN to just parallelFor
Patch created by running:

  rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'

No behavior change.

Differential Revision: https://reviews.llvm.org/D128140
2022-06-19 17:49:00 -04:00
Roger Kim
4f2c46c35c Print C-string literals in mapfile
This diff has the C-string literals printed into the mapfile in the symbol table like how ld64 does.

Here is what ld64's mapfile looks like with C-string literals:
```
# Path: out
# Arch: x86_64
# Object files:
[  0] linker synthesized
[  1] foo.o
# Sections:
# Address       Size            Segment Section
0x100003F7D     0x0000001D      __TEXT  __text
0x100003F9A     0x0000001E      __TEXT  __cstring
0x100003FB8     0x00000048      __TEXT  __unwind_info
# Symbols:
# Address       Size            File  Name
0x100003F7D     0x0000001D      [  1] _main
0x100003F9A     0x0000000E      [  1] literal string: Hello world!\n
0x100003FA8     0x00000010      [  1] literal string: Hello, it's me\n
0x100003FB8     0x00000048      [  0] compact unwind info
```

Here is what the new lld's Mach-O mapfile looks like:
```
# Path: /Users/rgr/local/llvm-project/build/Debug/tools/lld/test/MachO/Output/map-file.s.tmp/c-string-liter
al-out
# Arch: x86_64
# Object files:
[  0] linker synthesized
[  1] /Users/rgr/local/llvm-project/build/Debug/tools/lld/test/MachO/Output/map-file.s.tmp/c-string-literal
.o
# Sections:
# Address       Size            Segment Section
0x1000002E0     0x0000001D      __TEXT  __text
0x1000002FD     0x0000001D      __TEXT  __cstring
# Symbols:
# Address           File  Name
0x1000002E0     [  1] _main
0x1000002FD     [  1] literal string: Hello world!\n
0x10000030B     [  1] literal string: Hello, it's me\n
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D118077
2022-02-11 19:42:20 -05:00
Roger Kim
422084332a [lld][Macho] Include dead-stripped symbols in mapfile
ld64 outputs dead stripped symbols when using the -dead-strip flag. This change mimics that behavior for lld.

ld64's -dead_strip flag outputs:
```
$ ld -map map basics.o -o out -dead_strip -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem
$ cat map
# Path: out
# Arch: x86_64
# Object files:
[  0] linker synthesized
[  1] basics.o
# Sections:
# Address       Size            Segment Section
0x100003F97     0x00000021      __TEXT  __text
0x100003FB8     0x00000048      __TEXT  __unwind_info
0x100004000     0x00000008      __DATA_CONST    __got
0x100008000     0x00000010      __DATA  __ref_section
0x100008010     0x00000001      __DATA  __common
# Symbols:
# Address       Size            File  Name
0x100003F97     0x00000006      [  1] _ref_local
0x100003F9D     0x00000001      [  1] _ref_private_extern
0x100003F9E     0x0000000C      [  1] _main
0x100003FAA     0x00000006      [  1] _no_dead_strip_globl
0x100003FB0     0x00000001      [  1] _ref_from_no_dead_strip_globl
0x100003FB1     0x00000006      [  1] _no_dead_strip_local
0x100003FB7     0x00000001      [  1] _ref_from_no_dead_strip_local
0x100003FB8     0x00000048      [  0] compact unwind info
0x100004000     0x00000008      [  0] non-lazy-pointer-to-local: _ref_com
0x100008000     0x00000008      [  1] _ref_data
0x100008008     0x00000008      [  1] l_ref_data
0x100008010     0x00000001      [  1] _ref_com

# Dead Stripped Symbols:
#               Size            File  Name
<<dead>>        0x00000006      [  1] _unref_extern
<<dead>>        0x00000001      [  1] _unref_local
<<dead>>        0x00000007      [  1] _unref_private_extern
<<dead>>        0x00000001      [  1] _ref_private_extern_u
<<dead>>        0x00000008      [  1] _unref_data
<<dead>>        0x00000008      [  1] l_unref_data
<<dead>>        0x00000001      [  1] _unref_com
```

Reviewed By: int3, #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D114737
2022-01-28 10:51:27 -08:00
Roger Kim
f84023a812 [lld][macho] Stop grouping symbols by sections in mapfile.
As per [Bug 50689](https://bugs.llvm.org/show_bug.cgi?id=50689),

```
2. getSectionSyms() puts all the symbols into a map of section -> symbols, but this seems unnecessary. This was likely copied from the ELF port, which prints a section header before the list of symbols it contains. But the Mach-O map file doesn't print these headers.
```

This diff removes `getSectionSyms()` and keeps all symbols in a flat vector.

What does ld64's mapfile look like?
```
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin test.s -o test.o
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin foo.s -o foo.o
$ ld -map map test.o foo.o -o out -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem
```

```
[  0] linker synthesized
[  1] test.o
[  2] foo.o
0x100003FB7     0x00000001      __TEXT  __text
0x100003FB8     0x00000000      __TEXT  obj
0x100003FB8     0x00000048      __TEXT  __unwind_info
0x100004000     0x00000001      __DATA  __common
0x100003FB7     0x00000001      [  1] _main
0x100003FB8     0x00000000      [  2] _foo
0x100003FB8     0x00000048      [  0] compact unwind info
0x100004000     0x00000001      [  1] _number
```

Perf numbers when linking chromium framework on a 16-Core Intel Xeon W Mac Pro:
```
base           diff           difference (95% CI)
sys_time   1.406 ± 0.020  1.388 ± 0.019  [  -1.9% ..   -0.6%]
user_time  5.557 ± 0.023  5.914 ± 0.020  [  +6.2% ..   +6.6%]
wall_time  4.455 ± 0.041  4.436 ± 0.035  [  -0.8% ..   -0.0%]
samples    35             35
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D114735
2022-01-20 12:16:37 -08:00
Xuanda Yang
01cb9c5fc5 [lld][MachO] Sort symbols in parallel in -map
source: https://bugs.llvm.org/show_bug.cgi?id=50689

When writing a map file, sort symbols in parallel using parallelSort.
Use address name to break ties if two symbols have the same address.

Reviewed By: thakis, int3

Differential Revision: https://reviews.llvm.org/D104346
2021-06-17 10:19:59 +08:00
Jez Ng
b8bbb9723a [lld-macho][nfc] Put back shouldOmitFromOutput() asserts
I removed them in rG5de7467e982 but @thakis pointed out that
they were useful to keep, so here they are again. I've also converted
the `!isCoalescedWeak()` asserts into `!shouldOmitFromOutput()` asserts,
since the latter check subsumes the former.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104169
2021-06-16 15:23:04 -04:00
Jez Ng
e06b9ba485 [lld-macho] Reword comment for clarity 2021-06-14 13:47:25 -04:00
Jez Ng
5de7467e98 [lld-macho] Fix debug build
D103977 broke a bunch of stuff as I had only tested the release build
which eliminated asserts.

I've retained the asserts where possible, but I also removed a bunch
instead of adding a whole lot of verbose ConcatInputSection casts.
2021-06-11 20:21:27 -04:00
Nico Weber
a5645513db [lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324
2021-06-02 11:09:26 -04:00
Nico Weber
d5a70db193 [lld/mac] Write every weak symbol only once in the output
Before this, if an inline function was defined in several input files,
lld would write each copy of the inline function the output. With this
patch, it only writes one copy.

Reduces the size of Chromium Framework from 378MB to 345MB (compared
to 290MB linked with ld64, which also does dead-stripping, which we
don't do yet), and makes linking it faster:

        N           Min           Max        Median           Avg        Stddev
    x  10     3.9957051     4.3496981     4.1411121      4.156837    0.10092097
    +  10      3.908154      4.169318     3.9712729     3.9846753   0.075773012
    Difference at 95.0% confidence
            -0.172162 +/- 0.083847
            -4.14165% +/- 2.01709%
            (Student's t, pooled s = 0.0892373)

Implementation-wise, when merging two weak symbols, this sets a
"canOmitFromOutput" on the InputSection belonging to the weak symbol not put in
the symbol table. We then don't write InputSections that have this set, as long
as they are not referenced from other symbols. (This happens e.g. for object
files that don't set .subsections_via_symbols or that use .alt_entry.)

Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) --
  Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs)
  (that is, catch block unwind information) and Personality Routines
  associated with weak functions still not stripped. This is wasteful,
  but harmless.
- However, this does strip weaks from __unwind_info (which is needed for
  correctness and not just for size)
- This nopes out on InputSections that are referenced form more than
  one symbol (eg from .alt_entry) for now

Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just
  found it a bit more explicit)
- exports

Things that work with inputSections need to explicitly check if
an inputSection is written (e.g. unwind info).

This patch is useful in itself, but it's also likely also a useful foundation
for dead_strip.

I used to have a "canoncialRepresentative" pointer on InputSection instead of
just the bool, which would be handy for ICF too. But I ended up not needing it
for this patch, so I removed that again for now.

Differential Revision: https://reviews.llvm.org/D102076
2021-05-07 17:11:40 -04:00
Jez Ng
ed4a4e3312 [lld-macho][nfc] Add accessors for commonly-used PlatformInfo fields
As discussed here: https://reviews.llvm.org/D100523#inline-951543

Reviewed By: #lld-macho, thakis, alexshap

Differential Revision: https://reviews.llvm.org/D100978
2021-04-21 15:43:56 -04:00
Alexander Shaposhnikov
5c835e1ae5 [lld][MachO] Add support for LC_VERSION_MIN_* load commands
This diff adds initial support for the legacy LC_VERSION_MIN_* load commands.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D100523
2021-04-21 05:41:14 -07:00
Greg McGary
427d359721 [lld-macho][NFC] Drop unnecessary macho:: namespace prefix on unambiguous references to Symbol
Within `lld/macho/`, only `InputFiles.cpp` and `Symbols.h` require the `macho::` namespace qualifier to disambiguate references to `class Symbol`.

Add braces to outer `for` of a 5-level single-line `if`/`for` nest.

Differential Revision: https://reviews.llvm.org/D99555
2021-03-30 14:58:35 -07:00
Nico Weber
742f663705 fix comment typo to cycle bots 2021-03-29 14:35:57 -04:00
Jez Ng
4bcaafeb0e [lld-macho] Add more TimeTraceScopes
I added just enough to allow us to see a top-level breakdown of time taken. This
is the result of loading the time-trace output into `chrome:://tracing`:

ef5e8234f3/tracing.png

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D99311
2021-03-25 14:51:31 -04:00
caoming.roy
ed8bff13dc [lld-macho] implement options -map
Implement command-line options -map

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D98323
2021-03-18 10:39:19 -04:00