214 Commits

Author SHA1 Message Date
Daniel Thornburgh
074af0f30f
[lld][ELF] Add --why-live flag (inspired by Mach-O) (#127112)
This prints a stack of reasons that symbols that match the given glob(s)
survived GC. It has no effect unless section GC occurs.

This implementation does not require -ffunction-sections or
-fdata-sections to produce readable results, although it does tend to
work better with them (as does GC).

Details about the semantics:
- Some chain of liveness reasons is reported; it isn't specified which
  chain.
- A symbol or section may be live:
  - Intrinsically (e.g., entry point)
  - Because needed by a live symbol or section
  - (Symbols only) Because part of a section live for another reason
  - (Sections only) Because they contain a live symbol
- Both global and local symbols (`STB_LOCAL`) are supported.
- References to symbol + offset are considered to point to:
  - If the referenced symbol is a section (`STT_SECTION`):
    - If a sized symbol encloses the referenced offset, the enclosing
      symbol.
    - Otherwise, the section itself, generically.
  - Otherwise, the referenced symbol.
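
The reason chain described above can be pictured as a mark phase that records, for each newly live item, the item that made it live, and then walks those links back to a root. The following is a minimal sketch under that assumption, not lld's actual data structures; `LiveGraph`, `mark`, and `whyLive` are hypothetical names, and symbols and sections are modelled as plain strings.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model: each symbol or section is a named node; an edge means
// "a live source keeps the target live".
struct LiveGraph {
  std::unordered_map<std::string, std::vector<std::string>> edges;
  // For every live node, the node that first made it live ("" for roots).
  std::unordered_map<std::string, std::string> reason;

  void addEdge(const std::string &from, const std::string &to) {
    edges[from].push_back(to);
  }

  // Breadth-first mark from the intrinsically live roots, remembering one
  // (unspecified but valid) reason per node.
  void mark(const std::vector<std::string> &roots) {
    std::deque<std::string> queue;
    for (const std::string &r : roots)
      if (reason.emplace(r, "").second)
        queue.push_back(r);
    while (!queue.empty()) {
      std::string cur = queue.front();
      queue.pop_front();
      auto it = edges.find(cur);
      if (it == edges.end())
        continue;
      for (const std::string &next : it->second)
        if (reason.emplace(next, cur).second)
          queue.push_back(next);
    }
  }

  // Returns the chain from `name` back to a root; empty if `name` is dead.
  std::vector<std::string> whyLive(const std::string &name) const {
    std::vector<std::string> chain;
    auto it = reason.find(name);
    while (it != reason.end()) {
      chain.push_back(it->first);
      if (it->second.empty())
        break;
      it = reason.find(it->second);
    }
    return chain;
  }
};
```

Because only the first reason for each node is stored, the reported chain depends on traversal order, which matches the "it isn't specified which chain" semantics.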
2025-03-26 09:56:33 -07:00
Fangrui Song
3733ed6f1c [ELF] Introduce Symbol::isExported to cache includeInDynsym
isExported, intended to replace exportDynamic, is primarily set in two
locations, (a) after parseSymbolVersion and (b) during demoteSymbols.

In the future, we should try removing exportDynamic. Currently,
merging exportDynamic/isExported would cause
riscv-gp.s to fail:

* The first isExported computation considers the undefined symbol exported
* The symbol is then defined as a linker-synthesized symbol
* isExported remains true, while it should be false
2024-12-08 22:40:14 -08:00
Fangrui Song
2991a4e209 [ELF] Replace functions bAlloc/saver/uniqueSaver with member access 2024-11-16 22:34:13 -08:00
Fangrui Song
3b75a5c4c8 [ELF] Replace message(...) with Msg(ctx) 2024-11-16 15:34:42 -08:00
Fangrui Song
a626eb2a2f [ELF] Pass ctx to bAlloc/saver/uniqueSaver 2024-11-16 15:20:21 -08:00
Fangrui Song
58a971f42f [ELF] Replace context-less toString(x) with toStr(ctx, x)
so that we can remove the global `ctx` from toString implementations.
Rename lld::toString (to lld::elf::toStr) to simplify name lookup (we
have many llvm::toString and another lld::toString(const llvm::opt::Arg
&)).
2024-11-16 11:58:10 -08:00
Fangrui Song
dbd197118d [ELF] Pass Ctx & to Symbol 2024-10-11 23:34:43 -07:00
Fangrui Song
5c33424778 [ELF] Pass Ctx & to MarkLive 2024-09-29 15:32:16 -07:00
Fangrui Song
df0864e761 [ELF] Move elf::symtab into Ctx
Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.

This is one step toward eliminating global states.
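
As a rough illustration of the pattern (not the real lld definitions), a context object can own the table that used to be a file-scope variable, and functions can receive the context explicitly. The `SymbolTable` contents and the `define`/`isDefined` helpers below are hypothetical simplifications.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Simplified stand-in for a symbol table: name -> value.
struct SymbolTable {
  std::unordered_map<std::string, int> symbols;
};

// Instead of a global `symtab` variable, the context owns the table.
struct Ctx {
  std::unique_ptr<SymbolTable> symtab = std::make_unique<SymbolTable>();
};

// Functions take the context explicitly rather than reaching for a global.
inline void define(Ctx &ctx, const std::string &name, int value) {
  ctx.symtab->symbols[name] = value;
}

inline bool isDefined(const Ctx &ctx, const std::string &name) {
  return ctx.symtab->symbols.count(name) != 0;
}
```

Two `Ctx` objects built this way do not share symbols, which is the property that global state prevents.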

Pull Request: https://github.com/llvm/llvm-project/pull/109612
2024-09-23 10:33:43 -07:00
Fangrui Song
6f482010ae [ELF] Replace config-> with ctx.arg. 2024-09-21 22:46:13 -07:00
Fangrui Song
40e8e4ddcb [ELF] Move partitions into ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
2024-09-15 14:52:56 -07:00
Fangrui Song
b4feb26606 [ELF] Move target to Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

Follow-up to driver (2022-10) and script (2024-08).
2024-08-21 23:53:36 -07:00
Fangrui Song
4629aa1797 [ELF] Move script into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

We now use default-initialization for `LinkerScript` and should pay
attention to non-class types (e.g. `dot` is initialized by commit
503907dc505db1e439e7061113bf84dd105f2e35).
2024-08-21 21:23:28 -07:00
Fangrui Song
0af07c0787
[ELF] Support relocatable files using CREL with explicit addends
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends is not supported.

---

Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.

(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, which compiles to a lot of code.)
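
For context, ULEB128 stores an integer in 7-bit groups, least significant group first, with the high bit of each byte marking continuation. A minimal unchecked decoder in the spirit of the note above might look like the sketch below. `readUleb128` is a hypothetical name, not lld's helper, and it deliberately omits bounds and overflow checks.

```cpp
#include <cassert>
#include <cstdint>

// Decode one ULEB128 value starting at `p` and advance `p` past it.
// Unchecked: the caller must guarantee the encoding is terminated and
// fits in 64 bits.
inline uint64_t readUleb128(const uint8_t *&p) {
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    // The low 7 bits carry payload; the high bit means "more bytes follow".
    value |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return value;
}
```

The loop body is a handful of instructions with a single branch, which is the kind of small code footprint the note is asking for.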

A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases memory usage, the conversion is only
performed when absolutely necessary.

---

Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).

* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.

SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
not yet supported.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/98115
2024-08-01 10:22:03 -07:00
Fangrui Song
0e47dfede4
[ELF] Add isStaticRelSecType to simplify SHT_REL/SHT_RELA testing. NFC
and make it easier to introduce a new relocation format.

https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/85893
2024-03-20 09:58:56 -07:00
Fangrui Song
f6455606bb [ELF] Move getSymbol/getRelocTargetSym from ObjFile<ELFT> to InputFile. NFC
This removes lots of unneeded `template getFile<ELFT>()`.
2024-03-10 23:01:26 -07:00
Amilendra Kodithuwakku
9acbab60e5 [LLD][ELF] Cortex-M Security Extensions (CMSE) Support
This commit provides linker support for Cortex-M Security Extensions (CMSE).
The specification for this feature can be found in ARM v8-M Security Extensions:
Requirements on Development Tools.

The linker synthesizes a security gateway veneer in a special section,
`.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`,
defined relative to the same text section and having the same address. The
address of `<entry>` is retargeted to the starting address of the
linker-synthesized security gateway veneer in section `.gnu.sgstubs`.

In summary, the linker translates input:

```
    .text
  entry:
  __acle_se_entry:
    [entry_code]

```
into:

```
    .section .gnu.sgstubs
  entry:
    SG
    B.W __acle_se_entry

    .text
  __acle_se_entry:
    [entry_code]
```

If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker
considers that `<entry>` already defines a secure gateway veneer so does not
synthesize one.
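
The pairing rule can be sketched as a lookup over a symbol map. This is only an illustration of the rule as stated in this message: `Sym` and `entriesNeedingVeneer` are hypothetical names, and error reporting and all other checks a real linker would perform are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified symbol record.
struct Sym {
  std::string section;
  uint64_t address;
};

// Return the entry names for which a veneer would be synthesized:
// `__acle_se_<entry>` and `<entry>` both exist, are defined relative to
// the same section, and have the same address.
inline std::vector<std::string>
entriesNeedingVeneer(const std::map<std::string, Sym> &syms) {
  const std::string prefix = "__acle_se_";
  std::vector<std::string> result;
  for (const auto &kv : syms) {
    if (kv.first.compare(0, prefix.size(), prefix) != 0)
      continue; // not an __acle_se_ symbol
    std::string entry = kv.first.substr(prefix.size());
    auto it = syms.find(entry);
    if (it == syms.end())
      continue; // no matching <entry> symbol
    if (it->second.section == kv.second.section &&
        it->second.address == kv.second.address)
      result.push_back(entry);
    // Otherwise <entry> is assumed to already define its own veneer.
  }
  return result;
}
```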

If `--out-implib=<out.lib>` is specified, the linker writes the list of secure
gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library
will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway
veneer `<entry>` at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with
value `<addr>`.

If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import
library `<in.lib>` and preserves the entry function addresses in the resulting
executable and new import library.

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D139092
2023-07-06 11:34:07 +01:00
Alex Brachet
3bd72b2a58 [ELF][NFC] Change comment terminology
Differential Revision: https://reviews.llvm.org/D153978
2023-06-29 11:22:00 +00:00
Mitch Phillips
cd116e0460 Revert "Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support"""
This reverts commit 9246df7049b0bb83743f860caff4221413c63de2.

Reason: This patch broke the UBSan buildbots. See more information in
the original phabricator review: https://reviews.llvm.org/D139092
2023-06-22 14:33:57 +02:00
Amilendra Kodithuwakku
9246df7049 Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support""
This reverts commit a685ddf1d104b3ce9d53cf420521f5aaff429630.

This relands Arm CMSE support (D139092) and fixes the GCC build bot errors.
2023-06-21 22:27:13 +01:00
Amilendra Kodithuwakku
a685ddf1d1 Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support"
This reverts commit c4fea3905617af89d1ad87319893e250f5b72dd6.

I am reverting this for now until I figure out how to fix
the build bot errors and warnings.

Errors:
llvm-project/lld/ELF/Arch/ARM.cpp:1300:29: error: expected primary-expression before ‘>’ token
 osec->writeHeaderTo<ELFT>(++sHdrs);

Warnings:
llvm-project/lld/ELF/Arch/ARM.cpp:1306:31: warning: left operand of comma operator has no effect [-Wunused-value]
2023-06-21 16:13:44 +01:00
Amilendra Kodithuwakku
c4fea39056 [LLD][ELF] Cortex-M Security Extensions (CMSE) Support
This commit provides linker support for Cortex-M Security Extensions (CMSE).
The specification for this feature can be found in ARM v8-M Security Extensions:
Requirements on Development Tools.

The linker synthesizes a security gateway veneer in a special section,
`.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`,
defined relative to the same text section and having the same address. The
address of `<entry>` is retargeted to the starting address of the
linker-synthesized security gateway veneer in section `.gnu.sgstubs`.

In summary, the linker translates input:

```
    .text
  entry:
  __acle_se_entry:
    [entry_code]

```
into:

```
    .section .gnu.sgstubs
  entry:
    SG
    B.W __acle_se_entry

    .text
  __acle_se_entry:
    [entry_code]
```

If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker
considers that `<entry>` already defines a secure gateway veneer so does not
synthesize one.

If `--out-implib=<out.lib>` is specified, the linker writes the list of secure
gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library
will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway
veneer `<entry>` at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with
value `<addr>`.

If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import
library `<in.lib>` and preserves the entry function addresses in the resulting
executable and new import library.

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D139092
2023-06-21 14:47:34 +01:00
Fangrui Song
8d85c96e0e [lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC
The latter form is now preferred to be similar to C++20 starts_with.
This replacement also removes one function call when startswith is not inlined.
2023-06-05 14:36:19 -07:00
Fangrui Song
2bf5d86422 [ELF] Change rawData to content() and data() to contentMaybeDecompress()
Clarify data() which may trigger decompression and make it feasible to refactor
the member variable rawData.
2022-11-20 22:43:22 +00:00
Fangrui Song
14f996dca8 [ELF] Move inputSections/ehInputSections into Ctx. NFC 2022-10-16 00:49:48 -07:00
Fangrui Song
9c626d4a0d [ELF] Remove symtab indirection. NFC
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.
2022-10-01 14:46:49 -07:00
Fangrui Song
34fa860048 [ELF] Remove ctx indirection. NFC
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection. We can move other global variables into ctx without
indirection concern. In the long term we may consider passing Ctx
as a parameter to various functions and eliminate global state as
much as possible and then remove `Ctx::reset`.
2022-10-01 12:06:33 -07:00
Fangrui Song
bc89663b69 [ELF] MarkLive: remove dead code from D24750. NFC 2022-09-05 00:01:09 -07:00
Sam Clegg
2cd4cd9a32 [lld][ELF] Rename SymbolTable::symbols() to SymbolTable::getSymbols(). NFC
This change renames this method to match its original name and the name
used in the wasm linker.

Back in d8f8abbd4a2823f223bd7bc56445541fb221b512 the ELF SymbolTable
method `getSymbols()` was replaced with `forEachSymbol`.

Then in a2fc96441788fba1e4709d63677f34ed8e321dae `forEachSymbol` was
replaced with a `llvm::iterator_range`.

Then in e9262edf0d11a907763098d8e101219ccd9c43e9 we came full circle
and the `llvm::iterator_range` was replaced with a `symbols()` accessor
that was identical to the original `getSymbols()`.

`getSymbols` also matches the name used elsewhere in the ELF linker as
well as in both the COFF and wasm backends (e.g. `InputFiles.h` and
`SyntheticSections.h`).

Differential Revision: https://reviews.llvm.org/D130787
2022-08-19 14:56:08 -07:00
Fangrui Song
3e9adff456 [ELF] Split EhInputSection::pieces into cies and fdes
This simplifies code, removes a read32 (for id==0 check), and makes it feasible
to combine some operations in EhInputSection::split and EhFrameSection::addRecords.

Mostly NFC, but fixes "Relocation not in any piece" assertion failure in an
erroneous case when a relocation offset precedes all CIE/FDE pieces.
2022-07-31 16:16:10 -07:00
Fangrui Song
c09d323599 [ELF] Move EhInputSection out of inputSections. NFC
inputSections temporarily contains EhInputSection objects mainly for
combineEhSections. Place EhInputSection objects into a new vector
ehInputSections instead of inputSections.
2022-07-31 11:58:08 -07:00
Fangrui Song
9a572164d5 [ELF] Move InputFiles global variables (memoryBuffers, objectFiles, etc) into Ctx. NFC 2022-06-29 18:53:38 -07:00
Fangrui Song
8565a87fd4 [ELF] Simplify MergeInputSection::getParentOffset. NFC
and remove overly verbose comments.
2022-03-28 10:02:35 -07:00
Fangrui Song
38fbedab32 [ELF] Don't rely on Symbols.h's transitive inclusion of InputFiles.h. NFC 2022-02-23 20:44:34 -08:00
Fangrui Song
47d18be58b [ELF] Remove SharedSymbol::getFile. NFC
Symbol.h depends on InputFiles.h. This change moves us toward dropping the
weird dependency.

The call sites will become slightly uglier (`cast<SharedFile>(s->file)`), but
the compromise is acceptable.
2022-02-23 17:57:52 -08:00
Fangrui Song
ae1ba6194f [ELF] Replace uncompressed InputSectionBase::data() with rawData. NFC
In many call sites we know uncompression cannot happen (non-SHF_ALLOC, or the
data (even if compressed) must have been uncompressed by a previous pass).
Prefer rawData in these cases. data() increases code size and prevents
optimization on rawData.
2022-02-21 00:39:26 -08:00
Fangrui Song
27bb799095 [ELF] Clean up headers. NFC 2022-02-07 21:53:34 -08:00
Fangrui Song
7cd0c45364 [ELF] Simplify SectionBase::partition handling and make it live by default. NFC
Previously an InputSectionBase is dead (`partition==0`) by default.
SyntheticSection calls markLive and BssSection overrides that with markDead.

It is more natural to make InputSectionBase live by default and let
--gc-sections mark InputSectionBase dead.

When linking a Release build of clang:

* --no-gc-sections: the removed `inputSections` loop decreases markLive time from 4ms to 1ms.
* --gc-sections: the extra `inputSections` loop increases markLive time from 0.181296s to 0.188526s.
  This is as if we lose the optimization of removing one `inputSections` loop (4374824ccf6e7ae68264d996a9ae5bb5e3be7fc5).
  I believe the loss can be mitigated if we refactor markLive.
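
The trade-off can be seen in a small model of the new default. This is a sketch only: `Section` and `markLive` here are hypothetical simplifications, with partition 0 meaning dead and 1 meaning live in the main partition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Section {
  uint8_t partition = 1; // live by default; 0 means dead
  bool isLive() const { return partition != 0; }
};

// Without GC nothing needs to be touched. With GC, every section is first
// marked dead (the extra loop), then the reachable ones are revived.
inline void markLive(std::vector<Section> &sections, bool gcSections,
                     const std::vector<size_t> &reachable) {
  if (!gcSections)
    return;
  for (Section &s : sections)
    s.partition = 0;
  for (size_t i : reachable)
    sections[i].partition = 1;
}
```

In this model the --no-gc-sections path does no per-section work at all, while the --gc-sections path pays for one additional pass over all sections.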
2022-01-30 15:12:09 -08:00
Alexandre Ganea
83d59e05b2 Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

The previous land f860fe362282ed69b9d4503a20e5d20b9a041189 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac9440a74b2e5b3fe3ff13ccdbf55af3.

Differential Revision: https://reviews.llvm.org/D108850
2022-01-20 14:53:26 -05:00
Alexandre Ganea
e6b153947d Revert [LLD] Remove global state in lldCommon
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16 11:03:06 -05:00
Alexandre Ganea
f860fe3622 [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

Differential Revision: https://reviews.llvm.org/D108850
2022-01-16 08:57:57 -05:00
Fangrui Song
ed67d5a03a [ELF] Switch cNamedSections to SmallVector. NFC
Make it smaller
2021-12-30 16:08:26 -08:00
Fangrui Song
de92a13fec [ELF] --gc-sections: Work around SHT_PROGBITS .init_array.N for Rust
See https://github.com/rust-lang/rust/issues/92181
2021-12-28 16:40:51 -08:00
Fangrui Song
464cc4c920 [ELF] Remove stale comment which was duplicated in MarkLive<ELFT>::run
Pointed out by thakis
2021-12-23 15:13:46 -08:00
Fangrui Song
4374824ccf [ELF] --gc-sections: combine two iterations over inputSections
There is a slight speed-up.
2021-12-23 09:53:08 -08:00
Fangrui Song
48161b7490 [ELF] --gc-sections: Work around SHT_PROGBITS .init_array
Older Go cmd/link used SHT_PROGBITS for .init_array.
Work around the lack of https://golang.org/cl/373734 for a while.
It does not generate .fini_array or .preinit_array.
2021-12-21 10:44:29 -08:00
Fangrui Song
7c0881a38f [ELF] --gc-sections: Change startswith(".jcr") to exact match
GNU ld's internal linker script keeps `.jcr`, but not other sections
starting with `.jcr`.
2021-12-15 01:27:08 -08:00
Fangrui Song
21dbfd4300 [ELF] --gc-sections: Change startswith(".init") (and ".fini") to exact match
GNU ld's internal linker script keeps `.init`, but not other sections starting
with `.init`. .fini is similar.
2021-12-15 01:16:26 -08:00
Fangrui Song
7a54ae9c1d [ELF] Change objectFiles to ELFFileBase *
This can sometimes avoid `cast<ObjFile<...>>`.

I intentionally do not touch postScanRelocations to wait for its stabilization.
2021-12-15 00:37:10 -08:00
Fangrui Song
ecc93ed2d7 [ELF] Replace InputBaseSection::{areRelocsRela,firstRelocation,numRelocation} with relSecIdx
For `InputSection` `.foo`, its `InputBaseSection::{areRelocsRela,firstRelocation,numRelocation}` basically
encode the information of `.rel[a].foo`. However, one uint32_t (the relocation section index)
suffices. See the implementation of `relsOrRelas`.

This change decreases sizeof(InputSection) from 184 to 176 on 64-bit Linux.

The maximum resident set size linking a large application (1.2G output) decreases by 0.39%.

Differential Revision: https://reviews.llvm.org/D112513
2021-10-27 09:51:07 -07:00