681 Commits

Author SHA1 Message Date
Fangrui Song
3ff634dee8 Revert "ELFObjectWriter: Disable STT_SECTION adjustment for .reloc"
This reverts commit 1c5961c0481b9c7421d38e3141d3c5a1e6084234.
Inadvertently pushed.
2025-04-12 21:02:27 -07:00
Fangrui Song
1c5961c048 ELFObjectWriter: Disable STT_SECTION adjustment for .reloc
to match GNU Assembler. This generalizes the SHT_LLVM_CALL_GRAPH_PROFILE
(which uses BFD_RELOC_NONE https://reviews.llvm.org/D104080) special case.
2025-04-12 21:00:25 -07:00
Fangrui Song
329a675006 ELFObjectWriter: Simplify STT_SECTION adjustment. NFC 2025-04-12 20:54:05 -07:00
Fangrui Song
7ccdc3d5ca [MC] Replace getSymA()->getSymbol() with getAddSym. NFC
We will replace the MCSymbolRefExpr member in MCValue with MCSymbol.
This change reduces dependence on MCSymbolRefExpr.
2025-04-05 13:16:25 -07:00
Fangrui Song
db603a09da [MC] Move ELF-specific handleAddSubRelocations to ELFObjectWriter::recordRelocation 2025-03-29 19:08:07 -07:00
Fangrui Song
b73e144bdf MCValue: Simplify code with getSubSym
MCValue::SymB is a MCSymbolRefExpr *, which might become MCSymbol * in
the future. Simplify some code that uses MCValue::SymB.
2025-03-23 12:13:13 -07:00
Fangrui Song
83c3ec1b07 [MC] Move isMemtag test to AArch64
And introduce MCValue::getAddSym & MCValue::getSubSym to simplify code.

We do not utilize the MCSymbol argument of needsRelocateWithSymbol
as it will go away in the future.
2025-03-23 11:59:21 -07:00
Fangrui Song
c39d393038 ELFObjectWriter: Remove relocation specifier test from shouldRelocateWithSymbol
It's the decision of backend needsRelocateWithSymbol whether the
STT_SECTION adjustment should be suppressed.

test/MC/AArch64/data-directive-specifier.s demonstrates how to test this
property.
2025-03-23 11:50:58 -07:00
Fangrui Song
8553fafff0 [MC] Remove ELFObjectWriter::fixSymbolsInTLSFixups
Finish the migration started by
eea7d32bd262bb5f61790c42ebaa147aa26c3979.
STT_TLS setting has been moved to backend getRelocType.
75f5a4f0dc7d96134cca86543ef3f86ef218ce77 migrated the last target, VE.
2025-03-22 20:10:31 -07:00
Fangrui Song
b19b6d9fab Move SystemZ-specific MCSymbolRefExpr::VariantKind to SystemZMCExpr::Specifier
Similar to previous migration done for other targets (PowerPC, X86, ARM,
etc). Switch from the confusing VariantKind to Specifier, which aligns
with Arm and IBM AIX's documentation.

In addition, rename *MCExpr::getKind, which confusingly shadows the base class getKind.

In the future, relocation specifiers should be encoded as part of
SystemZMCExpr instead of MCSymbolRefExpr.
2025-03-22 18:05:40 -07:00
Fangrui Song
c177dbe484
Move X86-specific MCSymbolRefExpr::VariantKind to X86MCExpr::Specifier
Move target-specific members outside of MCSymbolRefExpr::VariantKind
(a legacy interface I am eliminating). Most changes are mechanic,
except:

* ELFObjectWriter::shouldRelocateWithSymbol
* The legacy generic code uses `ELFObjectWriter::fixSymbolsInTLSFixups`
  to set `STT_TLS` (and use an unnecessary expression walk). The better
  way is to do this in `getRelocType`, which I have done for
  AArch64, PowerPC, and RISC-V.

In the future, we should encode expressions with a relocation specifier
as X86MCExpr and use MCValue::RefKind to hold the specifier of the
relocatable expression.
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers

While here, rename "Modifier' to "Specifier":

> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.

Pull Request: https://github.com/llvm/llvm-project/pull/132149
2025-03-20 20:28:49 -07:00
Fangrui Song
5d5f16204f Move PowerPC-specific MCSymbolRefExpr::VariantKind to PPCMCExpr
Most changes are mechanic, except:

* ELFObjectWriter::shouldRelocateWithSymbol: .TOC.@tocbase does not
  register the undefined symbol.  Move the handling into the
  Sym->isUndefined() code path.
* ELFObjectWriter::fixSymbolsInTLSFixups's VK_PPC* cases are moved to
  PPCELFObjectWriter::getRelocType. We should do similar refactoring
  for other targets and eventually remove fixSymbolsInTLSFixups.

In the future, we should classify PPCMCExpr similar to AArch64MCExpr.
2025-03-12 23:00:03 -07:00
Fangrui Song
eea7d32bd2 [MC] Move fixSymbolsInTLSFixups to ELFObjectWriter
so that we only need to do it once during recordRelocation. In the
future, we should change fixSymbolsInTLSFixups to apply to MCValue
instead of MCExpr, similar to GNU assembler.
2025-03-12 19:49:52 -07:00
Arthur Eubanks
7aabbf22f0
[llvm][ELF] Separate out .dwo bytes written in stats (#126165)
So we can distinguish between debug info sections written to .dwo files
and those written to the object file.
2025-02-07 10:34:16 -08:00
Kazu Hirata
24892b8681
[MC] Avoid repeated hash lookups (NFC) (#123502) 2025-01-19 10:58:26 -08:00
Kazu Hirata
d73d5c8c9b
[MC] Remove unused includes (NFC) (#116317)
Identified with misc-include-cleaner.
2024-11-15 07:26:22 -08:00
Arthur Eubanks
e44ecf76e0
[llvm][ELF] Add ELF header/section header table size statistics (#109345)
Followup to #102363. This makes the `elf-object-writer.*Bytes` stats sum
up to `assembler.ObjectBytes`.
2024-09-23 13:53:58 -07:00
Arthur Eubanks
36293eea68
[NFC][ELF] Rename some ELFWriter methods (#109332)
More consistent casing + more accurate names.
2024-09-19 15:06:57 -07:00
Fangrui Song
59721f2326
[MIPS] Optimize sortRelocs for o32
The o32 ABI specifies:

> Each relocation type of R_MIPS_HI16 must have an associated R_MIPS_LO16 entry immediately following it in the list of relocations. [...] the addend AHL is computed as (AHI << 16) + (short)ALO

In practice, the high-part and low-part relocations may not be adjacent
in assembly files, requiring the assembler to reorder relocations.
http://reviews.llvm.org/D19718 performed the reordering, but did not
optimize for the common case where a %lo immediately follows its
matching %hi. The quadratic time complexity could make sections with
many relocations very slow to process.

This patch implements the fast path, simplifies the code, and makes the
behavior more similar to GNU assembler (for the .rel.mips_hilo_8b test).
We also remove `OriginalSymbol`, removing overhead for other targets.

Fix #104562

Pull Request: https://github.com/llvm/llvm-project/pull/104723
2024-08-23 00:05:20 -07:00
Fangrui Song
1a6bf94407 [MC] Remove ELFRelocationEntry::OriginalAddend
For MIPS's o32 ABI (REL), https://reviews.llvm.org/D19718 introduced
`OriginalAddend` to find the matching R_MIPS_LO16 relocation for
R_MIPS_GOT16 when STT_SECTION conversion is applicable.

    lw $2, %lo(local1)
    lui $2, %got(local1)

However, we could just store the original `Addend` in
`ELFRelocationEntry` and remove `OriginalAddend`.

Note: The relocation ordering algorithm in
https://reviews.llvm.org/D19718 is inefficient (#104562), which will be
addressed by another patch.
2024-08-18 15:47:38 -07:00
Arthur Eubanks
1baa6f75f3
[llvm][ELF] Add statistics on various section sizes (#102363)
Useful with other infrastructure that consume LLVM statistics to get an
idea of distribution of section sizes.

The breakdown of various section types is subject to change, this is
just an initial go at gather some sort of stats.

Example stats compiling X86ISelLowering.cpp (-g1):

```
        "elf-object-writer.AllocROBytes": 308268,
        "elf-object-writer.AllocRWBytes": 6240,
        "elf-object-writer.AllocTextBytes": 1659203,
        "elf-object-writer.DebugBytes": 3180386,
        "elf-object-writer.OtherBytes": 5862,
        "elf-object-writer.RelocationBytes": 2623440,
        "elf-object-writer.StrtabBytes": 228599,
        "elf-object-writer.SymtabBytes": 120336,
        "elf-object-writer.UnwindBytes": 85216,
```
2024-08-08 14:14:03 -07:00
Fangrui Song
5e1a5ffc2a [MC,ARM] Move SHF_ARM_PUECODE change for .text to ARMTargetELFStreamer::finish
and remove MCELFObjectWriter::addTargetSectionFlags.
2024-08-05 18:08:39 -07:00
Fangrui Song
2db576c8ef ELFObjectWriter: Remove unneeded subclasses 2024-07-23 00:15:29 -07:00
Fangrui Song
219d80bcb7 MCAssembler: Move FileNames and CompilerVersion to MCObjectWriter 2024-07-22 20:20:32 -07:00
Fangrui Song
9e97f80cc5 MCAssembler: Move Symvers to ELFObjectWriter
Similar to c473e75adeaf2998e4fb444b0bdbf2dd19312e50
2024-07-22 19:33:00 -07:00
Fangrui Song
c473e75ade MCAssmembler: Move ELFHeaderEFlags to ELFObjectWriter
Now that MCELFStreamer can access ELFObjectWriter (commit
70c52b62c5669993e341664a63bfbe5245e32884), we can move ELFHeaderEFlags
there.
2024-07-22 18:20:18 -07:00
Fangrui Song
70c52b62c5 [MC] Export llvm::ELFObjectWriter
Similar to commit 28fcafb50274be2520117eacb0a886adafefe59d (2011) for
MachObjectWriter and commit 9539a7796094ff5fb59d9c685140ea2e214b945c for
WinCOFFObjectWriter.

MCELFStreamer can now access ELFObjectWriter directly without adding
ELF-specific markGnuAbi (https://reviews.llvm.org/D97976) and
setOverrideABIVersion to MCObjectWriter.

A few member variables have to be made public since we cannot use a
friend declaration for ELFWriter.
2024-07-22 16:18:25 -07:00
Fangrui Song
1952dba49a [MC,ELF] Extract CREL encoder code
The extracted ELFObjectWriter.cpp code will be reused by llvm-objcopy
support (#97521).
2024-07-08 09:28:09 -07:00
Alexis Engelke
d54802092d
[MC][ELF] Eliminate some hash maps from ELFObjectWriter (#97421)
Remove some maps. Mostly cleanup, only a slight performance win.

- Replace SectionIndexMap with layout order: The section layout order is
only used in MachO, so we can repurpose the field as section table
index.
- Store section offsets in MCSectionELF: No need for a map, and
especially not a std::map. Direct access to the underlying (and easily
modifyable) data structure is always faster.
- Improve storage of groups: There's no point in having a DenseMap, the
number of sections and groups are reasonably small to use vectors.
2024-07-03 20:15:29 +02:00
Fangrui Song
057f28be3e [MC] Remove unused MCAsmLayout declarations and includes 2024-07-01 17:47:13 -07:00
Fangrui Song
bd3215149a MCExpr::evaluateKnownAbsolute: replace the MCAsmLayout parameter with MCAssembler
and add a comment.
2024-07-01 16:45:57 -07:00
Fangrui Song
6b707a8cc1 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::executePostLayoutBinding 2024-07-01 10:47:46 -07:00
Fangrui Song
1b704e889f
[MC,llvm-readobj,yaml2obj] Support CREL relocation format
CREL is a compact relocation format for the ELF object file format.

This patch adds integrated assembler support (using the RELA form)
available with `llvm-mc -filetype=obj -crel a.s -o a.o`.
A dependent patch will add `clang -c -Wa,--crel,--allow-experimental-crel`.

Also add llvm-readobj support (for both REL and RELA forms) to
facilitate testing the assembler. Additionally, yaml2obj gains support
for the RELA form to aid testing with llvm-readobj.

We temporarily assign the section type code 0x40000020 from the generic
range to `SHT_CREL`. We avoided using `SHT_LLVM_` or `SHT_GNU_` to
avoid code churn and maintain broader applicability for interested psABIs.
Similarly, `DT_CREL` is temporarily 0x40000026.

LLVM will change the code and break compatibility. This is not an issue
if all relocatable files using CREL are regenerated (aka no prebuilt
relocatable files).

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/91280
2024-07-01 10:32:02 -07:00
Fangrui Song
8e8c455a06 ELFObjectWriter: Use DenseMap+SmallVector. NFC 2024-07-01 10:15:29 -07:00
Fangrui Song
dce1828683 [MC] Remove the MCAsmLayout parameter from ELFObjectWriter 2024-07-01 10:13:03 -07:00
Fangrui Song
23e6224374 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::{writeObject,writeSectionData} 2024-07-01 10:04:59 -07:00
Fangrui Song
4289c422a8 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::recordRelocation 2024-06-30 22:13:54 -07:00
Fangrui Song
67957a45ee [MC] Start merging MCAsmLayout into MCAssembler
Follow-up to 10c894cffd0f4bef21b54a43b5780240532e44cf.

MCAsmLayout, introduced by ac8a95498a99eb16dff9d3d0186616645d200b6e
(2010), provides APIs to compute fragment/symbol/section offsets.
The separate class is cumbersome and passing it around has overhead.
Let's remove it as the underlying implementation is tightly coupled with
MCAsmLayout anyway.

Some forwarders are added to ease migration.
2024-06-30 16:10:27 -07:00
Fangrui Song
bc6d925528 [MC] Simplify isSymbolRefDifferenceFullyResolvedImpl overloads. NFC
The base implementation is simple. Just inline it.
2024-06-29 16:10:33 -07:00
Fangrui Song
fec1b6f9d3 [MC] Move ELFWriter::createMemtagRelocs to AArch64TargetELFStreamer::finish
Follow-up to https://reviews.llvm.org/D128958

* Move target-specific code away from the generic ELFWriter.
* All sections should have been created before MCAssembler::layout.
* Remove one `registerSection` use, which should be considered private to MCAssembler.
2024-06-23 21:14:34 -07:00
Fangrui Song
5997e7d748 Revert "[MC] Move ELFWriter::createMemtagRelocs to AArch64ELFStreamer::finishImpl"
This reverts commit 9d63506ddc6d60e220d967eb11779114075d401d.

There is a heap-use-after-free.
2024-06-23 20:25:11 -07:00
Fangrui Song
9d63506ddc [MC] Move ELFWriter::createMemtagRelocs to AArch64ELFStreamer::finishImpl
Follow-up to https://reviews.llvm.org/D128958

* Move target-specific code away from the generic ELFWriter.
* All sections should have been created before MCAssembler::layout.
* Remove one `registerSection` use, which should be considered private to MCAssembler.
2024-06-23 15:57:44 -07:00
Fangrui Song
bf67610a8a [MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name, which appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.

Note: bolt tests used /usr/bin/clang before
llvmorg-19-init-9532-g59bfc3106874.
The revert llvmorg-19-init-9531-g28b55342e1a8 actually broke
bolt/test/RISCV/fake-label-no-entry.c
2024-04-26 08:30:27 -07:00
Amir Ayupov
28b55342e1 Revert "[MC] Rename temporary symbols of empty name to ".L0 " (#89693)"
This reverts commit 96c45a7fa12619c3abd6b81effe4c80f0916b78b.

Broke BOLT builders and all pre-merge testing:
https://lab.llvm.org/buildbot/#/builders/244/builds/28097
2024-04-25 20:05:29 -07:00
Fangrui Song
b9f2c16b50 [MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name, which appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.
2024-04-24 16:25:45 -07:00
Mehdi Amini
9961311216
Revert "[MC] Rename temporary symbols of empty name to ".L0 "" (#90002)
Reverts llvm/llvm-project#89693

This broke the premerge bot (bolt tests failing)
2024-04-25 01:00:31 +02:00
Fangrui Song
96c45a7fa1
[MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name, which appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.
2024-04-24 13:16:02 -07:00
Fangrui Song
a41bfea5c0 [MC] Simplify ELFObjectWriter. NFC
And fix `if (hasRelocationAddend())` to `usesRela` to properly treat
SHT_LLVM_CALL_GRAPH_PROFILE as SHT_REL. The incorrect does not cause a
problem because the synthesized SHT_LLVM_CALL_GRAPH_PROFILE has zero
addends.
2024-03-27 22:10:11 -07:00
Fangrui Song
3a63f737e2 [MC] Refactor writeRelocations. NFC
MIPS is different and should better off use separate code.
2024-03-23 10:15:47 -07:00
Fangrui Song
87c7f4a12b [MC] Remove unnecessary reversal of relocations. NFC
Commit f44db24e1fd948c75c87aea017646f16553d3361 (2015) enabled this
simplication.
2024-03-23 10:03:09 -07:00