670 Commits

Author SHA1 Message Date
Fangrui Song
5d5f16204f Move PowerPC-specific MCSymbolRefExpr::VariantKind to PPCMCExpr
Most changes are mechanical, except:

* ELFObjectWriter::shouldRelocateWithSymbol: .TOC.@tocbase does not
  register the undefined symbol.  Move the handling into the
  Sym->isUndefined() code path.
* ELFObjectWriter::fixSymbolsInTLSFixups's VK_PPC* cases are moved to
  PPCELFObjectWriter::getRelocType. We should do similar refactoring
  for other targets and eventually remove fixSymbolsInTLSFixups.

In the future, we should classify PPCMCExpr similar to AArch64MCExpr.
2025-03-12 23:00:03 -07:00
Fangrui Song
eea7d32bd2 [MC] Move fixSymbolsInTLSFixups to ELFObjectWriter
so that we only need to do it once during recordRelocation. In the
future, we should change fixSymbolsInTLSFixups to apply to MCValue
instead of MCExpr, similar to GNU assembler.
2025-03-12 19:49:52 -07:00
Arthur Eubanks
7aabbf22f0
[llvm][ELF] Separate out .dwo bytes written in stats (#126165)
So we can distinguish between debug info sections written to .dwo files
and those written to the object file.
2025-02-07 10:34:16 -08:00
Kazu Hirata
24892b8681
[MC] Avoid repeated hash lookups (NFC) (#123502) 2025-01-19 10:58:26 -08:00
Kazu Hirata
d73d5c8c9b
[MC] Remove unused includes (NFC) (#116317)
Identified with misc-include-cleaner.
2024-11-15 07:26:22 -08:00
Arthur Eubanks
e44ecf76e0
[llvm][ELF] Add ELF header/section header table size statistics (#109345)
Followup to #102363. This makes the `elf-object-writer.*Bytes` stats sum
up to `assembler.ObjectBytes`.
2024-09-23 13:53:58 -07:00
Arthur Eubanks
36293eea68
[NFC][ELF] Rename some ELFWriter methods (#109332)
More consistent casing + more accurate names.
2024-09-19 15:06:57 -07:00
Fangrui Song
59721f2326
[MIPS] Optimize sortRelocs for o32
The o32 ABI specifies:

> Each relocation type of R_MIPS_HI16 must have an associated R_MIPS_LO16 entry immediately following it in the list of relocations. [...] the addend AHL is computed as (AHI << 16) + (short)ALO

In practice, the high-part and low-part relocations may not be adjacent
in assembly files, requiring the assembler to reorder relocations.
http://reviews.llvm.org/D19718 performed the reordering, but did not
optimize for the common case where a %lo immediately follows its
matching %hi. The quadratic time complexity could make sections with
many relocations very slow to process.

This patch implements the fast path, simplifies the code, and makes the
behavior more similar to GNU assembler (for the .rel.mips_hilo_8b test).
We also remove `OriginalSymbol`, removing overhead for other targets.
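
The addend formula quoted from the o32 ABI can be sketched as follows. This is only an illustration of the arithmetic, not code from this patch; the helper names `sign_extend_16` and `combined_addend` are hypothetical.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed short."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def combined_addend(ahi: int, alo: int) -> int:
    """AHL = (AHI << 16) + (short)ALO, as quoted from the o32 ABI."""
    return ((ahi & 0xFFFF) << 16) + sign_extend_16(alo)


# A low half with bit 15 set is negative, so it subtracts from the high half.
print(hex(combined_addend(0x0001, 0x0004)))  # 0x10004
print(hex(combined_addend(0x0001, 0x8000)))  # 0x8000
```

Because the low half is sign-extended, the full addend cannot be computed from an R_MIPS_HI16 alone, which is why each high-part relocation needs its matching low-part relocation.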

Fix #104562

Pull Request: https://github.com/llvm/llvm-project/pull/104723
2024-08-23 00:05:20 -07:00
Fangrui Song
1a6bf94407 [MC] Remove ELFRelocationEntry::OriginalAddend
For MIPS's o32 ABI (REL), https://reviews.llvm.org/D19718 introduced
`OriginalAddend` to find the matching R_MIPS_LO16 relocation for
R_MIPS_GOT16 when STT_SECTION conversion is applicable.

    lw $2, %lo(local1)
    lui $2, %got(local1)

However, we could just store the original `Addend` in
`ELFRelocationEntry` and remove `OriginalAddend`.

Note: The relocation ordering algorithm in
https://reviews.llvm.org/D19718 is inefficient (#104562), which will be
addressed by another patch.
2024-08-18 15:47:38 -07:00
Arthur Eubanks
1baa6f75f3
[llvm][ELF] Add statistics on various section sizes (#102363)
Useful with other infrastructure that consumes LLVM statistics to get an
idea of the distribution of section sizes.

The breakdown of various section types is subject to change; this is
just an initial attempt at gathering some stats.

Example stats compiling X86ISelLowering.cpp (-g1):

```
        "elf-object-writer.AllocROBytes": 308268,
        "elf-object-writer.AllocRWBytes": 6240,
        "elf-object-writer.AllocTextBytes": 1659203,
        "elf-object-writer.DebugBytes": 3180386,
        "elf-object-writer.OtherBytes": 5862,
        "elf-object-writer.RelocationBytes": 2623440,
        "elf-object-writer.StrtabBytes": 228599,
        "elf-object-writer.SymtabBytes": 120336,
        "elf-object-writer.UnwindBytes": 85216,
```
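
As a rough sketch of how a consumer of these statistics might summarize them, the snippet below totals the example values above and computes each category's share. The dictionary is copied from the example output; the helper name `section_shares` is hypothetical and not part of LLVM.

```python
# Example values copied from the stats output above (X86ISelLowering.cpp, -g1).
stats = {
    "elf-object-writer.AllocROBytes": 308268,
    "elf-object-writer.AllocRWBytes": 6240,
    "elf-object-writer.AllocTextBytes": 1659203,
    "elf-object-writer.DebugBytes": 3180386,
    "elf-object-writer.OtherBytes": 5862,
    "elf-object-writer.RelocationBytes": 2623440,
    "elf-object-writer.StrtabBytes": 228599,
    "elf-object-writer.SymtabBytes": 120336,
    "elf-object-writer.UnwindBytes": 85216,
}


def section_shares(values: dict) -> dict:
    """Return each counter's fraction of the summed byte count."""
    total = sum(values.values())
    return {name: size / total for name, size in values.items()}


shares = section_shares(stats)
largest = max(shares, key=shares.get)
print(sum(stats.values()), largest)
```

For this example, debug sections and relocations together account for most of the counted bytes.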
2024-08-08 14:14:03 -07:00
Fangrui Song
5e1a5ffc2a [MC,ARM] Move SHF_ARM_PURECODE change for .text to ARMTargetELFStreamer::finish
and remove MCELFObjectWriter::addTargetSectionFlags.
2024-08-05 18:08:39 -07:00
Fangrui Song
2db576c8ef ELFObjectWriter: Remove unneeded subclasses 2024-07-23 00:15:29 -07:00
Fangrui Song
219d80bcb7 MCAssembler: Move FileNames and CompilerVersion to MCObjectWriter 2024-07-22 20:20:32 -07:00
Fangrui Song
9e97f80cc5 MCAssembler: Move Symvers to ELFObjectWriter
Similar to c473e75adeaf2998e4fb444b0bdbf2dd19312e50
2024-07-22 19:33:00 -07:00
Fangrui Song
c473e75ade MCAssembler: Move ELFHeaderEFlags to ELFObjectWriter
Now that MCELFStreamer can access ELFObjectWriter (commit
70c52b62c5669993e341664a63bfbe5245e32884), we can move ELFHeaderEFlags
there.
2024-07-22 18:20:18 -07:00
Fangrui Song
70c52b62c5 [MC] Export llvm::ELFObjectWriter
Similar to commit 28fcafb50274be2520117eacb0a886adafefe59d (2011) for
MachObjectWriter and commit 9539a7796094ff5fb59d9c685140ea2e214b945c for
WinCOFFObjectWriter.

MCELFStreamer can now access ELFObjectWriter directly without adding
ELF-specific markGnuAbi (https://reviews.llvm.org/D97976) and
setOverrideABIVersion to MCObjectWriter.

A few member variables have to be made public since we cannot use a
friend declaration for ELFWriter.
2024-07-22 16:18:25 -07:00
Fangrui Song
1952dba49a [MC,ELF] Extract CREL encoder code
The extracted ELFObjectWriter.cpp code will be reused by llvm-objcopy
support (#97521).
2024-07-08 09:28:09 -07:00
Alexis Engelke
d54802092d
[MC][ELF] Eliminate some hash maps from ELFObjectWriter (#97421)
Remove some maps. Mostly cleanup, only a slight performance win.

- Replace SectionIndexMap with layout order: The section layout order is
only used in MachO, so we can repurpose the field as section table
index.
- Store section offsets in MCSectionELF: No need for a map, and
especially not a std::map. Direct access to the underlying (and easily
modifiable) data structure is always faster.
- Improve storage of groups: There's no point in having a DenseMap, the
number of sections and groups is small enough to use vectors.
2024-07-03 20:15:29 +02:00
Fangrui Song
057f28be3e [MC] Remove unused MCAsmLayout declarations and includes 2024-07-01 17:47:13 -07:00
Fangrui Song
bd3215149a MCExpr::evaluateKnownAbsolute: replace the MCAsmLayout parameter with MCAssembler
and add a comment.
2024-07-01 16:45:57 -07:00
Fangrui Song
6b707a8cc1 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::executePostLayoutBinding 2024-07-01 10:47:46 -07:00
Fangrui Song
1b704e889f
[MC,llvm-readobj,yaml2obj] Support CREL relocation format
CREL is a compact relocation format for the ELF object file format.
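
Much of the compactness described in the linked RFC comes from storing fields as LEB128 variable-length integers. The snippet below is a generic LEB128 encoder sketch, not the CREL encoder from this patch; the function names are hypothetical, and the full CREL layout (header, delta encoding, flag bits) is deliberately not modeled.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's >> is an arithmetic shift for negative ints
        sign_bit = bool(byte & 0x40)
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)
```

Small values, which dominate relocation offset deltas and addends, take a single byte instead of a fixed 4 or 8 bytes.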

This patch adds integrated assembler support (using the RELA form)
available with `llvm-mc -filetype=obj -crel a.s -o a.o`.
A dependent patch will add `clang -c -Wa,--crel,--allow-experimental-crel`.

Also add llvm-readobj support (for both REL and RELA forms) to
facilitate testing the assembler. Additionally, yaml2obj gains support
for the RELA form to aid testing with llvm-readobj.

We temporarily assign the section type code 0x40000020 from the generic
range to `SHT_CREL`. We avoided using `SHT_LLVM_` or `SHT_GNU_` to
avoid code churn and maintain broader applicability for interested psABIs.
Similarly, `DT_CREL` is temporarily 0x40000026.

LLVM will change the code and break compatibility. This is not an issue
if all relocatable files using CREL are regenerated (aka no prebuilt
relocatable files).

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/91280
2024-07-01 10:32:02 -07:00
Fangrui Song
8e8c455a06 ELFObjectWriter: Use DenseMap+SmallVector. NFC 2024-07-01 10:15:29 -07:00
Fangrui Song
dce1828683 [MC] Remove the MCAsmLayout parameter from ELFObjectWriter 2024-07-01 10:13:03 -07:00
Fangrui Song
23e6224374 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::{writeObject,writeSectionData} 2024-07-01 10:04:59 -07:00
Fangrui Song
4289c422a8 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::recordRelocation 2024-06-30 22:13:54 -07:00
Fangrui Song
67957a45ee [MC] Start merging MCAsmLayout into MCAssembler
Follow-up to 10c894cffd0f4bef21b54a43b5780240532e44cf.

MCAsmLayout, introduced by ac8a95498a99eb16dff9d3d0186616645d200b6e
(2010), provides APIs to compute fragment/symbol/section offsets.
The separate class is cumbersome and passing it around has overhead.
Let's remove it as the underlying implementation is tightly coupled with
MCAsmLayout anyway.

Some forwarders are added to ease migration.
2024-06-30 16:10:27 -07:00
Fangrui Song
bc6d925528 [MC] Simplify isSymbolRefDifferenceFullyResolvedImpl overloads. NFC
The base implementation is simple. Just inline it.
2024-06-29 16:10:33 -07:00
Fangrui Song
fec1b6f9d3 [MC] Move ELFWriter::createMemtagRelocs to AArch64TargetELFStreamer::finish
Follow-up to https://reviews.llvm.org/D128958

* Move target-specific code away from the generic ELFWriter.
* All sections should have been created before MCAssembler::layout.
* Remove one `registerSection` use, which should be considered private to MCAssembler.
2024-06-23 21:14:34 -07:00
Fangrui Song
5997e7d748 Revert "[MC] Move ELFWriter::createMemtagRelocs to AArch64ELFStreamer::finishImpl"
This reverts commit 9d63506ddc6d60e220d967eb11779114075d401d.

There is a heap-use-after-free.
2024-06-23 20:25:11 -07:00
Fangrui Song
9d63506ddc [MC] Move ELFWriter::createMemtagRelocs to AArch64ELFStreamer::finishImpl
Follow-up to https://reviews.llvm.org/D128958

* Move target-specific code away from the generic ELFWriter.
* All sections should have been created before MCAssembler::layout.
* Remove one `registerSection` use, which should be considered private to MCAssembler.
2024-06-23 15:57:44 -07:00
Fangrui Song
bf67610a8a [MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name. They appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that they will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.
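
The effect of the rename can be sketched with a simplified model of --discard-locals, assuming the rule is "drop local symbols whose names start with `.L`". Real linkers and objcopy apply further conditions, and the helper name `is_discardable_local` is hypothetical.

```python
def is_discardable_local(name: str, binding: str = "LOCAL") -> bool:
    """Simplified -X rule: local symbols named like '.L*' are dropped."""
    return binding == "LOCAL" and name.startswith(".L")


# An empty name does not match the '.L' prefix, so it survives -X.
print(is_discardable_local(""))      # False
# The fake name ".L0 " (with a trailing space) matches and is dropped.
print(is_discardable_local(".L0 "))  # True
```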

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.

Note: bolt tests used /usr/bin/clang before
llvmorg-19-init-9532-g59bfc3106874.
The revert llvmorg-19-init-9531-g28b55342e1a8 actually broke
bolt/test/RISCV/fake-label-no-entry.c
2024-04-26 08:30:27 -07:00
Amir Ayupov
28b55342e1 Revert "[MC] Rename temporary symbols of empty name to ".L0 " (#89693)"
This reverts commit 96c45a7fa12619c3abd6b81effe4c80f0916b78b.

Broke BOLT builders and all pre-merge testing:
https://lab.llvm.org/buildbot/#/builders/244/builds/28097
2024-04-25 20:05:29 -07:00
Fangrui Song
b9f2c16b50 [MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name. They appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that they will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.
2024-04-24 16:25:45 -07:00
Mehdi Amini
9961311216
Revert "[MC] Rename temporary symbols of empty name to ".L0 "" (#90002)
Reverts llvm/llvm-project#89693

This broke the premerge bot (bolt tests failing)
2024-04-25 01:00:31 +02:00
Fangrui Song
96c45a7fa1
[MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name. They appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that they will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.
2024-04-24 13:16:02 -07:00
Fangrui Song
a41bfea5c0 [MC] Simplify ELFObjectWriter. NFC
And fix `if (hasRelocationAddend())` to `usesRela` to properly treat
SHT_LLVM_CALL_GRAPH_PROFILE as SHT_REL. The incorrect check does not cause a
problem because the synthesized SHT_LLVM_CALL_GRAPH_PROFILE has zero
addends.
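
To illustrate why the REL/RELA distinction matters for the record layout, here is a sketch of the two ELF64 relocation record formats, assuming a little-endian target. The `pack_reloc` helper is hypothetical and is not how ELFObjectWriter is structured.

```python
import struct

ELF64_REL = struct.Struct("<QQ")    # r_offset, r_info
ELF64_RELA = struct.Struct("<QQq")  # r_offset, r_info, r_addend


def pack_reloc(offset: int, sym: int, rtype: int, addend: int, uses_rela: bool) -> bytes:
    """Pack one ELF64 relocation record in REL or RELA form."""
    info = (sym << 32) | rtype
    if uses_rela:
        return ELF64_RELA.pack(offset, info, addend)
    # REL: the record has no addend field; any addend lives in the
    # relocated location itself, so it is not written here.
    return ELF64_REL.pack(offset, info)


print(len(pack_reloc(0, 1, 2, 0, uses_rela=False)))  # 16
print(len(pack_reloc(0, 1, 2, 0, uses_rela=True)))   # 24
```

With zero addends the two forms carry the same information, which is consistent with the old check being harmless for the synthesized section.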
2024-03-27 22:10:11 -07:00
Fangrui Song
3a63f737e2 [MC] Refactor writeRelocations. NFC
MIPS is different and is better off using separate code.
2024-03-23 10:15:47 -07:00
Fangrui Song
87c7f4a12b [MC] Remove unnecessary reversal of relocations. NFC
Commit f44db24e1fd948c75c87aea017646f16553d3361 (2015) enabled this
simplification.
2024-03-23 10:03:09 -07:00
Fangrui Song
a331937197 [MC] Move CompressDebugSections/RelaxELFRelocations from TargetOptions/MCAsmInfo to MCTargetOptions
The convention is for such MC-specific options to reside in
MCTargetOptions. However, CompressDebugSections/RelaxELFRelocations do
not follow the convention: `CompressDebugSections` is defined in both
TargetOptions and MCAsmInfo and there is forwarding complexity.

Move the option to MCTargetOptions and thereby simplify the code. Rename
the misleading RelaxELFRelocations to X86RelaxRelocations. llvm-mc
-relax-relocations and llc -x86-relax-relocations can now be unified.
2024-03-06 23:19:59 -08:00
Emma Pilkington
bc82cfb38d
[AMDGPU] Add an asm directive to track code_object_version (#76267)
Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
2024-01-21 11:54:47 -05:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
Fangrui Song
76a441a6ef [MC] Fix compression header size check in ELF writer
This is #66888 with a test. For MC we only use a zstd test, as zlib has
a lot of versions/forks with different speed/size tradeoffs, which would
make the test more brittle. If clang/test/Misc/cc1as-compress.s turns
out to be brittle, we could make the string longer.
2023-11-17 01:13:38 -08:00
Hans Wennborg
e96889d36f Revert "Fix compression header size check in ELF writer (#66888)"
This broke lit tests in zstd enabled builds, see comment on the PR.

> The test had 32-bit and 64-bit header sizes the wrong way around.

This reverts commit c5ecf5a130f087f493802800f3565c7bb75c238a.
2023-11-06 14:32:24 +01:00
myxoid
c5ecf5a130
Fix compression header size check in ELF writer (#66888)
The test had 32-bit and 64-bit header sizes the wrong way around.
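
For reference, the two compression header sizes can be derived from the ELF gABI field layouts, as sketched below. The `worth_compressing` comparison is an assumption about the general shape of such a check, not a copy of the ELF writer's code, and both helper names are hypothetical.

```python
import struct

ELF32_CHDR = struct.Struct("<III")   # ch_type, ch_size, ch_addralign
ELF64_CHDR = struct.Struct("<IIQQ")  # ch_type, ch_reserved, ch_size, ch_addralign


def chdr_size(is_64_bit: bool) -> int:
    """Size in bytes of Elf32_Chdr or Elf64_Chdr."""
    return ELF64_CHDR.size if is_64_bit else ELF32_CHDR.size


def worth_compressing(uncompressed_size: int, compressed_size: int, is_64_bit: bool) -> bool:
    """Keep the compressed form only if header plus payload is smaller."""
    return compressed_size + chdr_size(is_64_bit) < uncompressed_size


print(chdr_size(False), chdr_size(True))  # 12 24
```

Swapping the two sizes changes the outcome for small sections: a 100-byte section that compresses to 80 bytes is worth compressing with the 12-byte header but not with the 24-byte one.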
2023-11-05 15:33:20 -08:00
Kazu Hirata
4a0ccfa865 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-12 21:21:45 -07:00
Fangrui Song
5be7f2a943 [MC,AArch64] Suppress local symbol to STT_SECTION conversion for GOT relocations
Assemblers change certain relocations referencing a local symbol to
reference the section symbol instead. This conversion is disabled under
many conditions (`shouldRelocateWithSymbol`), e.g. for TLS symbols and,
on most targets (including AArch32, x86, PowerPC, and RISC-V), for
GOT-generating relocations.

However, AArch64 encodes the GOT-generating intent in MCValue::RefKind
instead of MCSymbolRef::Kind (see commit
0999cbd0b9ed8aa893cce10d681dec6d54b200ad (2014)), and is therefore not
affected by the code `case MCSymbolRefExpr::VK_GOT:`. As GNU ld and ld.lld
create GOT entries based on the symbol, ignoring the addend, the two ldr
instructions will share the same GOT entry, which is not expected:
```
ldr     x1, [x1, :got_lo12:x]  // converted to .data+0
ldr     x1, [x1, :got_lo12:y]  // converted to .data+4

.data
// .globl x, y  would suppress STT_SECTION conversion
x:
.zero 4
y:
.long 42
```

This patch changes AArch64 to suppress local symbol to STT_SECTION
conversion for GOT relocations, matching most other targets. x and y
will use different GOT entries, which IMO is the most sensible behavior.
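
The sharing problem can be modeled with a toy GOT allocator that keys entries by symbol and ignores the addend, as the message says GNU ld and ld.lld do. This is a simplified sketch, not linker code; `assign_got_slots` is a hypothetical name.

```python
def assign_got_slots(relocs):
    """Return the GOT slot index for each (symbol, addend) relocation.

    Slots are keyed by symbol only; the addend is ignored.
    """
    slots = {}
    result = []
    for symbol, _addend in relocs:
        slot = slots.setdefault(symbol, len(slots))
        result.append(slot)
    return result


# After STT_SECTION conversion, both loads reference the .data section symbol.
converted = [(".data", 0), (".data", 4)]
# With the conversion suppressed, the original symbols are kept.
preserved = [("x", 0), ("y", 0)]

print(assign_got_slots(converted))  # [0, 0]
print(assign_got_slots(preserved))  # [0, 1]
```

In the converted case, both loads resolve to one GOT entry, so the load meant for y would read x's address.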

With this change, the ABI decision on https://github.com/ARM-software/abi-aa/issues/217
will only affect relocations explicitly referencing STT_SECTION symbols, e.g.
```
ldr     x1, [x1, :got_lo12:(.data+0)]
ldr     x1, [x1, :got_lo12:(.data+4)]
// I consider these unreasonable uses
```

IMO all reasonable use cases are unaffected.

Link: https://github.com/llvm/llvm-project/issues/63418
GNU assembler PR: https://sourceware.org/bugzilla/show_bug.cgi?id=30788

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D158577
2023-08-29 11:07:12 -07:00
Arthur Eubanks
32ceee1303 Revert "[MC,x86-32] Remove a gold<2.34 workaround"
This reverts commit a699921baa91e6c2979ec0f0482430c57f51761d.

Seems to cause miscompiles (https://crbug.com/1459232), following up with author.
2023-06-29 21:19:07 -07:00
Elliot Goodrich
b0abd4893f [llvm] Add missing StringExtras.h includes
In preparation for moving the `#include "llvm/ADT/StringExtras.h"`
from the header `llvm/Support/Error.h` to its source file, first add
all the missing includes that were previously included transitively
through this header.
2023-06-25 15:42:22 +01:00
Fangrui Song
a699921baa [MC,x86-32] Remove a gold<2.34 workaround
This workaround appears to apply with gold<2.34 -O2/-O3 (linker -O2, not
compiler driver -O2). This used to be more visible as we used -Wl,-O3 in
CMake, but the option is generally not recommended and has been removed
by d63016a86548e8231002a760bbe9eb817cd1eb00 (Dec 2021).

This finishes the workaround removal started by D64327 (2019).

Link: https://github.com/llvm/llvm-project/issues/45269
2023-06-22 13:44:15 -07:00