In the split-LTO-unit mode in ThinLTO, a compilation module is split
into two and global variables that meet a specific criteria is moved to
the split module.
d21fc58aee/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (L315-L366)
And if there is an originally local-linkage global value defined in the
original module and referenced in the split module or the vice versa,
that value is _promoted_ by attaching a module ID to their names in
order to prevent name clashes because now they can be referenced from
other modules.
d21fc58aee/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (L46-L100)
And when that promoted global value is a function, a
`.lto_set_conditional` entry is written to the original module to avoid
breaking references from inline assembly:
d21fc58aee/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (L84-L91)
The syntax of this is, if the original function name is `symbolA` and
the module ID is `123`,
```ll
module asm ".lto_set_conditional symbolA,symbolA.123"
```
These symbols are parsed here:
648981f913/llvm/lib/MC/MCParser/AsmParser.cpp (L6467)
The first function symbol in this `.lto_set_conditional` do not exist as
a function in the bitcode anymore because it was renamed to the second.
So they are not assigned as function symbols but they are not really
data either, so the object writer crashes here:
5b9e6c7993/llvm/lib/MC/WasmObjectWriter.cpp (L1820)
This PR makes the object writer just skip those symbols.
---
This problem was discovered when I was testing with
`-fwhole-program-vtables`. The reason we didn't have this problem before
with ThinLTO was because `-fsplit-lto-unit`, which splits LTO units when
possible, defaults to false, but it defaults to true when
`-fwhole-program-vtables` is used.
Some ELF targets don't use @ for relocation specifiers.
We should not report `error: invalid variant` when @ is used.
Attempt to make expr@specifier parsing less hacky.
Rework https://reviews.llvm.org/D31026
AllowAtInIdentifier is a misnomer: it should be false for ELF targets,
but is currently true as a hack to parse expr@specifier.
The existing pretty printer generates excessive parentheses for
MCBinaryExpr expressions. This update removes unnecessary parentheses
of MCBinaryExpr with +/- operators and MCUnaryExpr.
Since relocatable expressions only use + and -, this change improves
readability in most cases.
Examples:
- (SymA - SymB) + C now prints as SymA - SymB + C.
This updates the output of -fexperimental-relative-c++-abi-vtables for
AArch64 and x86 to `.long _ZN1B3fooEv@PLT-_ZTV1B-8`
- expr + (MCTargetExpr) now prints as expr + MCTargetExpr, with this
change primarily affecting AMDGPUMCExpr.
This MIPS behavior from edb9d84dcc4824865e86f963e52d67eb50dde7f5 (2010)
is obsoleted and misleading. This caused confusion in
https://reviews.llvm.org/D123702 ([NVPTX] Disable parens for identifiers
starting with '$')
Note: $tmp was rejected by AsmParser before
https://reviews.llvm.org/D75111 (2020)
clang -fexperimental-relative-c++-abi-vtables might generate `@plt` and
`@gotpcrel` specifiers in data directives. The syntax is not used in
humand-written assembly code, and is not supported by GNU assembler.
Note: the `@plt` in `.word foo@plt` is different from
the legacy `call func@plt` (where `@plt` is simply ignored).
The `@plt` syntax was selected was simply due to a quirk of AsmParser:
the syntax was supported by all targets until I updated it
to be an opt-in feature in a0671758eb6e52a758bd1b096a9b421eec60204c
RISC-V favors the `%specifier(expr)` syntax following MIPS and Sparc,
and we should follow this convention.
This PR adds support for `.word %pltpcrel(foo+offset)` and
`.word %gotpcrel(foo)`, and drops `@plt` and `@gotpcrel`.
* MCValue::SymA can no longer have a SymbolVariant. Add an assert
similar to that of AArch64ELFObjectWriter.cpp before
https://reviews.llvm.org/D81446 (see my analysis at
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers
if intrigued)
* `jump foo@plt, x31` now has a different diagnostic.
Pull Request: https://github.com/llvm/llvm-project/pull/132569
The directive temporarily switches to the .sxdata section to emit data,
and then calls `insert`, which makes `CurFrag` out of sync of the
current section. Call push/switch/pop instead.
Related to #132464
Option becomes: -instruction-tables=`<level>`
The choice of `<level>` controls number of printed information.
`<level>` may be `none` (default), `normal`, `full`.
Note: If the option is used without `<label>`, default is `normal`
(legacy).
When `<level>` is `full`, additional information are:
- `<Bypass Latency>`: Latency when a bypass is implemented between
operands
in pipelines (see SchedReadAdvance).
- `<LLVM Opcode Name>`: mnemonic plus operands identifier.
- `<Resources units>`: Used resources associated with LLVM Opcode.
- `<instruction comment>`: reports comment if any from source assembly.
Level `full` can be used to better check scheduling info when TableGen
is modified.
LLVM Opcode name help to find right instruction regexp to fix TableGen
Scheduling Info.
-instruction-tables=full option is validated on
AArch64/Neoverse/V1-sve-instructions.s
Follow up of MR #126703
---------
Co-authored-by: Julien Villette <julien.villette@sipearl.com>
Previously `MCSchedModel::getReciprocalThroughput` ignored
`AcquireAtCycle` completey, this patch fixes it by using the largest
`(ReleaseAtCycle - AcquireAtCycle) / NumUnits` as inverse throughput.
Here are some technical explanations:
https://myhsu.xyz/llvm-sched-interval-throughput
---------
Co-authored-by: Julien Villette <julien.villette@sipearl.com>
The parser function used for the .ifeqs and .ifnes directives was
missing the check for whether we are currently in an ignored block of an
outer conditional directive, causing the block to be evaluated when it
should not, for example:
.if 0
.ifeqs "a", "a"
// Should not be evaluated, but is
nop
.endif
.endif
And introduce MCValue::getAddSym & MCValue::getSubSym to simplify code.
We do not utilize the MCSymbol argument of needsRelocateWithSymbol
as it will go away in the future.
It's the decision of backend needsRelocateWithSymbol whether the
STT_SECTION adjustment should be suppressed.
test/MC/AArch64/data-directive-specifier.s demonstrates how to test this
property.
Finish the migration started by
eea7d32bd262bb5f61790c42ebaa147aa26c3979.
STT_TLS setting has been moved to backend getRelocType.
75f5a4f0dc7d96134cca86543ef3f86ef218ce77 migrated the last target, VE.
Similar to previous migration done for other targets (PowerPC, X86, ARM,
etc). Switch from the confusing VariantKind to Specifier, which aligns
with Arm and IBM AIX's documentation.
In addition, rename *MCExpr::getKind, which confusingly shadows the base class getKind.
In the future, relocation specifiers should be encoded as part of
SystemZMCExpr instead of MCSymbolRefExpr.
Move target-specific members outside of MCSymbolRefExpr::VariantKind
(a legacy interface I am eliminating). Most changes are mechanic,
except:
* ELFObjectWriter::shouldRelocateWithSymbol
* The legacy generic code uses `ELFObjectWriter::fixSymbolsInTLSFixups`
to set `STT_TLS` (and use an unnecessary expression walk). The better
way is to do this in `getRelocType`, which I have done for
AArch64, PowerPC, and RISC-V.
In the future, we should encode expressions with a relocation specifier
as X86MCExpr and use MCValue::RefKind to hold the specifier of the
relocatable expression.
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers
While here, rename "Modifier' to "Specifier":
> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.
Pull Request: https://github.com/llvm/llvm-project/pull/132149
Target shouldForceRelocation checks `FirstLiteralRelocationKind` to
determine whether a relocation is forced due to the .reloc directive. We
should move the code to evaluateFixup so that many targets don't need to
override shouldForceRelocation.
In `.equ a, 3; .if a@plt`, a@plt does not evaluate to an absolute value
(MCExpr::evaluateAsRelocatableImpl disables evaluation when the Kind !=
0 at parse time). Similarly, when using MCTargetValue,
evaluateAsAbsolute should return false when MCValue::RefKind==0.
This allows us to remove `if (!Asm)` check from MipsMCExpr.cpp
(%hi(0xdeadbeef) is not evaluated to a constant without RefKind) and
make targets less error-prone.
Commit 752b91bd821ad8a23e004b6cd631ae4f6984ae8b added the argument for
PowerPC to evaluate @l/@ha as constant in 2014. However, this is not
needed and has been cleaned up by
commit 8560da28c69de481f3ad147722577e87b902facb.
Mips also had an inappropriate use, which has been fixed by
79d84a878e83990c235da8710273a98bf835c915
GOFFOstream writes the physical 80 byte records. The records are
connected by flags indicating if there is a successor or a predecessor.
Using the length of the logical record is prone to errors. The new
implementation buffers the last physical record, and writes it out when
new data is written. In this way, the flags can be easily determined.
No obversable change in functionality, therefore no tests.
Most changes are mechanic, except:
* ELFObjectWriter::shouldRelocateWithSymbol: .TOC.@tocbase does not
register the undefined symbol. Move the handling into the
Sym->isUndefined() code path.
* ELFObjectWriter::fixSymbolsInTLSFixups's VK_PPC* cases are moved to
PPCELFObjectWriter::getRelocType. We should do similar refactoring
for other targets and eventually remove fixSymbolsInTLSFixups.
In the future, we should classify PPCMCExpr similar to AArch64MCExpr.
so that we only need to do it once during recordRelocation. In the
future, we should change fixSymbolsInTLSFixups to apply to MCValue
instead of MCExpr, similar to GNU assembler.
checkFeatures() currently goes through ApplyFeatureFlag(), which will
also handle implied features. This is very slow -- just querying every
feature once takes up 10% of a Rust hello world compile.
However, if we only want to query whether certain features are
set/unset, we can do so directly -- implied features have already been
handled when the FeatureBitset was constructed.
Reland https://github.com/llvm/llvm-project/pull/106230
The original PR was reverted due to compilation time regression.
This PR fixed that by adding a condition OutStreamer->isVerboseAsm() to
the generation of extra inlined-at debug info, so that it does not
affect normal compilation time.
Currently MC print source location of instructions in comments in
assembly when debug info is available, however, it does not include
inlined-at locations when a function is inlined.
For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.
This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
Currently MC print source location of instructions in comments in
assembly when debug info is available, however, it does not include
inlined-at locations when a function is inlined.
For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.
This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
The StringRef overload is often error-prone as users might forget to
register the MCSymbol.
Add comments to MCTargetExpr and MCSymbolRefExpr::VariantKind.
In the distant future the VariantKind parameter might be removed.
This commit adds support for WebAssembly's custom-page-sizes proposal to
`wasm-ld`. An overview of the proposal can be found
[here](https://github.com/WebAssembly/custom-page-sizes/blob/main/proposals/custom-page-sizes/Overview.md).
In a sentence, it allows customizing a Wasm memory's page size, enabling
Wasm to target environments with less than 64KiB of memory (the default
Wasm page size) available for Wasm memories.
This commit contains the following:
* Adds a `--page-size=N` CLI flag to `wasm-ld` for configuring the
linked Wasm binary's linear memory's page size.
* When the page size is configured to a non-default value, then the
final Wasm binary will use the encodings defined in the
custom-page-sizes proposal to declare the linear memory's page size.
* Defines a `__wasm_first_page_end` symbol, whose address points to the
first page in the Wasm linear memory, a.k.a. is the Wasm memory's page
size. This allows writing code that is compatible with any page size,
and doesn't require re-compiling its object code. At the same time,
because it just lowers to a constant rather than a memory access or
something, it enables link-time optimization.
* Adds tests for these new features.
r? @sbc100
cc @sunfishcode
They are error-prone as MCParser may parse a variant kind,
which cannot be handled by the target.
The replacement in MCAsmInfo should be used instead.
Follow-up to f244b8eed37a12539fb11b76e19ec7a7eb41dccc
52cf8e44880bcf614068b66b63393aa8da1edd76 (2013) introduced the
VK_PPC_TLSGD workaround to prevent unconditional reference to
_GLOBAL_OFFSET_TABLE_ in ELFObjectWriter.
e2b355d651ed8f2cbe61672c4c39b6419e471265 (2015) removed the
`_GLOBAL_OFFSET_TABLE_` hack for the generic VK_TLSGD,
making the VK_PPC_TLSGD workaround unneeded.