This reapplies commit 1911a50fae8a441b445eb835b98950710d28fc88 with a
minor fix in lld/ELF/LTO.cpp which sets Options.BBAddrMap when
`--lto-basic-block-sections=labels` is passed.
Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.
This is one step toward eliminating global states.
Pull Request: https://github.com/llvm/llvm-project/pull/109612
(this is the part related to bolt, lld and mlir)
Without these explicit includes, removing other headers, who implicitly
include llvm-config.h, may have non-trivial side effects. For example,
`clangd` may report even `llvm-config.h` as "no used" in case it defines
a macro, that is explicitly used with #ifdef. It is actually amplified
with different build configs which use different set of macros.
Similar to commit 686cff17cc310884e48ae963bf7507f96950cc90 for SHT_REL (#57693).
CREL hasn't been tested with ICF before.
And avoid a pitfall that eqClass[0] might interfere with ICF.
When both CREL and the experimental lld partitions feature are enabled,
the relocation section may look like .crel.llvm_sympart.f1, and
`rels.relas` is empty. While here, support relocation sections with zero
entry.
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
llvm/Support/thread.h includes <thread>, which transitively includes
sstream in libc++ and uses ios_base::in, so we cannot use `#define in ctx.sec`.
`symtab, config, ctx` are now the only variables using
LLVM_LIBRARY_VISIBILITY.
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Strip calls to raw_string_ostream::str(), to avoid excess layer of indirection.
In AArch64, the endianness of instruction encodings is always little,
whereas the endianness of data swaps between LE and BE modes. So
getImplicitAddend must use the right one of read32() and read32le(), for
data and code respectively. It was using read32() throughout, causing
instructions to be read as big-endian in BE mode, getting the wrong
addend.
Fixed, and updated the existing test to check both endiannesses. The
expected results for data must be byte-swapped, but the ones for code
need no adjustment.
Fix the use-after-free bug and re-apply
https://github.com/llvm/llvm-project/pull/106193
* Without the fix, the string referenced by `objSym.Name` could be
destroyed even if string saver keeps a copy of the referenced string.
This caused use-after-free.
* The fix ([latest
commit](9776ed44cf))
updates `objSym.Name` to reference (via `StringRef`) the string saver's
copy.
Test:
1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible
with `-DLLVM_USE_SANITIZER=Address` and gone with the fix.
3. Run all tests by following
https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes.
* Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at
`@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined
[here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30)
* With the fix, the [multi-stage
test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh)
pass stage2 {asan, ubsan, masan}. This is also the test used by
https://lab.llvm.org/buildbot/#/builders/169
**Original commit message**
`StringMap<T>` creates a [copy of the
string](d4c519e7b2/llvm/include/llvm/ADT/StringMapEntry.h (L55-L58))
for entry insertions and intentionally keep copies [since the
implementation optimizes string memory
usage](d4c519e7b2/llvm/include/llvm/ADT/StringMap.h (L124)).
On the other hand, linker keeps copies of symbol names [1] in
`lld:🧝:parseFiles` [2] before invoking `compileBitcodeFiles` [3].
This change proposes to optimize away string copies inside
[LTO::GlobalResolutions](24e791b416/llvm/include/llvm/LTO/LTO.h (L409)),
which will make LTO indexing more memory efficient for ELF. There are
similar opportunities for other (COFF, wasm, MachO) formats.
The optimization takes place for lld (ELF) only. For the rest of use
cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep
copies and use global resolution key for de-duplication.
Together with @kazutakahirata's work to make `ComputeCrossModuleImport`
more memory efficient, we see a ~20% peak memory usage reduction in a
binary where peak memory usage needs to go down. Thanks to the
optimization in
329ba523cc,
the max (as opposed to the sum) of `ComputeCrossModuleImport` or
`GlobalResolution` shows up in peak memory usage.
* Regarding correctness, the set of
[resolved](80c47ad3ae/llvm/lib/LTO/LTO.cpp (L739))
[per-module
symbols](80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L188-L191))
is a subset of
[llvm::lto::InputFile::Symbols](80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L120)).
And bitcode symbol parsing saves symbol name when iterating
`obj->symbols` in `BitcodeFile::parse` already. This change updates
`BitcodeFile::parseLazy` to keep copies of per-module undefined symbols.
* Presumably the undefined symbols in a LTO unit (copied in this patch
in linker unique saver) is a small set compared with the set of symbols
in global-resolution (copied before this patch), making this a
worthwhile trade-off. Benchmarking this change alone shows measurable
memory savings across various benchmarks.
[1] ELF
1cea5c2138/lld/ELF/InputFiles.cpp (L1748)
[2]
ef7b18a53c/lld/ELF/Driver.cpp (L2863)
[3]
ef7b18a53c/lld/ELF/Driver.cpp (L2995)