llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 13:46:37 +00:00

Author	SHA1	Message	Date
Csanád Hajdú	2c1bdd4a08	[LLD][ELF] Allow merging XO and RX sections, and add `--[no-]xosegment` flag (#132412 ) Following from the discussion in #132224, this seems like the best approach to deal with a mix of XO and RX output sections in the same binary. This change will also simplify the implementation of the PURECODE section flag for AArch64. To control this behaviour, the `--[no-]xosegment` flag is added to LLD (similarly to `--[no-]rosegment`), which determines whether to allow merging XO and RX sections in the same segment. The default value is `--no-xosegment`, which is a breaking change compared to the previous behaviour. Release notes are also added, since this will be a breaking change.	2025-04-08 08:47:51 +02:00
Daniel Thornburgh	074af0f30f	[lld][ELF] Add --why-live flag (inspired by Mach-O) (#127112 ) This prints a stack of reasons that symbols that match the given glob(s) survived GC. It has no effect unless section GC occurs. This implementation does not require -ffunction-sections or -fdata-sections to produce readable results, althought it does tend to work better (as does GC). Details about the semantics: - Some chain of liveness reasons is reported; it isn't specified which chain. - A symbol or section may be live: - Intrisically (e.g., entry point) - Because needed by a live symbol or section - (Symbols only) Because part of a section live for another reason - (Sections only) Because they contain a live symbol - Both global and local symbols (`STB_LOCAL`) are supported. - References to symbol + offset are considered to point to: - If the referenced symbol is a section (`STT_SECTION`): - If a sized symbol encloses the referenced offset, the enclosing symbol. - Otherwise, the section itself, generically. - Otherwise, the referenced symbol.	2025-03-26 09:56:33 -07:00
Fangrui Song	0a470a9264	[ELF] --package-metadata: support %[0-9a-fA-F][0-9a-fA-F] (This application-specific option is probably not appropriate as a linker option (.o file offers more flexibility and decouples JSON verification from linkers). However, the option has gained some traction in Linux distributions, with support in GNU ld, gold, and mold.) GNU ld has supported percent-encoded bytes and extensions like `%[comma]` since November 2024. mold supports just percent-encoded bytes. To prepare for potential adoption by Ubuntu, let's support percent-encoded bytes. Link: https://sourceware.org/bugzilla/show_bug.cgi?id=32003 Link: https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/2071468 Pull Request: https://github.com/llvm/llvm-project/pull/126396	2025-02-10 09:21:31 -08:00
Fangrui Song	6ab034b828	[ELF] Add BPSectionOrderer options (#125559 ) Reland #120514 after 2f6e3df08a8b7cd29273980e47310cf09c6fdbd8 fixed iteration order issue and libstdc++/libc++ differences. --- Both options instruct the linker to optimize section layout with the following goals: * `--bp-compression-sort=[data\|function\|both]`: Improve Lempel-Ziv compression by grouping similar sections together, resulting in a smaller compressed app size. * `--bp-startup-sort=function --irpgo-profile=<file>`: Utilize a temporal profile file to reduce page faults during program startup. The linker determines the section order by considering three groups: * Function sections ordered according to the temporal profile (`--irpgo-profile=`), prioritizing early-accessed and frequently accessed functions. * Function sections. Sections containing similar functions are placed together, maximizing compression opportunities. * Data sections. Similar data sections are placed together. Within each group, the sections are ordered using the Balanced Partitioning algorithm. The linker constructs a bipartite graph with two sets of vertices: sections and utility vertices. * For profile-guided function sections: + The number of utility vertices is determined by the symbol order within the profile file. + If `--bp-compression-sort-startup-functions` is specified, extra utility vertices are allocated to prioritize nearby function similarity. * For sections ordered for compression: Utility vertices are determined by analyzing k-mers of the section content and relocations. The call graph profile is disabled during this optimization. When `--symbol-ordering-file=` is specified, sections described in that file are placed earlier. Co-authored-by: Pengying Xu <xpy66swsry@gmail.com>	2025-02-04 09:12:32 -08:00
Hans Wennborg	f3c4b58f4b	Revert "[ELF] Add BPSectionOrderer options (#120514 )" The ELF/bp-section-orderer.s test is failing on some buildbots due to what seems like non-determinism issues, see comments on the original PR and #125450 Reverting to green the build. This reverts commit 0154dce8d39d2688b09f4e073fe601099a399365 and follow-up commits 046dd4b28b9c1a75a96cf63465021ffa9fe1a979 and c92f20416e6dbbde9790067b80e75ef1ef5d0fa4.	2025-02-03 11:41:23 +01:00
Pengying Xu	0154dce8d3	[ELF] Add BPSectionOrderer options (#120514 ) Add new ELF linker options for profile-guided section ordering optimizations: - `--irpgo-profile=<file>`: Read IRPGO profile data for use with startup and compression optimizations - `--bp-startup-sort={none,function}`: Order sections based on profile data to improve star tup time - `--bp-compression-sort={none,function,data,both}`: Order sections using balanced partitioning to improve compressed size - `--bp-compression-sort-startup-functions`: Additionally optimize startup functions for compression - `--verbose-bp-section-orderer`: Print statistics about balanced partitioning section ordering Thanks to the @ellishg, @thevinster, and their team's work. --------- Co-authored-by: Fangrui Song <i@maskray.me>	2025-02-02 17:33:19 -08:00
Peter Collingbourne	64da33a589	ELF: Introduce --randomize-section-padding option. The --randomize-section-padding option randomly inserts padding between input sections using the given seed. It is intended to be used in A/B experiments to determine the average effect of a change on program performance, while controlling for effects such as false sharing in the cache which may introduce measurement bias. For more details, see the RFC: https://discourse.llvm.org/t/rfc-lld-feature-for-controlling-for-code-size-dependent-measurement-bias/83334 Reviewers: smithp35, MaskRay Reviewed By: MaskRay, smithp35 Pull Request: https://github.com/llvm/llvm-project/pull/117653	2024-12-13 11:52:09 -08:00
Enna1	fcfd64304f	[lld][ELF] Fix typo in help text for plugin-opt=opt-remarks-with-hotness (NFC) (#114016 )	2024-10-30 19:25:08 +08:00
Min-Yih Hsu	5fe852e774	[lld][ELF] Add `-plugin-opt=time-trace=` as an alias of `--time-trace=` (#106803 ) Time trace profiler support was added into LLVMgold in cd3255abede5e3687c1538f2d3857deb2c51af1b. This patch adds its `-plugin-opt` counterpart, which is just an alias to `--time-trace=`, into LLD for compatibility.	2024-09-01 17:38:59 -07:00
Joseph Huber	93e0ffa790	[lld] Add `--lto-emit-asm` and alias `--plugin-opt=emit-llvm` to it (#97469 ) Summary: The LTO pass currently supporting emitting LTO via the `--plugin-opt=emit-llvm` option. However, there is a very similar option called `--lto-emit-asm`. This patch just makes the usage more consistent and more obvious that emitting LLVM-IR is supported.	2024-07-02 15:35:51 -05:00
Tatsuyuki Ishi	1d96e4bc2d	[ELF] Change build-id default to sha1 (#93943 ) The current default, build-id=fast, is only 8 bytes due to the usage of 64-bit XXH3. This is incompatible with RPM packaging tools which requires >=16 bytes [1]. In Clang the ENABLE_LINKER_BUILD_ID define makes it pass --build-id without a specific hash type. When also defaulting to LLD, this provides a pretty broken default out-of-box. Using XXH3 was a considerable performance advantage when build-id was first implemented, because sha1 was really sha1 and rather slow. Nowadays sha1 is just 160-bit BLAKE3 which is decently fast and not cryptographically broken, so it should be a good default. Note that the default remains "fast" for wasm because sha1 for wasm is still real sha1. Close https://github.com/llvm/llvm-project/issues/43483. [1]: `b7d427728b/build/files.c (L1883)`	2024-06-10 10:14:44 -07:00
Fangrui Song	4d9020ca0b	[ELF] Implement --force-group-allocation GNU ld's relocatable linking behaviors: * Sections with the `SHF_GROUP` flag are handled like sections matched by the `--unique=pattern` option. They are processed like orphan sections and ignored by input section descriptions. * Section groups' (usually named `.group`) content is updated as the section indexes are updated. Section groups can be discarded with `/DISCARD/ : { (.group) }`. `-r --force-group-allocation` discards section groups and allows sections with the `SHF_GROUP` flag to be matched like normal sections. If two section group members are placed into the same output section, their relocation sections (if present) are combined as well. This behavior can be useful when -r output is used as a pseudo shared object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments). This patch implements --force-group-allocation: Input SHT_GROUP sections are discarded. * Input sections do not get the SHF_GROUP flag, so `addInputSec` will combine relocation sections if their relocated section group members are combined. The default behavior is: * Input SHT_GROUP sections are retained. * Input SHF_GROUP sections can be matched (unlike GNU ld) * Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec` will create different OutputDesc copies. GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not implemented. Pull Request: https://github.com/llvm/llvm-project/pull/94704	2024-06-07 14:19:06 -07:00
Fangrui Song	cfa97699f7	[ELF] Retain uncompressed if compressed content is larger --compress-debug-sections in GNU ld, gas, and LLVM integrated assembler retain the uncompressed content if the compressed content is larger. This patch also updates the manpage (-O2 does not enable zlib level 6) and fixes a crash of --compress-sections when the uncompressed section is empty.	2024-05-22 15:55:21 -07:00
Daniel Thornburgh	66466ff151	Reland: [LLD] Implement --enable-non-contiguous-regions (#90007 ) When enabled, input sections that would otherwise overflow a memory region are instead spilled to the next matching output section. This feature parallels the one in GNU LD, but there are some differences from its documented behavior: - /DISCARD/ only matches previously-unmatched sections (i.e., the flag does not affect it). - If a section fails to fit at any of its matches, the link fails instead of discarding the section. - The flag --enable-non-contiguous-regions-warnings is not implemented, as it exists to warn about such occurrences. The implementation places stubs at possible spill locations, and replaces them with the original input section when effecting spills. Spilling decisions occur after address assignment. Sections are spilled in reverse order of assignment, with each spill naively decreasing the size of the affected memory regions. This continues until the memory regions are brought back under size. Spilling anything causes another pass of address assignment, and this continues to fixed point. Spilling after rather than during assignment allows the algorithm to consider the size effects of unspillable input sections that appear later in the assignment. Otherwise, such sections (e.g. thunks) may force an overflow, even if spilling something earlier could have avoided it. A few notable feature interactions occur: - Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the input section were actually placed there. - SHF_MERGE synthetic sections use the spill list of their first contained input section (the one that gives the section its name). - ICF occurs oblivious to spill sections; spill lists for merged-away sections become inert and are removed after assignment. - SHF_LINK_ORDER and .ARM.exidx are ordered according to the final section ordering, after all spilling has completed. - INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.	2024-05-13 11:06:54 -07:00
Daniel Thornburgh	81f34afa5c	Revert "[LLD] Implement --enable-non-contiguous-regions" (#92005 ) Reverts llvm/llvm-project#90007 Broke in merging I think.	2024-05-13 10:38:40 -07:00
Daniel Thornburgh	673114447b	[LLD] Implement --enable-non-contiguous-regions (#90007 ) When enabled, input sections that would otherwise overflow a memory region are instead spilled to the next matching output section. This feature parallels the one in GNU LD, but there are some differences from its documented behavior: - /DISCARD/ only matches previously-unmatched sections (i.e., the flag does not affect it). - If a section fails to fit at any of its matches, the link fails instead of discarding the section. - The flag --enable-non-contiguous-regions-warnings is not implemented, as it exists to warn about such occurrences. The implementation places stubs at possible spill locations, and replaces them with the original input section when effecting spills. Spilling decisions occur after address assignment. Sections are spilled in reverse order of assignment, with each spill naively decreasing the size of the affected memory regions. This continues until the memory regions are brought back under size. Spilling anything causes another pass of address assignment, and this continues to fixed point. Spilling after rather than during assignment allows the algorithm to consider the size effects of unspillable input sections that appear later in the assignment. Otherwise, such sections (e.g. thunks) may force an overflow, even if spilling something earlier could have avoided it. A few notable feature interactions occur: - Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the input section were actually placed there. - SHF_MERGE synthetic sections use the spill list of their first contained input section (the one that gives the section its name). - ICF occurs oblivious to spill sections; spill lists for merged-away sections become inert and are removed after assignment. - SHF_LINK_ORDER and .ARM.exidx are ordered according to the final section ordering, after all spilling has completed. - INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.	2024-05-13 10:30:50 -07:00
Fangrui Song	04d0a691af	[ELF] Fix --compress-debug-sections=zstd when zlib is disabled	2024-05-07 16:56:45 -07:00
Fangrui Song	6d44a1ef55	[ELF] Adjust --compress-sections to support compression level zstd excels at scaling from low-ratio-very-fast to high-ratio-pretty-slow. Some users prioritize speed and prefer disk read speed, while others focus on achieving the highest compression ratio possible, similar to traditional high-ratio codecs like LZMA. Add an optional `level` to `--compress-sections` (#84855) to cater to these diverse needs. While we initially aimed for a one-size-fits-all approach, this no longer seems to work. (https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html) When --compress-debug-sections is used together, make --compress-sections take precedence since --compress-sections is usually more specific. Remove the level distinction between -O/-O1 and -O2 for --compress-debug-sections=zlib for a more consistent user experience. Pull Request: https://github.com/llvm/llvm-project/pull/90567	2024-05-01 11:40:46 -07:00
Fangrui Song	f02a27df2f	[ELF] Add --default-script/-dT GNU ld added --default-script (alias: -dT) in 2007. The option specifies a default script that is processed if --script/-T is not specified. -dT can be used to override GNU ld's internal linker script, but only when the application does not specify -T. In addition, dynamorio's CMakeLists.txt may use -dT. The implementation is simple and the feature can be useful to dabble with different section layouts. Pull Request: https://github.com/llvm/llvm-project/pull/89327	2024-04-19 09:09:41 -07:00
cmtice	16711b431b	[lld][ELF] Add --debug-names to create merged .debug_names. (#86508 ) `clang -g -gpubnames` (with optional -gsplit-dwarf) creates the `.debug_names` section ("per-CU" index). By default lld concatenates input `.debug_names` sections into an output `.debug_names` section. LLDB can consume the concatenated section but the lookup performance is not good. This patch adds --debug-names to create a per-module index by combining the per-CU indexes into a single index that covers the entire load module. The produced `.debug_names` is a replacement for `.gdb_index`. Type units (-fdebug-types-section) are not handled yet. Co-authored-by: Fangrui Song <i@maskray.me> --------- Co-authored-by: Fangrui Song <i@maskray.me>	2024-04-18 14:41:14 -07:00
Fangrui Song	e115c00565	[ELF] Reject certain unknown section types (#85173 ) Unknown section sections may require special linking rules, and rejecting such sections for older linkers may be desired. For example, if we introduce a new section type to replace a control structure (e.g. relocations), it would be nice for older linkers to reject the new section type. GNU ld allows certain unknown section types: * [SHT_LOUSER,SHT_HIUSER] and non-SHF_ALLOC * [SHT_LOOS,SHT_HIOS] and non-SHF_OS_NONCONFORMING but reports errors and stops linking for others (unless --no-warn-mismatch is specified). Port its behavior. For convenience, we additionally allow all [SHT_LOPROC,SHT_HIPROC] types so that we don't have to hard code all known types for each processor. Close https://github.com/llvm/llvm-project/issues/84812	2024-03-15 09:50:23 -07:00
Fangrui Song	f1ca2a0967	[ELF] Add --compress-section to compress matched non-SHF_ALLOC sections --compress-sections <section-glib>=[none\|zlib\|zstd] is similar to --compress-debug-sections but applies to broader sections without the SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is matched. An interesting use case is to compress `.strtab`/`.symtab`, which consume a significant portion of the file size (15.1% for a release build of Clang). An older revision is available at https://reviews.llvm.org/D154641 . This patch focuses on non-allocated sections for safety. Moving `maybeCompress` as D154641 does not handle STT_SECTION symbols for `-r --compress-debug-sections=zlib` (see `relocatable-section-symbol.s` from #66804). Since different output sections may use different compression algorithms, we need CompressedData::type to generalize config->compressDebugSections. GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452 Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674 Pull Request: https://github.com/llvm/llvm-project/pull/84855	2024-03-12 10:56:14 -07:00
Rahman Lavaee	acec6419e8	[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128 ) Today `-split-machine-functions` and `-fbasic-block-sections={all,list}` cannot be combined with `-basic-block-sections=labels` (the labels option will be ignored). The inconsistency comes from the way basic block address map -- the underlying mechanism for basic block labels -- encodes basic block addresses (https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). Specifically, basic block offsets are computed relative to the function begin symbol. This relies on functions being contiguous which is not the case for MFS and basic block section binaries. This means Propeller cannot use binary profiles collected from these binaries, which limits the applicability of Propeller for iterative optimization. To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section binaries, we propose modifying the encoding of this section as follows. First let us review the current encoding which emits the address of each function and its number of basic blocks, followed by basic block entries for each basic block. \| \| \| \|--\|--\| \| Address of the function \| Function Address \| \| Number of basic blocks in this function \| NumBlocks \| \| BB entry 1 \| BB entry 2 \| ... \| BB entry #NumBlocks To make this work for basic block sections, we treat each basic block section similar to a function, except that basic block sections of the same function must be encapsulated in the same structure so we can map all of them to their single function. We modify the encoding to first emit the number of basic block sections (BB ranges) in the function. Then we emit the address map of each basic block section section as before: the base address of the section, its number of blocks, and BB entries for its basic block. The first section in the BB address map is always the function entry section. \| \| \| \|--\|--\| \| Number of sections for this function \| NumBBRanges \| \| Section 1 begin address \| BaseAddress[1] \| \| Number of basic blocks in section 1 \| NumBlocks[1] \| \| BB entries for Section 1 \|..................\| \| Section #NumBBRanges begin address \| BaseAddress[NumBBRanges] \| \| Number of basic blocks in section #NumBBRanges \| NumBlocks[NumBBRanges] \| \| BB entries for Section #NumBBRanges The encoding of basic block entries remains as before with the minor change that each basic block offset is now computed relative to the begin symbol of its containing BB section. This patch adds a new boolean codegen option `-basic-block-address-map`. Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD flag `--lto-basic-block-address-map` are introduced. Analogously, we add a new TargetOption field `BBAddrMap`. This means BB address maps are either generated for all functions in the compiling unit, or for none (depending on `TargetOptions::BBAddrMap`). This patch keeps the functionality of the old `-fbasic-block-sections=labels` option but does not remove it. A subsequent patch will remove the obsolete option. We refactor the `BasicBlockSections` pass by separating the BB address map and BB sections handing to their own functions (named `handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers basic blocks and places them in their assigned sections. `handleBBAddrMap` is invoked after `handleBBSections` (if requested) and only renumbers the blocks. - New tests added: - Two tests basic-block-address-map-with-basic-block-sections.ll and basic-block-address-map-with-mfs.ll to exercise the combination of `-basic-block-address-map` with `-basic-block-sections=list` and '-split-machine-functions`. - A driver sanity test for the `-fbasic-block-address-map` option (basic-block-address-map.c). - An LLD test for testing the `--lto-basic-block-address-map` option. This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`. - Renamed and modified the two existing codegen tests for basic block address map (`basic-block-sections-labels-functions-sections.ll` and `basic-block-sections-labels.ll`) - Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of `SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2 will happen in a separate PR in a few months.	2024-02-01 17:50:46 -08:00
spupyrev	b53c04a8da	Reapply [ELF] Making cdsort default for function reordering (#68638 ) Edited lld/ELF/Options.td to cdsort as well CDSort function reordering outperforms the existing default heuristic ( hfsort/C^3) in terms of the performance of generated binaries while being (almost) as fast. Thus, the suggestion is to change the default. The speedup is up to 1.5% perf for large front-end binaries, and can be moderate/neutral for "small" benchmarks. High-level perf impact on two selected binaries: clang-10 binary (built with LTO+AutoFDO/CSSPGO): wins on top of C^3 in [0.3%..0.8%] rocksDB-8 binary (built with LTO+CSSPGO): wins on top of C^3 in [0.8%..1.5%] More detailed measurements on the clang binary is at [here](https://reviews.llvm.org/D152834#4445042)	2023-11-03 16:03:06 -07:00
spupyrev	904b3f66f5	[ELF] A new code layout algorithm for function reordering [3a/3] We are brining a new algorithm for function layout (reordering) based on the call graph (extracted from a profile data). The algorithm is an improvement of top of a known heuristic, C^3. It tries to co-locate hot and frequently executed together functions in the resulting ordering. Unlike C^3, it explores a larger search space and have an objective closely tied to the performance of instruction and i-TLB caches. Hence, the name CDS = Cache-Directed Sort. The algorithm can be used at the linking or post-linking (e.g., BOLT) stage. Refer to https://reviews.llvm.org/D152834 for the actual implementation of the reordering algorithm. This diff adds a linker option to replace the existing C^3 heuristic with CDS. The new behavior can be turned on by passing "--use-cache-directed-sort". (the plan is to make it default in a next diff) Perf-impact clang-10 binary (built with LTO+AutoFDO/CSSPGO): wins on top of C^3 in [0.3%..0.8%] rocksDB-8 binary (built with LTO+CSSPGO): wins on top of C^3 in [0.8%..1.5%] Note that function layout affects the perf the most on older machines (with smaller instruction/iTLB caches) and when huge pages are not enabled. The impact on newer processors with huge pages enabled is likely neutral/minor. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D152840	2023-09-26 06:24:34 -07:00
Fangrui Song	8c556b7e2b	[ELF] Change --call-graph-profile-sort to accept an argument Change the FF form --call-graph-profile-sort to --call-graph-profile-sort={none,hfsort}. This will be extended to support llvm/lib/Transforms/Utils/CodeLayout.cpp. --call-graph-profile-sort is not used in the wild but --no-call-graph-profile-sort is (Chromium). Make --no-call-graph-profile-sort an alias for --call-graph-profile-sort=none. Reviewed By: rahmanl Differential Revision: https://reviews.llvm.org/D159544	2023-09-25 09:49:40 -07:00
modimo	272bd6f9cc	[WPD][LLD] Add option to validate RTTI is enabled on all native types and prevent devirtualization on types with native RTTI Discussion about this approach: https://discourse.llvm.org/t/rfc-safer-whole-program-class-hierarchy-analysis/65144/18 When enabling WPD in an environment where native binaries are present, types we want to optimize can be derived from inside these native files and devirtualizing them can lead to correctness issues. RTTI can be used as a way to determine all such types in native files and exclude them from WPD providing a safe checked way to enable WPD. The approach is: 1. In the linker, identify if RTTI is available for all native types. If not, under `--lto-validate-all-vtables-have-type-infos` `--lto-whole-program-visibility` is automatically disabled. This is done by examining all .symtab symbols in object files and .dynsym symbols in DSOs for vtable (_ZTV) and typeinfo (_ZTI) symbols and ensuring there's always a match for every vtable symbol. 2. During thinlink, if `--lto-validate-all-vtables-have-type-infos` is set and RTTI is available for all native types, identify all typename (_ZTS) symbols via their corresponding typeinfo (_ZTI) symbols that are used natively or outside of our summary and exclude them from WPD. Testing: ninja check-all large Meta service that uses boost, glog and libstdc++.so runs successfully with WPD via --lto-whole-program-visibility. Previously, native types in boost caused incorrect devirtualization that led to crashes. Reviewed By: MaskRay, tejohnson Differential Revision: https://reviews.llvm.org/D155659	2023-09-18 15:51:49 -07:00
aeubanks	a07d4c0365	[lld/ELF,gold] Remove transitionary opaque pointer flags (#65529 ) This was only useful during the transition when mixing non-opaque-pointer and opaque-pointer IR, now everything uses opaque pointers.	2023-09-06 15:07:37 -07:00
Shoaib Meenai	97e39f96c8	[ELF] Add -Bsymbolic-non-weak This adds a new -Bsymbolic option that directly binds all non-weak symbols. There's a couple of reasons motivating this: * The new flag will match the default behavior on Mach-O, so you can get consistent behavior across platforms. * We have use cases for which making weak data preemptible is useful, but we don't want to pessimize access to non-weak data. (For a large internal app, we measured 2000+ data symbols whose accesses would be unnecessarily pessimized by `-Bsymbolic-functions`.) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D158322	2023-08-21 09:11:51 -07:00
Paul Kirth	14e3bec8fc	Reland "[lld] Preliminary fat-lto-object support" This patch adds support to lld for --fat-lto-objects. We add a new --fat-lto-objects option to LLD, and slightly change how it chooses input files in the driver when the option is set. Fat LTO objects contain both LTO compatible IR, as well as generated object code. This allows users to defer the choice of whether to use LTO or not to link-time. This is a feature available in GCC for some time, and makes the existing -ffat-lto-objects option functional in the same way as GCC's. If the --fat-lto-objects option is passed to LLD and the input files are fat object files, then the linker will chose the LTO compatible bitcode sections embedded within the fat object and link them together using LTO. Otherwise, standard object file linking is done using the assembly section in the object files. The previous version of this patch had a missing `REQUIRES: x86` line in `fatlto.invalid.s`. Additionally, it was reported that this patch caused a test failure in `export-dynamic-symbols.s`, however, 29112a994694baee070a2021e00f772f1913d214 disabled the `export-dynamic-symbols.s` test on Windows due to a quotation difference between platforms, unrelated to this patch. Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D146778	2023-08-18 22:51:25 +00:00
Paul Kirth	1733d94963	Revert "[lld] Preliminary fat-lto-object support" This reverts commit c9953d9891a6067549a78e7d07ca8eb6a7596792 and a forward fix in 3a45b843dec1bca195884aa1c5bc56bd0e6755b4. D14677 causes some failure on windows bots that the forward fix did not address. Thus I'm reverting until the underlying cause can me triaged.	2023-07-20 03:37:48 +00:00
Paul Kirth	3a45b843de	[lld] Preliminary fat-lto-object support This patch adds support to lld for --fat-lto-objects. We add a new --fat-lto-objects flag to LLD, and slightly change how it chooses input files in the driver when the flag is set. Fat LTO objects contain both LTO compatible IR, as well as generated object code. This allows users to defer the choice of whether to use LTO or not to link-time. This is a feature available in GCC for some time, and makes the existing -ffat-lto-objects flag functional in the same way as GCC's. If the --fat-lto-objects option is passed to LLD and the input files are fat object files, then the linker will chose the LTO compatible bitcode sections embedded within the fat object and link them together using LTO. Otherwise, standard object file linking is done using the assembly section in the object files. Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977 Depends on D146777 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D146778	2023-07-19 23:07:42 +00:00
Matthew Voss	ab9b3c84a5	[lld] A Unified LTO Bitcode Frontend The unified LTO pipeline creates a single LTO bitcode structure that can be used by Thin or Full LTO. This means that the LTO mode can be chosen at link time and that all LTO bitcode produced by the pipeline is compatible, from an optimization perspective. This makes the behavior of LTO a bit more predictable by normalizing the set of LTO features supported by each LTO bitcode file. Example usage: clang -flto -funified-lto -fuse-ld=lld foo.c clang -flto=thin -funified-lto -fuse-ld=lld foo.c clang -c -flto -funified-lto foo.c # -flto={full,thin} are identical in terms of compilation actions clang -flto=thin -fuse-ld=lld foo.o # pass --lto=thin to ld.lld clang -c -flto -funified-lto foo.c clang -flto -fuse-ld=lld foo.o The RFC discussing the details and rational for this change is here: https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774 Differential Revision: https://reviews.llvm.org/D123805	2023-07-18 16:13:58 -07:00
Amilendra Kodithuwakku	9acbab60e5	[LLD][ELF] Cortex-M Security Extensions (CMSE) Support This commit provides linker support for Cortex-M Security Extensions (CMSE). The specification for this feature can be found in ARM v8-M Security Extensions: Requirements on Development Tools. The linker synthesizes a security gateway veneer in a special section; `.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`, defined relative to the same text section and having the same address. The address of `<entry>` is retargeted to the starting address of the linker-synthesized security gateway veneer in section `.gnu.sgstubs`. In summary, the linker translates input: ``` .text entry: __acle_se_entry: [entry_code] ``` into: ``` .section .gnu.sgstubs entry: SG B.W __acle_se_entry .text __acle_se_entry: [entry_code] ``` If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker considers that `<entry>` already defines a secure gateway veneer so does not synthesize one. If `--out-implib=<out.lib>` is specified, the linker writes the list of secure gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway veneer <entry> at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with value `<addr>`. If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import library `<in.lib>` and preserves the entry function addresses in the resulting executable and new import library. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D139092	2023-07-06 11:34:07 +01:00
Simi Pallipurath	f146763e07	Revert "Revert "[lld][Arm] Big Endian - Byte invariant support."" This reverts commit d8851384c6ac2a1cea15e05228dbde5f13654e23. Reason: Applied the fix for the Asan buildbot failures.	2023-06-22 16:10:18 +01:00
Mitch Phillips	cd116e0460	Revert "Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support""" This reverts commit 9246df7049b0bb83743f860caff4221413c63de2. Reason: This patch broke the UBSan buildbots. See more information in the original phabricator review: https://reviews.llvm.org/D139092	2023-06-22 14:33:57 +02:00
Amilendra Kodithuwakku	9246df7049	Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support"" This reverts commit a685ddf1d104b3ce9d53cf420521f5aaff429630. This relands Arm CMSE support (D139092) and fixes the GCC build bot errors.	2023-06-21 22:27:13 +01:00
Amilendra Kodithuwakku	a685ddf1d1	Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support" This reverts commit c4fea3905617af89d1ad87319893e250f5b72dd6. I am reverting this for now until I figure out how to fix the build bot errors and warnings. Errors: llvm-project/lld/ELF/Arch/ARM.cpp:1300:29: error: expected primary-expression before ‘>’ token osec->writeHeaderTo<ELFT>(++sHdrs); Warnings: llvm-project/lld/ELF/Arch/ARM.cpp:1306:31: warning: left operand of comma operator has no effect [-Wunused-value]	2023-06-21 16:13:44 +01:00
Amilendra Kodithuwakku	c4fea39056	[LLD][ELF] Cortex-M Security Extensions (CMSE) Support This commit provides linker support for Cortex-M Security Extensions (CMSE). The specification for this feature can be found in ARM v8-M Security Extensions: Requirements on Development Tools. The linker synthesizes a security gateway veneer in a special section; `.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`, defined relative to the same text section and having the same address. The address of `<entry>` is retargeted to the starting address of the linker-synthesized security gateway veneer in section `.gnu.sgstubs`. In summary, the linker translates input: ``` .text entry: __acle_se_entry: [entry_code] ``` into: ``` .section .gnu.sgstubs entry: SG B.W __acle_se_entry .text __acle_se_entry: [entry_code] ``` If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker considers that `<entry>` already defines a secure gateway veneer so does not synthesize one. If `--out-implib=<out.lib>` is specified, the linker writes the list of secure gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway veneer <entry> at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with value `<addr>`. If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import library `<in.lib>` and preserves the entry function addresses in the resulting executable and new import library. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D139092	2023-06-21 14:47:34 +01:00
Simi Pallipurath	d8851384c6	Revert "[lld][Arm] Big Endian - Byte invariant support." This reverts commit 8cf8956897ce9bca3176c6339077b1ca17b27abc.	2023-06-20 17:27:44 +01:00
Simi Pallipurath	8cf8956897	[lld][Arm] Big Endian - Byte invariant support. Arm has BE8 big endian configuration called a byte-invariant(every byte has the same address on little and big-endian systems). When in BE8 mode: 1. Instructions are big-endian in relocatable objects but little-endian in executables and shared objects. 2. Data is big-endian. 3. The data encoding of the ELF file is ELFDATA2MSB. To support BE8 without an ABI break for relocatable objects,the linker takes on the responsibility of changing the endianness of instructions. At a high level the only difference between BE32 and BE8 in the linker is that for BE8: 1. The linker sets the flag EF_ARM_BE8 in the ELF header. 2. The linker endian reverses the instructions, but not data. This patch adds BE8 big endian support for Arm. To endian reverse the instructions we'll need access to the mapping symbols. Code sections can contain a mix of Arm, Thumb and literal data. We need to endian reverse Arm instructions as words, Thumb instructions as half-words and ignore literal data.The only way to find these transitions precisely is by using mapping symbols. The instruction reversal will need to take place after relocation. For Arm BE8 code sections (Section has SHF_EXECINSTR flag ) we inserted a step after relocation to endian reverse the instructions. The implementation strategy i have used here is to write all sections BE32 including SyntheticSections then endian reverse all code in InputSections via mapping symbols. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D150870	2023-06-20 14:08:21 +01:00
Petr Hosek	811cbfc262	[lld][ELF] Implement –print-memory-usage This option was introduced in GNU ld in https://sourceware.org/legacy-ml/binutils/2015-06/msg00086.html and is often used in embedded development. This change implements this option in LLD matching the GNU ld output verbatim. Differential Revision: https://reviews.llvm.org/D150644	2023-05-25 07:14:18 +00:00
Fangrui Song	39c20a63b1	[ELF] Add --remap-inputs= and --remap-inputs-file= --remap-inputs-file= can be specified multiple times, each naming a remap file that contains `from-glob=to-file` lines or `#`-led comments. ('=' is used a separator a la -fdebug-prefix-map=) --remap-inputs-file= can be used to: * replace an input file. E.g. `"/libz.so=exp/libz.so"` can replace a resolved `-lz` without updating the input file list or (if used) a response file. When debugging an application where a bug is isolated to one single input file, this option gives a convenient way to test fixes. remove an input file with `/dev/null` (changed to `NUL` on Windows), e.g. `"a.o=/dev/null"`. A build system may add unneeded dependencies. This option gives a convenient way to test the result removing some inputs. `--remap-inputs=a.o=aa.o` can be specified to provide one pattern without using an extra file. (bash/zsh process substitution is handy for specifying a pattern without using a remap file, e.g. `--remap-inputs-file=<(printf 'a.o=aa.o')`, but it may be unavailable in some systems. An extra file can be inconvenient for a build system.) Exact patterns are tested before wildcard patterns. In case of a tie, the first patterns wins. This is an implementation detail that users should not rely on. Co-authored-by: Marco Elver <elver@google.com> Link: https://discourse.llvm.org/t/rfc-support-exclude-inputs/70070 Reviewed By: melver, peter.smith Differential Revision: https://reviews.llvm.org/D148859	2023-04-26 13:18:55 -07:00
Craig Topper	85444794cd	[lld][RISCV] Implement GP relaxation for R_RISCV_HI20/R_RISCV_LO12_I/R_RISCV_LO12_S. This implements support for relaxing these relocations to use the GP register to compute addresses of globals in the .sdata and .sbss sections. This feature is off by default and must be enabled by passing --relax-gp to the linker. The GP register might not always be the "global pointer". It can be used for other purposes. See discussion here https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/371 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D143673	2023-04-13 10:52:15 -07:00
Scott Linder	45ee0a9afc	[LLD] Add --lto-CGO[0-3] option Allow controlling the CodeGenOpt::Level independent of the LTO optimization level in LLD via new options for the COFF, ELF, MachO, and wasm frontends to lld. Most are spelled as --lto-CGO[0-3], but COFF is spelled as -opt:lldltocgo=[0-3]. See D57422 for discussion surrounding the issue of how to set the CG opt level. The ultimate goal is to let each function control its CG opt level, but until then the current default means it is impossible to specify a CG opt level lower than 2 while using LTO. This option gives the user a means to control it for as long as it is not handled on a per-function basis. Reviewed By: MaskRay, #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D141970	2023-02-15 17:34:35 +00:00
Fangrui Song	c8aedf98ec	[ELF] Fix help message for --lto-pgo-warn-mismatch	2023-02-03 00:04:50 -08:00
Arthur Eubanks	ae5efe9761	[lld] Remove transitional legacy pass manager flags Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D142571	2023-01-26 11:12:21 -08:00
Nikita Popov	d7cf7ab61c	[LLD] Remove no-opaque-pointers plugin option We always use opaque pointers. The opaque-pointers option is retained as a no-op, same as no-lto-legacy-pass-manager.	2023-01-25 12:29:59 +01:00
Dan Albert	241dbd3105	[ELF] Enable --no-undefined-version by default Allowing incorrect version scripts is not a helpful default. Flip that to help users find their bugs at build time rather than at run time. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D135402	2022-12-08 01:41:18 +00:00
Nico Weber	9ddce2834c	[lld/ELF] Make plugin-opt=jobs= help text refer to --thinlto-jobs=	2022-11-21 10:57:16 -05:00

1 2 3 4 5 ...

455 Commits