## Problem
The `safe_thunks` ICF optimization in `lld-macho` was creating thunks
that pointed to `InputSection`s instead of `Symbol`s. While branch
relocations can generally point to either symbols or input sections, in
this case they must point to symbols, because the branch range extension
algorithm that runs later expects branches to always target `Symbol`s.
## Solution
This patch changes the ICF implementation so that safe thunks point to
`Symbol`s rather than `InputSection`s.
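As a rough illustration of the distinction (types simplified; `Reloc::referent` really is a `PointerUnion` of `Symbol` and `InputSection` in lld/MachO, but `setThunkTarget` is a made-up helper):
```
// Types simplified for illustration.
#include "llvm/ADT/PointerUnion.h"

struct Symbol { void *impl; };
struct InputSection { void *impl; };

struct Reloc {
  // A branch relocation may name either a Symbol or an InputSection.
  llvm::PointerUnion<Symbol *, InputSection *> referent;
};

// The branch range extension pass assumes branch targets are Symbols, so a
// safe_thunks thunk must reference the master copy's Symbol, not the
// InputSection that contains it.
void setThunkTarget(Reloc &r, Symbol *masterCopy) { r.referent = masterCopy; }
```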
## Testing
The existing `arm64-thunks.s` test is modified to include
`--icf=safe_thunks` to explicitly verify the interaction between ICF and
branch range extension thunks. Two functions were added that will be
merged together via a thunk. Before this patch, this test would trigger
an assert; now this scenario is handled correctly.
ICF runs before BPSectionOrderer. When a section is ICF'ed, it seems
that the original sections are marked as not live, but are still kept
around. Prior to this patch, those ICF'ed sections would be passed to BP
and ordered before being skipped when writing the output. Now, these
sections are no longer passed to BP, saving runtime and possibly
improving BP's output.
In a large binary, I found that the number of sections ordered using BP
decreased, while the number of duplicate sections drastically decreased
as expected.
```
Functions for startup: 50755 -> 50520
Functions for compression: 165734 -> 105328
Duplicate functions: 1827231 -> 55230
```
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect referent to be nonnull.
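A minimal sketch of what this migration looks like, assuming a simplified referent union (the helper function is hypothetical):
```
// Types simplified for illustration; getReferentSymbol is hypothetical.
#include "llvm/ADT/PointerUnion.h"
#include "llvm/Support/Casting.h"

struct Symbol { void *impl; };
struct InputSection { void *impl; };
using Referent = llvm::PointerUnion<Symbol *, InputSection *>;

Symbol *getReferentSymbol(Referent referent) {
  // Previously: referent.dyn_cast<Symbol *>() (member form, soft deprecated).
  // Plain dyn_cast rather than dyn_cast_if_present is fine here because the
  // referent is expected to be nonnull.
  return llvm::dyn_cast<Symbol *>(referent);
}
```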
Exposed by the test added in the reverted #120514
* Fix libstdc++/libc++ differences due to nth_element. https://github.com/llvm/llvm-project/pull/125450#issuecomment-2631404178
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
The ELF/bp-section-orderer.s test is failing on some buildbots due to
what seem to be non-determinism issues; see the comments on the original
PR and #125450.
Reverting to get the build back to green.
This reverts commit 0154dce8d39d2688b09f4e073fe601099a399365 and
follow-up commits 046dd4b28b9c1a75a96cf63465021ffa9fe1a979 and
c92f20416e6dbbde9790067b80e75ef1ef5d0fa4.
PR #117514 refactored BPSectionOrderer to be used by the ELF port
but introduced some inefficiency:
* BPSectionBase/BPSymbol are wrappers around a single pointer.
The numbers of sections and symbols could be huge, and the extra
allocations are memory inefficient.
* Reconstructing the returned DenseMap (since BPSectionBase != InputSection)
is wasteful.
This patch refactors BPSectionOrderer with Curiously Recurring Template
Pattern and eliminates the inefficiency. In addition,
`symbolToSectionIdxs` is removed and `rootSymbolToSectionIdxs` building
is moved to lld/MachO: while getting sections for symbols is cheap in
Mach-O, it is awkward and inefficient in the ELF port.
While here, add a file-level comment and replace some `StringMap<*>`
(which copies strings) with `DenseMap<CachedHashStringRef, *>`.
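A minimal CRTP sketch of the shape described above; the names and hooks here are illustrative, not the actual lld interface:
```
// Names illustrative only.
#include <cstdint>
#include <vector>

struct Section { uint64_t size = 0; };

template <class Derived> struct BPOrdererBase {
  // Shared ordering logic lives in the base; port-specific hooks are
  // resolved statically through Derived, so no per-section or per-symbol
  // wrapper objects need to be allocated.
  uint64_t totalSize(const std::vector<Section *> &sections) const {
    uint64_t sum = 0;
    for (const Section *sec : sections)
      sum += Derived::getSize(*sec);
    return sum;
  }
};

struct MachOOrderer : BPOrdererBase<MachOOrderer> {
  static uint64_t getSize(const Section &sec) { return sec.size; }
};
```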
Pull Request: https://github.com/llvm/llvm-project/pull/124482
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
This patch migrates uses of PointerUnion::dyn_cast to
dyn_cast_if_present (see the definition of PointerUnion::dyn_cast).
Note that we cannot use dyn_cast in any of the migrations in this
patch; placing
assert(!X.isNull());
just before any of the dyn_cast_if_present calls in this patch triggers
failures in check-lld.
The symbol string table does not have deduplication.
Here we add code to deduplicate the symbol string table.
This has a rather large size impact (20-30%) on unstripped binaries
(typically debug binaries) but no size impact on stripped
binaries (typically release binaries).
We enable deduplication by default and add a flag to disable it
(`-no-deduplicate-symbol-strings`).
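A minimal sketch of the deduplication idea, assuming a simple offset map keyed by `CachedHashStringRef` (the actual patch may structure this differently):
```
// Assumed shape of the deduplicating string table; offsets are returned so
// duplicate names share one entry.
#include "llvm/ADT/CachedHashString.h"
#include "llvm/ADT/DenseMap.h"
#include <vector>

struct StringTable {
  std::vector<char> data{'\0'}; // offset 0 reserved for the empty string
  llvm::DenseMap<llvm::CachedHashStringRef, uint32_t> offsets;

  // Note: the map keys reference the caller's string storage, which must
  // outlive the table (true for names backed by mmap'd inputs).
  uint32_t add(llvm::StringRef s) {
    auto [it, inserted] =
        offsets.try_emplace(llvm::CachedHashStringRef(s), data.size());
    if (inserted) { // first occurrence: append the string and a terminator
      data.insert(data.end(), s.begin(), s.end());
      data.push_back('\0');
    }
    return it->second; // duplicates reuse the existing offset
  }
};
```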
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses cast
because we expect isa<Symbol *>(rel.referent) to be true.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses cast
because we expect isa<InputSection *>(reloc.referent) to be true.
Adding the `safe_thunks` option in `Options.td` as it was missing there
- mentioned by @Colibrow in
https://github.com/llvm/llvm-project/pull/106573
Also documenting what the various options mean.
Help now looks like this:
```
..........
--error-limit=<value> Maximum number of errors to print before exiting (default: 20)
--help-hidden Display help for hidden options
--icf=[none,safe,safe_thunks,all]
Set level for identical code folding (default: none). Possible values:
none - Disable ICF
safe - Only folds non-address significant functions (as described by `__addrsig` section)
safe_thunks - Like safe, but replaces address-significant functions with thunks
all - Fold all identical functions
--ignore-auto-link-option=<value>
Ignore a single auto-linked library or framework. Useful to ignore invalid options that ld64 ignores
--irpgo-profile-sort=<profile>
Deprecated. Please use --irpgo-profile and --bp-startup-sort=function
..........
```
The older xxHash (XXH64) is inferior to xxh3 and therefore discouraged;
we try not to use it in lld.
Switch to read32le for content hash and xxh3/stable_hash_combine for
relocation hash. Remove the intermediate std::string for relocation
hash.
Change the tail hashing scheme to consider individual bytes instead.
This helps group 0102 and 0201 together. The benefit is negligible,
though.
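A hedged sketch of the hashing style described above; the real BPSectionOrderer hashes sliding windows of the section data and more relocation fields, and `stable_hash_combine` is itself xxh3-based:
```
// Illustrative only; not the actual BPSectionOrderer code.
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StableHashing.h"
#include "llvm/Support/Endian.h"

using namespace llvm;

// Content hashes: little-endian 32-bit reads over sliding windows, with the
// tail hashed byte by byte so that e.g. tails 01 02 and 02 01 still share
// hash values.
SmallVector<uint64_t> hashContent(ArrayRef<uint8_t> data) {
  SmallVector<uint64_t> hashes;
  for (size_t i = 0; i + 4 <= data.size(); ++i)
    hashes.push_back(support::endian::read32le(data.data() + i));
  for (size_t i = data.size() < 4 ? 0 : data.size() - 3; i < data.size(); ++i)
    hashes.push_back(data[i]);
  return hashes;
}

// Relocation hash: combine the fields directly (stable_hash_combine is
// xxh3-based) instead of formatting them into a temporary std::string.
uint64_t hashReloc(uint8_t type, uint64_t addend, uint64_t referentHash) {
  return stable_hash_combine({uint64_t(type), addend, referentHash});
}
```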
Pull Request: https://github.com/llvm/llvm-project/pull/121729
This commit improves the memory efficiency of the lld-macho linker by
optimizing how thunks are printed in the map file. Previously, merging
vectors of input sections required creating a temporary vector, which
increased memory usage and in some cases caused the linker to run out of
memory as reported in comments on
https://github.com/llvm/llvm-project/pull/120496. The new approach
interleaves the printing of two arrays of ConcatInputSection in sorted
order without allocating additional memory for a merged array.
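A simplified sketch of the interleaved traversal; the real code compares output addresses of `ConcatInputSection`s and prints map-file entries rather than calling a generic callback:
```
// Simplified; Section and print stand in for ConcatInputSection and the
// map-file writer.
#include <cstdint>
#include <vector>

struct Section { uint64_t addr; };

template <class Print>
void printInterleaved(const std::vector<Section *> &a,
                      const std::vector<Section *> &b, Print print) {
  // Both inputs are already sorted by address, so walk them with two
  // cursors instead of materializing a merged temporary vector.
  size_t i = 0, j = 0;
  while (i < a.size() && j < b.size())
    print(a[i]->addr <= b[j]->addr ? a[i++] : b[j++]);
  while (i < a.size())
    print(a[i++]);
  while (j < b.size())
    print(b[j++]);
}
```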
--order_file, call graph profile, and BalancedPartitioning currently
build the section order vector by decreasing priority (from SIZE_MAX to
0). However, it's conventional to use an increasing key (see
OutputSection::inputOrder).
Switch to increasing priorities, remove the global variable
highestAvailablePriority, and remove the highestAvailablePriority
parameter from BPSectionOrderer. Change size_t to int.
This improves consistency with the ELF and COFF ports. The ELF port
utilizes negative priorities for --symbol-ordering-file and call graph
profile, and non-negative priorities for --shuffle-sections (no Mach-O
counterpart yet).
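A minimal illustration of the new convention (names assumed):
```
// Names assumed for illustration.
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"

struct InputSection;

llvm::DenseMap<const InputSection *, int>
buildOrder(llvm::ArrayRef<const InputSection *> orderedSections) {
  llvm::DenseMap<const InputSection *, int> sectionPriorities;
  int priority = 0; // increasing key, matching OutputSection::inputOrder
  for (const InputSection *isec : orderedSections)
    sectionPriorities[isec] = priority++;
  return sectionPriorities;
}
```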
Pull Request: https://github.com/llvm/llvm-project/pull/121727
This patch improves the linker’s ability to estimate stub reachability
in the `TextOutputSection::estimateStubsInRangeVA` function. It does so
by including thunks that have already been placed ahead of the current
call site address when calculating the threshold for direct stub calls.
Before this fix, the estimation process overlooked existing forward
thunks. This could result in some thunks not being inserted where
needed. In rare situations, particularly with large and specially
arranged codebases, this might lead to branch instructions being out of
range, causing linking errors.
Although this patch successfully addresses the problem, it is not
feasible to create a test for this issue. The specific layout and order
of thunk creation required to reproduce the corner case are too complex,
making test creation impractical.
Example error messages the issue could generate:
```
ld64.lld: error: banana.o:(symbol OUTLINED_FUNCTION_24949_3875): relocation BRANCH26 is out of range: 134547892 is not in [-134217728, 134217727]; references objc_autoreleaseReturnValue
ld64.lld: error: main.o:(symbol _main+0xc): relocation BRANCH26 is out of range: 134544132 is not in [-134217728, 134217727]; references objc_release
```
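For context, a rough sketch of the reachability check behind these errors: an arm64 `BRANCH26` displacement must fit in a signed 28-bit byte offset, i.e. roughly ±128 MiB (the estimation logic itself is more involved than this):
```
// Illustrative range check only; names are assumed.
#include <cstdint>

constexpr int64_t branchRange = 134217728; // 2^27 bytes, i.e. 128 MiB

bool isDirectlyReachable(uint64_t branchVA, uint64_t targetVA) {
  int64_t delta = int64_t(targetVA) - int64_t(branchVA);
  // BRANCH26 encodes a signed 26-bit instruction offset, giving the
  // [-134217728, 134217727] byte range seen in the errors above.
  return delta >= -branchRange && delta < branchRange;
}
```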
This patch extends the MachO linker's map file generation to include
branch extension thunk symbols. Previously, thunks were omitted from the
map file, making it difficult to understand the final layout of the
binary, especially when debugging issues related to long branch thunks.
This change ensures thunks are included and correctly interleaved with
other symbols based on their address, providing an accurate
representation of the linked output.
The existing comparison does not insert symbols in the intended place.
Closes #120559.
---------
Co-authored-by: Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
Apologies for the large change, I looked for ways to break this up and
all of the ones I saw added real complexity. This change focuses on the
option's prefixed names and the array of prefixes. These are present in
every option and the dominant source of dynamic relocations for PIE or
PIC users of LLVM and Clang tooling. In some cases, 100s or 1000s of
them for the Clang driver which has a huge number of options.
This PR addresses this by building a string table and a prefixes table
that can be referenced with indices rather than pointers that require
dynamic relocations. This removes almost 7k dynamic relocations from the
`clang` binary, roughly 8% of the remaining dynamic relocations outside
of vtables. For busy-boxing use cases where many different option tables
are linked into the same binary, the savings add up a bit more.
The string table is a straightforward mechanism, but the prefixes
required some subtlety. They are encoded in a Pascal-string fashion with
a size followed by a sequence of offsets. This works relatively well for
the small realistic prefixes arrays in use.
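A hedged sketch of the encoding idea; the actual tables are emitted by TableGen and the types in llvm/Option differ:
```
// Illustrative encoding only; the real llvm/Option types differ.
#include <string_view>

// All prefix spellings live in one string table and are referenced by
// offset, so the data needs no dynamic relocations.
constexpr char strTable[] = "\0-\0--\0/\0";

// Each prefixes entry is Pascal-string shaped: a count followed by that
// many offsets into the string table.
constexpr unsigned prefixesTable[] = {
    2, 1, 3, // entry 0: prefixes "-" and "--"
    1, 6,    // entry 1: prefix "/"
};

constexpr std::string_view getPrefix(unsigned offset) {
  return std::string_view(strTable + offset);
}
```
An option record can then store an index into the prefixes table instead of a pointer to an array of C strings.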
Lots of code has to change in order to land this, though: the option
library code itself has to be updated to use the string table and
prefixes table, and all users of the options library have to be updated
to instantiate the objects correctly.
Some follow-up patches are in the works to provide an abstraction for
this style of code, and to start using the same technique for some of
the other strings here now that the infrastructure is in place.
Add new CLI options for feature parity with ELF w.r.t pass plugins.
Most of the changes are ported directly from
0c86198b27.
With this change, it is now possible to load and run external pass
plugins during the LTO phase.
ld64.lld would previously allow you to link against dylibs linked with
`-allowable_client`, even if the client's name does not match any
allowed client.
This change fixes that. See #114146 for related discussion.
The test binary `liballowable_client.dylib` was created on macOS with:
echo | clang -xc - -dynamiclib -mmacosx-version-min=10.11 -arch x86_64
-Wl,-allowable_client,allowed -o lib/liballowable_client.dylib
Currently when `--icf=safe_thunks` is used, `STABS` entries cannot be
generated for ICF'ed functions. This is because if ICF converts a full
function into a thunk and then we generate a `STABS` entry for the
thunk, `dsymutil` will expect to find the entire function body at the
location of the thunk. Since only a thunk is actually present at the
location of the `STABS` entry, `dsymutil` would generate invalid debug
info in such scenarios.
With this change, if `--icf=safe_thunks` is used and `--keep-icf-stabs`
is also specified, STABS entries will be created for all functions, even
merged ones. However, the STABS entries will point at the actual (full)
function body while having the name of the thunk. This way we still get
program correctness as well as correct DWARF data. When doing this, the
debug data will be identical to the scenario where we're using
`--icf=all` and `--keep-icf-stabs`, but the actual program will also
contain thunks, which won't show up in the DWARF data.
For COFF and ELF, which are mostly free of global state, lld::errs()
and lld::outs() should not be used. This migration allows us to remove
lld::errs, which uses the global errorHandler().
This patch enhances the robustness of lld's Objective-C category
merging. Currently, the category merger assumes it can fully parse and
understand the format of all categories in the input, triggering an
assert if any invalid category data is encountered.
This will end up causing asserts in certain rare corner cases that are
difficult to reproduce in small test cases. The proposed changes modify
the behavior so that if invalid category data is detected, category
merging is skipped for that specific class and all other categories
sharing the same base class. This approach allows the linker to continue
processing other categories without failing entirely due to a single
problematic input.
We also add a LIT test in which we corrupt category data and check that
category merging for that class is skipped but the link still succeeds.
In `--icf=safe_thunks` mode, the linker differentiates `keepUnique`
functions by creating thunks during a post-processing step after
Identical Code Folding (ICF). While this ensures that `keepUnique`
functions themselves are not incorrectly merged, it overlooks functions
that reference these `keepUnique` symbols.
If two functions are identical except for references to different
`keepUnique` functions, the current ICF algorithm incorrectly considers
them identical because it doesn't account for the future differentiation
introduced by thunks. This leads to incorrect deduplication of functions
that should remain distinct.
To address this issue, we modify the ICF comparison to explicitly check
for references to `keepUnique` functions during deduplication. By doing
so, functions that reference different `keepUnique` symbols are
correctly identified as distinct, preventing erroneous merging and
ensuring the correctness of the linked output.
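A simplified sketch of the extra check; real lld compares relocations pairwise inside its ICF equality routines, and the names here are assumed:
```
// Names assumed; real lld folds this into its relocation comparison.
#include "llvm/ADT/ArrayRef.h"

struct Defined {
  bool keepUnique = false; // address-significant under safe_thunks
};

struct Reloc {
  Defined *referent = nullptr;
};

// Two otherwise-identical functions must not fold if they reference
// different keepUnique symbols: those references will later be rewritten
// to point at distinct thunks.
bool relocsFoldable(llvm::ArrayRef<Reloc> a, llvm::ArrayRef<Reloc> b) {
  if (a.size() != b.size())
    return false;
  for (size_t i = 0; i < a.size(); ++i) {
    const Defined *da = a[i].referent, *db = b[i].referent;
    if ((da && da->keepUnique) || (db && db->keepUnique))
      if (da != db)
        return false;
  }
  return true;
}
```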
We've noticed that for large builds executing thin-link can take on the
order of 10s of minutes. We are only using a single thread to write the
sharded indices and import files for each input bitcode file. While we
need to ensure the index file produced lists modules in a deterministic
order, that doesn't prevent us from executing the rest of the work in
parallel.
In this change we use a thread pool to execute as much of the backend's
work as possible in parallel. In local testing on a machine with 80
cores, this change makes a thin-link for ~100,000 input files run in ~2
minutes. Without this change it takes upwards of 10 minutes.
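A hedged sketch of the approach, using `llvm::DefaultThreadPool`; module handling and the deterministic index ordering are simplified:
```
// Sketch only; the real change lives in the ThinLTO index-writing backend.
#include "llvm/Support/ThreadPool.h"
#include <string>
#include <vector>

void emitThinLinkOutputs(const std::vector<std::string> &modules) {
  // The module order recorded in the index stays deterministic because it
  // is derived from the input order; only the per-module work (writing the
  // sharded index and import file) runs in parallel.
  llvm::DefaultThreadPool pool;
  for (const std::string &mod : modules)
    pool.async([&mod] {
      // ... write the sharded index and import file for `mod` ...
      (void)mod;
    });
  pool.wait();
}
```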
---------
Co-authored-by: Nuri Amari <nuriamari@fb.com>
There is a bug in the current implementation of `--icf=safe_thunks`
where a STABS entry is emitted for generated thunks. This is
problematic because it produces invalid DWARF: dsymutil will think the
entire function body is at the thunk location, when in actuality only a
single branch instruction is present there. This ends up causing
overlapping DWARF entries.
To fix this we never generate STABS entries for such thunks.
The existing `--icf=safe_thunks` test is updated to also generate debug
info and we add a check that no corrupt DWARF is generated.
As a future TODO we need to make `--keep-icf-stabs` compatible with
`--icf=safe_thunks`.
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Strip calls to raw_string_ostream::str(), to avoid an excess layer of indirection.
Under the Microsoft ABI, only those bit fields can be merged whose
underlying types have the same size.
d175616 (`[lld-macho][arm64] Enhance safe ICF with thunk-based
deduplication`) added an enum field (`identicalCodeFoldingKind`) next to
booleans in the `Defined` class, which increased the size under the MS
ABI. On MinGW targets, this triggered the `static_assert` which checks
the size of `Defined` (for MSVC targets, the check is disabled due to
another problem). Let's store it as a `uint8_t` to allow merging to take
place.
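An illustrative reduction of the layout issue (not the actual `Defined` declaration):
```
// Not the actual Defined declaration; this just shows the packing rule.
#include <cstdint>

enum ICFKind { None, Body, Thunk }; // underlying type defaults to int

struct Grows {
  bool a : 1;
  bool b : 1;
  ICFKind kind : 2; // MS ABI: int-sized, cannot share storage with the bools
};

struct Packs {
  bool a : 1;
  bool b : 1;
  uint8_t kind : 2; // same size as bool, so all three share one byte
};

static_assert(sizeof(Packs) <= sizeof(Grows),
              "storing the enum as uint8_t never increases the size");
```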
Fixes #107511
Currently, our `safe` ICF mode only merges non-address-significant code,
leaving duplicate address-significant functions in the output. This
patch introduces `safe_thunks` ICF mode, which keeps a single master
copy of each function and replaces address-significant duplicates with
thunks that branch to the master copy.
Currently `--icf=safe_thunks` is only supported for `arm64`
architectures.
**Perf stats for a large binary:**
| ICF Option          | Total Size | __text Size | __unwind_info Size | Total size reduction |
|---------------------|------------|-------------|--------------------|----------------------|
| `--icf=none`        | 91.738 MB  | 55.220 MB   | 1.424 MB           | 0%                   |
| `--icf=safe`        | 85.042 MB  | 49.572 MB   | 1.168 MB           | 7.30%                |
| `--icf=safe_thunks` | 84.650 MB  | 49.219 MB   | 1.143 MB           | 7.72%                |
| `--icf=all`         | 82.060 MB  | 48.726 MB   | 1.111 MB           | 10.55%               |
So overall we can expect a `~0.45%` binary size reduction for a typical
large binary compared to the `--icf=safe` option.
**Runtime:**
Linking the above binary took ~10 seconds. Comparing the link
performance of --icf=safe_thunks vs --icf=safe, a ~2% slowdown was
observed.
Refactor some code in `BPSectionOrderer.cpp` in preparation for
https://github.com/llvm/llvm-project/pull/107348.
* Rename `constructNodesForCompression()` -> `getUnsForCompression()`
and return a `SmallVector` directly rather than populating a vector
alias
* Pass `duplicateSectionIdxs` as a pointer to make it possible to skip
finding (nearly) duplicate sections
* Combine `duplicate{Function,Data}SectionIdxs` into one variable
* Compute all `BPFunctionNode` vectors at the end (like
`nodesForStartup`)
There should be no functional change.
...including for catalyst.
The use case for this is to put certain security-critical variables into
a special segment/section that is mapped as read-only most of the time,
and that temporarily gets remapped as writable when these variables are
written to by the program. This protects against them being written to
by heap spraying attacks. This special section should be mapped as
read-only at program start, so using
`-segprot MY_PROTECTED_MEMORY_THINGER rw r`
to mark that segment as rw maxprot and r initprot is exactly what we
want.
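A hedged sketch of the intended usage pattern; the segment/section and variable names are made up (and the segment name shortened to fit Mach-O's 16-character limit), and error handling is omitted:
```
// Segment/section and variable names are made up; error handling omitted.
// Linked with: -segprot MY_PROTECTED rw r
#include <sys/mman.h>
#include <unistd.h>
#include <cstdint>

__attribute__((section("MY_PROTECTED,__my_vars")))
uint64_t protectedValue = 0;

void writeProtectedValue(uint64_t v) {
  // The segment is r (initprot) at startup; rw maxprot lets us temporarily
  // remap it writable, update the variable, and then re-seal it.
  uintptr_t page = reinterpret_cast<uintptr_t>(&protectedValue) &
                   ~uintptr_t(getpagesize() - 1);
  mprotect(reinterpret_cast<void *>(page), getpagesize(),
           PROT_READ | PROT_WRITE);
  protectedValue = v;
  mprotect(reinterpret_cast<void *>(page), getpagesize(), PROT_READ);
}
```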
lld has so far rejected mismatching initprot and maxprot.
ld64 doesn't reject this, but silently writes initprot into both fields
(!) It looks like this might not be fully intentional, see
https://crbug.com/41495919#comment5 and
http://crbug.com/41495919#comment8.
In any case, when postprocessing ld64's output to have different values
for initprot and maxprot, the dynamic loader seems to do the right thing
(see also the previous two links).
The same technique also works on Windows, with both link.exe and
lld-link.exe, using `/SECTION:myprotsect,R`.
So, since this is useful, allow it when targeting macOS, and make it do
what you'd expect.
Since loader support for this on iOS is less clear, keep disallowing it
there for now.
See the PR for the program I used to check that this seems to work. (I
only checked on arm64 macOS 14.5 so far; will run this on many more
systems on bots once this is merged and rolled in.)
The only instance where we weren't already passing a `StringRef` with a
known length to `Symbol`'s constructor is where the argument is a string
literal. Even in that case, lazy `strlen` calls don't make sense, as the
compiler can constant-evaluate the `StringRef(const char*)` constructor.
For symbols that go into the symbol table we need the length when
calculating the hash anyway. We could get away with not calling
`getName()` for local symbols, but the total contribution of `strlen` to
the run time is already below 1%, so that would just complicate the code
for a negligible benefit.
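A tiny illustration of the point, with a simplified `Symbol` (the real class stores the name differently):
```
// Simplified; the real Symbol stores its name differently.
#include "llvm/ADT/StringRef.h"

struct Symbol {
  explicit Symbol(llvm::StringRef name) : name(name) {}
  llvm::StringRef name; // length captured eagerly; reused when hashing
};

// Passing a literal costs no runtime strlen: the compiler can
// constant-evaluate the length in StringRef's const char * constructor.
Symbol makeSyntheticSymbol() { return Symbol("<synthetic>"); }
```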