llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 20:36:05 +00:00

Author	SHA1	Message	Date
Amir Ayupov	6735ce9d25	[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 ) Fix the bug where merge-fdata unconditionally outputs boltedcollection line, regardless of whether input files have it set. Test Plan: Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this fix.	2024-01-18 20:00:47 -08:00
Amir Ayupov	9fec33aadc	Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 )" This reverts commit 82bc33ea3f1a539be50ed46919dc53fc6b685da9. Accidentally pushed unrelated changes.	2024-01-18 19:59:09 -08:00
Amir Ayupov	82bc33ea3f	[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 ) Fix the bug where merge-fdata unconditionally outputs boltedcollection line, regardless of whether input files have it set. Test Plan: Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this fix.	2024-01-18 19:44:16 -08:00
Amir Ayupov	dcba077146	[BOLT] Embed cold mapping info into function entry in BAT (#76903 ) Reduces BAT section size: - large binary: to 12283500 bytes (0.32x original size), - medium binary: to 1616020 bytes (0.27x original size), - small binary: to 404 bytes (0.28x original size). Test Plan: Updated bolt/test/X86/bolt-address-translation.test	2024-01-12 13:02:32 -08:00
Amir Ayupov	8fb8ad66c9	[BOLT] Delta-encode function start addresses in BAT (#76902 ) Further reduce the size of BAT section: - large binary: to 12716312 bytes (0.33x original), - medium binary: to 1649472 bytes (0.28x original), - small binary: to 428 bytes (0.30x original). Test Plan: Updated bolt/test/X86/bolt-address-translation.test	2024-01-11 14:35:37 -08:00
Amir Ayupov	bbe07989d7	[BOLT] Delta-encode offsets in BAT (#76900 ) This change further reduces the size of BAT: - large binary: to 13073904 bytes (0.34x original), - medium binary: to 1703116 bytes (0.29x original), - small binary: to 436 bytes (0.30x original). Test Plan: Updated bolt/test/X86/bolt-address-translation.test	2024-01-11 14:29:46 -08:00
Amir Ayupov	565f40d66b	[BOLT] Encode BAT using ULEB128 (#76899 ) Reduces BAT section size, bytes: - large binary: 38676872 -> 23262524 (0.60x), - medium binary (trunk clang): 5938004 -> 3213504 (0.54x), - small binary (X86/bolt-address-translation.test): 1436 -> 680 (0.47x). Test Plan: Updated bolt/test/X86/bolt-address-translation.test	2024-01-11 12:16:30 -08:00
Amir Ayupov	2bb511e277	[BOLT][NFC] Print BAT section size (#76897 ) Test Plan: Updated bolt/test/X86/bolt-address-translation.test	2024-01-11 11:04:04 -08:00
Min-Yih Hsu	23e03a85dc	[BOLT] Update test case after #77253 PR #77253 removed the '@plt' suffix from callee symbols. Update RISCV/relax.s accordingly.	2024-01-08 11:05:38 -08:00
ShatianWang	1577483413	[BOLT] Don't split likely fallthrough in CDSplit (#76164 ) This diff speeds up CDSplit by not considering any hot-warm splitting point that could break a fall-through branch from a basic block to its most likely successor. Co-authored-by: spupyrev <spupyrev@fb.com>	2023-12-21 16:17:10 -05:00
Jon Roelofs	d6f772074c	fixup! fixup! [GlobalISel] Always direct-call IFuncs and Aliases (#74902 ) Apparently some BOLT bots build with a pre-installed system clang, and others use the just-built one. These two clangs now behave slightly differently when it comes to ifunc codegen after https://github.com/llvm/llvm-project/pull/74902 Change the test to accept both patterns.	2023-12-15 12:48:11 -07:00
Jon Roelofs	3017adb37e	fixup! [GlobalISel] Always direct-call IFuncs and Aliases (#74902 ) The codegen change broke one of the BOLT tests.	2023-12-15 12:17:07 -07:00
Wang Yaduo	c532ba4edd	[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053 ) Enable the llvm-objdump to disassemble the immediate of RISCV instruction in hexadecimal format with --print-imm-hex flag.	2023-12-14 22:42:11 -08:00
Vitaly Buka	fc3adf74d3	Revert "[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format" (#75561 ) Reverts llvm/llvm-project#74053 Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39291 Co-authored-by: Wang Yaduo <wangyaduo@linux.alibaba.com> Issue #75563	2023-12-14 22:05:47 -08:00
Wang Yaduo	3dde0d0256	[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053 ) Enable the llvm-objdump to disassemble the immediate of RISCV instruction in hexadecimal format with --print-imm-hex flag.	2023-12-15 10:13:20 +08:00
Alexander Yermolovich	bf2b035e58	[BOLT][DWARF] Fix handling .debug_str_offsets for type units (#75522 ) There was an assumption that TUs and CUs share a .debug_str_offsets contribution. For ThinLTO builds this is not the case. Changed so that we also parse contributions for TUs, and did some refactoring so that we don't re-parse contributions that were not modified.	2023-12-14 17:27:21 -08:00
Rafael Auler	a26aa79a3b	[BOLT] Fix some dwarf tests affected by 75095 (#75327 ) PR 75095 introduced some changes to lld that broke some dwarf tests that were being incorrectly linked as a PIE. Add flags to disable any PIC/PIE compilation, so the linker can succeed and the tests can run as intended.	2023-12-13 06:11:15 -08:00
Alexander Yermolovich	fb9a851224	[BOLT][DWARF] Fix handling of debug_str_offsets (#75100 ) We were not setting size field of .debug_str_offsets correctly. Fixed it, and added a test.	2023-12-11 15:56:32 -08:00
Amir Ayupov	b039ccc684	[BOLT] Provide backwards compatibility for YAML profile with std::hash (#74253 ) Provide backwards compatibility for YAML profile that uses `std::hash`: xxh3 hash is the default for newly produced profile (sets `std-hash: false`), whereas the profile that doesn't specify `std-hash` will be treated as `std-hash: true`, preserving old behavior.	2023-12-11 12:27:32 -08:00
sinan	b304873134	[BOLT] Fix a wrong compiler option in test (#74420 ) -nopie is an option for OpenBSD, and other Linux distributions might report an `unsupported option '-nopie' for target` error.	2023-12-06 17:16:48 +08:00
eleviant	f20af7372f	[bolt] Support arm64 FP register spills (#73021 ) At the moment llvm-bolt fails when analyzing jump tables on aarch64 in case FP register spill/reload is used.	2023-12-05 20:32:58 +01:00
ShatianWang	4483cf2d8b	[BOLT] CDSplit main logic part 2/2 (#74032 ) This diff implements the main splitting logic of CDSplit. CDSplit processes functions in a binary in parallel. For each function BF, it assumes that all other functions are hot-cold split. For each possible hot-warm split point of BF, it computes its corresponding SplitScore, and chooses the split point with the best SplitScore. The SplitScore of each split point is computed in the following way: each call edge or jump edge has an edge score that is proportional to its execution count, and inversely proportional to its distance. The SplitScore of a split point is a sum of edge scores over a fixed set of edges whose distance can change due to hot-warm splitting BF. This set contains all cover calls in the form of X->Y or Y->X given function order [... X ... BF ... Y ...]; we refer to the sum of edge scores over the set of cover calls as CoverCallScore. This set also contains all jump edges (branches) within BF as well as all call edges originated from BF; we refer to the sum of edge scores over this set of edges as LocalScore. CDSplit finds the split index maximizing CoverCallScore + LocalScore.	2023-11-30 23:17:11 -05:00
Alexander Yermolovich	52be47b890	[BOLT][DWARF] Add support to create path (#73884 ) When option --dwarf-output-path is specified, if the path does not exist BOLT will now create it. This is what also happens when --plugin-opt=dwo_dir=<value> is specified to LLD.	2023-11-30 09:41:01 -08:00
Maksim Panchenko	0acfe8483a	[BOLT][DWARF] Fix output ranges for deleted code (#73464 ) Set range low_pc to 0 for DIEs that correspond to deleted code. Fixes #73428	2023-11-28 22:40:53 -08:00
Alexander Yermolovich	b47b3bee7b	[BOLT][DWARF] Fix handling of DWARF5 DWP (#72729 ) Fixed handling of DWP as input. Before BOLT crashed. Now it will write out correct CU, and all the TUs. Potential future improvement is to scan all the TUs used in this CU, and only include those.	2023-11-28 15:54:14 -08:00
Amir Ayupov	af4d8d5af6	[BOLT][test] Update perf2bolt/perf_test.test (#73482 )	2023-11-28 07:00:07 -08:00
spupyrev	e7dd596c68	[BOLT] Use deterministic xxh3 for computing BF/BB hashes (#72542 ) std::hash and ADT/Hashing::hash_value are non-deterministic functions whose results might vary across implementation/process/execution. Using xxh3 instead for computing hashes of BinaryFunctions and BinaryBasicBlock for stale profile matching. (A possible alternative is to use ADT/StableHashing.h based on FNV hashing but xxh3 seems to be more popular in LLVM) This is to address https://github.com/llvm/llvm-project/issues/65241.	2023-11-27 14:45:46 -08:00
Amir Ayupov	ab14eb23b6	[BOLT][test] Replace /dev/null with temp file (#73485 ) NFC processing time script identifies tests by output filename. When `/dev/null` is used as output filename, we're unable to tell the source test, and the reports are unhelpful. Replace `/dev/null/` with `%t.null` which resolves the issue.	2023-11-27 10:53:18 -08:00
ShatianWang	d333c0e062	[BOLT] Extend calculateEmittedSize() for block size calculation (#73076 ) This commit modifies BinaryContext::calculateEmittedSize() to update the BinaryBasicBlock::OutputAddressRange of each basic block in the function in place. BinaryBasicBlock::getOutputSize() now gives the emitted size of the basic block.	2023-11-23 15:28:31 -05:00
Maksim Panchenko	84602066a6	[BOLT] Fix C++ exceptions when LPStart is specified (#72737 ) Whenever LPStartEncoding was different from DW_EH_PE_omit, we used to miscalculate LPStart. As a result, landing pads were assigned wrong addresses. Fix that.	2023-11-20 20:55:38 -08:00
Maksim Panchenko	445f6f1373	[BOLT][TEST] Remove LTO flag from a test (#72896 ) The LTO flag is not needed for the test to work properly. However, it may not build on a system where compiler and linker versions don't match one another. Remove the LTO flag.	2023-11-20 10:24:34 -08:00
JohnLee1243	ae51ec84bb	[Bolt] Solving pie support issue (#65494 ) PIE is supported by default after clang 14. This causes a parsing error when using perf2bolt, because the base address cannot be obtained correctly. Fix the method of getting the base address. If SegInfo.Alignment is not equal to the page size, alignDown(SegInfo.FileOffset, SegInfo.Alignment) may not equal FileOffset. So SegInfo.FileOffset and FileOffset should first be aligned by SegInfo.Alignment and then compared for equality. The .text segment's offset from the base address in the VAS is aligned by the page size. So MMapAddress's offset from the base address is alignDown(SegInfo.Address, pagesize) instead of alignDown(SegInfo.Address, SegInfo.Alignment), and the base address calculation is changed accordingly. Co-authored-by: Li Zhuohang <lizhuohang3@huawei.com>	2023-11-16 15:05:06 +08:00
Vladislav Khmelevsky	c5a306f07e	[BOLT] Fix LSDA section handling (#71821 ) Currently BOLT finds the LSDA section by its name .gcc_except_table.main . But sometimes it might have a suffix, e.g. .gcc_except_table.main. Find the LSDA section by its address rather than by its name. Fixes #71804	2023-11-15 23:21:50 +04:00
Maksim Panchenko	f633f325a1	[BOLT] Fix NOP instruction emission on x86 (#72186 ) Use MCAsmBackend::writeNopData() interface to emit NOP instructions on x86. There are multiple forms of NOP instruction on x86 with different sizes. Currently, LLVM's assembly/disassembly does not support all forms correctly which can lead to a breakage of input code semantics, e.g. if the program relies on NOP instructions for reserving a patch space. Add "--keep-nops" option to preserve NOP instructions.	2023-11-13 18:12:39 -08:00
Alexander Yermolovich	ce17c6d3ba	[BOLT][DWARF] Fix --dwarf-output-path (#71886 ) Fixed a bug where, when --dwarf-output-path is specified and DW_AT_dwo_name contains part of the path, the output path would contain both. This led to an llvm-bolt crash, because the path didn't exist. Example: llvm-bolt .... --dwarf-output-path=/some/path/ DW_AT_dwo_name ("objects/o1/split.dwo") It would try to write the .dwo file to /some/path/objects/o1/split.dwo.dwo instead of to /some/path/split.dwo.dwo	2023-11-10 13:18:57 -08:00
Vladislav Khmelevsky	cf18f142c0	[BOLT] Read .rela.dyn in static non-pie binary (#71635 ) Static non-pie binary doesn't have DYNAMIC segment and BOLT skips reading .rela.dyn section because of it. But such binaries might have this section for example to store IFUNC relocation which is resolved by linked-in startup files, so force reading this section for static executables.	2023-11-10 11:47:12 +04:00
Vladislav Khmelevsky	abec50cb93	[BOLT][AArch64] Fix strict usage during ADR Relax (#71377 ) Currently strict mode is used to expand number of optimized functions, not to shrink it. Revert the option usage in the pass, so passing strict option would relax adr instruction even if there are no nops around it. Also add check for nop after adr instruction.	2023-11-10 11:46:36 +04:00
spaette	1a2f83366b	[BOLT] Fix typos (#68121 ) Closes https://github.com/llvm/llvm-project/issues/63097 Before merging please make sure the change to bolt/include/bolt/Passes/StokeInfo.h is correct. bolt/include/bolt/Passes/StokeInfo.h ```diff // This Pass solves the two major problems to use the Stoke program without - // proting its code: + // probing its code: ``` I'm still not happy about the awkward wording in this comment. bolt/include/bolt/Passes/FixRelaxationPass.h ``` $ ed -s bolt/include/bolt/Passes/FixRelaxationPass.h <<<'9,12p' // This file declares the FixRelaxations class, which locates instructions with // wrong targets and fixes them. Such problems usually occures when linker // relaxes (changes) instructions, but doesn't fix relocations types properly // for them. $ ``` bolt/docs/doxygen.cfg.in bolt/include/bolt/Core/BinaryContext.h bolt/include/bolt/Core/BinaryFunction.h bolt/include/bolt/Core/BinarySection.h bolt/include/bolt/Core/DebugData.h bolt/include/bolt/Core/DynoStats.h bolt/include/bolt/Core/Exceptions.h bolt/include/bolt/Core/MCPlusBuilder.h bolt/include/bolt/Core/Relocation.h bolt/include/bolt/Passes/FixRelaxationPass.h bolt/include/bolt/Passes/InstrumentationSummary.h bolt/include/bolt/Passes/ReorderAlgorithm.h bolt/include/bolt/Passes/StackReachingUses.h bolt/include/bolt/Passes/StokeInfo.h bolt/include/bolt/Passes/TailDuplication.h bolt/include/bolt/Profile/DataAggregator.h bolt/include/bolt/Profile/DataReader.h bolt/lib/Core/BinaryContext.cpp bolt/lib/Core/BinarySection.cpp bolt/lib/Core/DebugData.cpp bolt/lib/Core/DynoStats.cpp bolt/lib/Core/Relocation.cpp bolt/lib/Passes/Instrumentation.cpp bolt/lib/Passes/JTFootprintReduction.cpp bolt/lib/Passes/ReorderData.cpp bolt/lib/Passes/RetpolineInsertion.cpp bolt/lib/Passes/ShrinkWrapping.cpp bolt/lib/Passes/TailDuplication.cpp bolt/lib/Rewrite/BoltDiff.cpp bolt/lib/Rewrite/DWARFRewriter.cpp bolt/lib/Rewrite/RewriteInstance.cpp bolt/lib/Utils/CommandLineOpts.cpp bolt/runtime/instr.cpp 
bolt/test/AArch64/got-ld64-relaxation.test bolt/test/AArch64/unmarked-data.test bolt/test/X86/Inputs/dwarf5-cu-no-debug-addr-helper.s bolt/test/X86/Inputs/linenumber.cpp bolt/test/X86/double-jump.test bolt/test/X86/dwarf5-call-pc-function-null-check.test bolt/test/X86/dwarf5-split-dwarf4-monolithic.test bolt/test/X86/dynrelocs.s bolt/test/X86/fallthrough-to-noop.test bolt/test/X86/tail-duplication-cache.s bolt/test/runtime/X86/instrumentation-ind-calls.s	2023-11-09 11:29:46 -08:00
Maksim Panchenko	11f52f783a	[BOLT][DWARF] Fix invalid address ranges (#71474 ) When NOP instructions are removed by BOLT and a DWARF address range falls past the removed instructions, it may lead to invalid DWARF ranges in the output binary. E.g. the range may fall outside of the basic block boundaries. This fix makes sure the modified range fits within the containing basic block. A proper fix requires tracking instructions within the block and will come in a different PR.	2023-11-09 09:55:49 -08:00
Job Noorman	c4b096a343	[BOLT] Fix typo in test	2023-11-09 09:14:27 +01:00
Rafael Auler	4c9f6d6f02	[BOLT][AArch64] Fix ifuncs test header inclusion (#71741 ) Summary: Do not include stdlib headers as these tests are built with -nostdlib. Tests outside of runtime folder also run cross-platforms, so an x86 machine wouldn't have access to the correct headers used in the aarch64 toolchain, even if it has an aarch64 compiler (clang itself).	2023-11-08 16:42:21 -08:00
Job Noorman	96b5e092dc	[BOLT] Support instrumentation hook via DT_FINI_ARRAY (#67348 ) BOLT currently hooks its instrumentation finalization function via `DT_FINI`. However, this method of calling finalization routines is not supported anymore on newer ABIs like RISC-V. `DT_FINI_ARRAY` is preferred there. This patch adds support for hooking into `DT_FINI_ARRAY` instead if the binary does not have a `DT_FINI` entry. If it does, `DT_FINI` takes precedence so this patch should not change how the currently supported instrumentation targets behave. `DT_FINI_ARRAY` points to an array in memory of `DT_FINI_ARRAYSZ` bytes. It consists of pointer-length entries that contain the addresses of finalization functions. However, the addresses are only filled in by the dynamic linker at load time using relative relocations. This makes hooking via `DT_FINI_ARRAY` a bit more complicated than via `DT_FINI`. The implementation works as follows: - While scanning the binary: find the section where `DT_FINI_ARRAY` points to, read its first dynamic relocation and use its addend to find the address of the fini function we will use to hook; - While writing the output file: overwrite the addend of the dynamic relocation with the address of the runtime library's fini function. Updating the dynamic relocation required a bit of boilerplate: since dynamic relocations are stored in a `std::multiset` which doesn't support getting mutable references to its items, functions were added to `BinarySection` to take an existing relocation and insert a new one.	2023-11-08 11:01:10 +00:00
Vladislav Khmelevsky	e2f1a95f2a	[BOLT][AArch64] Handle IFUNCS properly (#71104 ) Previously we tested only binaries compiled with O0, which results in an indirect call to the IFUNC trampoline, and the trampoline has an IFUNC symbol associated with it. Compiling with O3 results in a direct call to the IFUNC trampoline with no symbol associated with it; the IFUNC symbol address becomes the same as the IFUNC resolver address. Since no symbol was associated, the BF was not created before PLT analysis, and by the algorithm we go on to analyze the target relocation. As we expect a JUMP relocation, we also expect its associated symbol to be present. But for IFUNC the IRELATIVE relocation is used and no symbol is associated with it; the addend value points to the target symbol, so we need to find the BF using the addend and use its symbol in this situation. Currently this is checked only for the AArch64 platform, so I've limited this logic in code to that platform, although I wouldn't be surprised if other platforms need to activate this logic too.	2023-11-08 11:41:43 +04:00
Vladislav Khmelevsky	485075c095	[BOLT][AArch64] Don't change layout in PatchEntries (#71278 ) Because the LongJmp pass is executed before PatchEntries, we can't ignore the function here, since that would change the pre-calculated output layout. The test reloc-26 relied on the wrong behavior and was rewritten as a unit test. This is also an attempt to fix #70771	2023-11-08 11:38:46 +04:00
maksfb	7f031d1c7c	[BOLT] Fix address mapping for ICP code (#70136 ) When we create new code for indirect code promotion optimization, we should mark it as originating from the indirect jump instruction for BOLT address translation (BAT) to map it to the original instruction.	2023-11-06 11:25:49 -08:00
J. Ryan Stinnett	d5e33cc147	[DebugInfo] Use human-friendly printing for DWARF column attributes (#71062 )	2023-11-04 17:08:42 +00:00
Vladislav Khmelevsky	888742a121	[BOLT][AArch64] Handle .plt.got section (#71216 ) It seems that currently this section is only created by the mold linker if 2 conditions are met: 1. The PLT function was called directly. 2. The indirect access to PLT function was found (e.g. through ADRP relocation). Although mold created symbol for every plt entry I've removed them in yaml file to check that .plt.got was truly disassembled by bolt.	2023-11-04 00:47:24 +04:00
maksfb	8244ff6739	[BOLT] Fix incorrect basic block output addresses (#70000 ) Some optimization passes may duplicate basic blocks and assign the same input offset to a number of different blocks in a function. This is done e.g. to correctly map debugging ranges for duplicated code. However, duplicate input offsets present a problem when we use AddressMap to generate new addresses for basic blocks. The output address is calculated based on the input offset and will be the same for blocks with identical offsets. The result is potentially incorrect debug info and BAT records. To address the issue, we have to eliminate the dependency on input offsets while generating output addresses for a basic block. Each block has a unique label, hence we extend AddressMap to include address lookup based on MCSymbol and use the new functionality to update block addresses.	2023-10-24 12:22:43 -07:00
Job Noorman	b6b492880f	[BOLT][RISCV] Set minimum function alignment to 2 for RVC (#69837 ) In #67707, the minimum function alignment on RISC-V was set to 4. When RVC (compressed instructions) is enabled, the minimum alignment can be reduced to 2. This patch implements this by delegating the choice of minimum alignment to a new `MCPlusBuilder::getMinFunctionAlignment` function. This way, the target-dependent code in `BinaryFunction` is minimized.	2023-10-23 08:09:11 +00:00
Job Noorman	86bc486785	[BOLT][RISCV] Use target features from object file (#69836 ) We used to hard-code target features for RISC-V. However, most features (with the exception of relax) are stored in the object file. This patch extracts those features to ensure BOLT's output doesn't use any features not present in the input file.	2023-10-23 06:40:25 +00:00
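
The BAT size reductions reported in the commits above (ULEB128 encoding in #76899, delta-encoded offsets and function start addresses in #76900 and #76902) rest on two generic techniques that are easy to illustrate in isolation. The following is a minimal Python sketch of ULEB128 and delta encoding over a sorted address list. It is not BOLT's implementation and does not reproduce the actual BAT section layout; the function names and the sample addresses are hypothetical.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_uleb128(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one ULEB128 value at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_sorted(addresses: list[int]) -> bytes:
    """Delta-encode an ascending list: store each gap to the previous value."""
    out = bytearray()
    prev = 0
    for addr in addresses:
        out += encode_uleb128(addr - prev)  # gaps are small, so few bytes
        prev = addr
    return bytes(out)


def decode_sorted(data: bytes) -> list[int]:
    """Inverse of encode_sorted: accumulate the gaps back into addresses."""
    result = []
    pos = 0
    prev = 0
    while pos < len(data):
        delta, pos = decode_uleb128(data, pos)
        prev += delta
        result.append(prev)
    return result


# Hypothetical sample: three nearby addresses.
sample = [0x401000, 0x401010, 0x401040]
packed = encode_sorted(sample)
assert decode_sorted(packed) == sample
```

With fixed 8-byte fields the three sample addresses would occupy 24 bytes; here the first address takes 4 bytes and each subsequent gap takes 1 byte. The same effect, small gaps between neighbouring values encoding into few bytes, is what makes delta encoding pay off on top of a variable-length integer format.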

453 Commits