llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-17 02:06:33 +00:00

Author	SHA1	Message	Date
chrisPyr	038fff3f24	[NFC][BOLT] Make file-local cl::opt global variables static (#126472 ) #125983	2025-03-05 22:11:05 -08:00
Nikita Popov	e235fcb582	[BOLT] Only link and initialize supported targets (#127509 ) Bolt currently links and initializes all LLVM targets. This substantially increases the binary size, and link time if LTO is used. Instead, only link the targets specified by BOLT_TARGETS_TO_BUILD. We also have to only initialize those targets, so generate a TargetConfig.def file with the necessary information. The way the initialization is done mirrors what llvm-exegesis does. This reduces llvm-bolt size from 137MB to 78MB for me.	2025-02-18 09:17:51 +01:00
Nikita Popov	0abe058d7f	[BOLT] Use getMainExecutable() (#126698 ) Use LLVM's getMainExecutable() helper instead of rolling our own. This will result in standard behavior across platforms, such as making sure that symlinks are always resolved.	2025-02-12 09:44:26 +01:00
Amir Ayupov	8652608404	[BOLT] Fix counts aggregation in merge-fdata (#119652 ) merge-fdata used to consider misprediction count as part of "signature", or the aggregation key. This prevented it from collapsing profile lines with different misprediction counts, which resulted in duplicate `(from, to)` pairs with different misprediction and execution counts. Fix that by splitting out misprediction count and accumulating it separately. Test Plan: updated bolt/test/merge-fdata-lbr-mode.test	2024-12-14 22:38:24 -08:00
Amir Ayupov	97f43364cc	[BOLT][NFC] Speedup merge-fdata (#119942 ) Eliminate splitting the buffer into lines, and use `std::getline` directly. Simplify no_lbr and boltedcollection handling as well. Test Plan: For a large fdata file (200MB), fstream version is ~10% faster.	2024-12-14 22:26:20 -08:00
Tibor Dusnoki	5225f1b435	[BOLT][merge-fdata] Fix basic sample profile aggregation without LBR info (#118481 ) When a basic sample profile is gathered without LBR info, the generated profile contains a "no-lbr" tag in the first line of the fdata file. This PR fixes merge-fdata to recognize and save this tag to the output file.	2024-12-13 16:28:37 +00:00
Kristof Beyls	ceb7214be0	[BOLT] Introduce binary analysis tool based on BOLT (#115330 ) This initial commit does not add any specific binary analyses yet, it merely contains the boilerplate to introduce a new BOLT-based tool. This basically combines the 4 first patches from the prototype pac-ret and stack-clash binary analyzer discussed in RFC https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148 and published at https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype The introduction of such a BOLT-based binary analysis tool was proposed and discussed in at least the following places: - The RFC pointed to above - EuroLLVM 2024 round table https://discourse.llvm.org/t/summary-of-bolt-as-a-binary-analysis-tool-round-table-at-eurollvm/78441 The round table showed quite a few people interested in being able to build a custom binary analysis quickly with a tool like this. - Also at the US LLVM dev meeting a few weeks ago, I heard interest from a few people, asking when the tool would be available upstream. - The presentation "Adding Pointer Authentication ABI support for your ELF platform" (https://llvm.swoogo.com/2024devmtg/session/2512720/adding-pointer-authentication-abi-support-for-your-elf-platform) explicitly mentioned interest to extend the prototype tool to verify correct implementation of pauthabi.	2024-12-12 10:06:27 +00:00
Amir Ayupov	6ee5ff95ab	[BOLT] Add profile density computation Reuse the definition of profile density from llvm-profgen (#92144): - the density is computed in perf2bolt using raw samples (perf.data or pre-aggregated data), - function density is the ratio of dynamically executed function bytes to the static function size in bytes, - profile density: - functions are sorted by density in decreasing order, accumulating their respective sample counts, - profile density is the smallest density covering 99% of total sample count. In other words, BOLT binary profile density is the minimum amount of profile information per function (excluding functions in tail 1% sample count) which is sufficient to optimize the binary well. The density threshold of 60 was determined through experiments with large binaries by reducing the sample count and checking resulting profile density and performance. The threshold is conservative. perf2bolt would print the warning if the density is below the threshold and suggest to increase the sampling duration and/or frequency to reach a given density, e.g.: ``` BOLT-WARNING: BOLT is estimated to optimize better with 2.8x more samples. ``` Test Plan: updated pre-aggregated-perf.test Reviewers: maksfb, wlei-llvm, rafaelauler, ayermolo, dcci, WenleiHe Reviewed By: WenleiHe, wlei-llvm Pull Request: https://github.com/llvm/llvm-project/pull/101094	2024-10-24 18:30:59 -07:00
Amir Ayupov	3c4f00905e	[BOLT] Support perf2bolt-N in the driver Check invoked tool with `starts_with`. Addresses the issue where `perf2bolt` invoked using a distro symlink `perf2bolt-16` fails to run in perf2bolt mode and runs in llvm-bolt mode instead. The issue is mentioned in https://vondra.me/posts/playing-with-bolt-and-postgres/ Test Plan: ``` ln -sf perf2bolt perf2bolt-20 perf2bolt-20 clang -p perf.data -o fdata.clang -w yaml.clang ... PERF2BOLT: wrote 188593 objects and 0 memory objects to fdata.clang ``` Reviewers: ayermolo, rafaelauler, dcci, maksfb Reviewed By: maksfb Pull Request: https://github.com/llvm/llvm-project/pull/111072	2024-10-14 10:17:31 -07:00
Maksim Panchenko	6fb39ac77b	[BOLT][merge-fdata] Initialize YAML profile header (#109613 ) While merging profiles, some fields in the input header, e.g. HashFunction, could be uninitialized leading to a UMR. Initialize merged header with the first input header. Fixes #109592	2024-09-25 23:18:34 +02:00
Amir Ayupov	15fa3ba547	[BOLT][YAML] Allow unknown keys in the input (#100824 ) This ensures forward compatibility, where old BOLT versions can consume the profile created by newer versions with extra keys. Test Plan: added yaml-unknown-keys.test	2024-09-03 11:27:57 -07:00
Amir Ayupov	fd38366e45	[BOLT][NFC] Clean includes, add license headers (#87200 )	2024-03-31 19:29:45 -07:00
Mehdi Amini	716042a63f	Rename llvm::ThreadPool -> llvm::DefaultThreadPool (NFC) (#83702 ) The base class llvm::ThreadPoolInterface will be renamed llvm::ThreadPool in a subsequent commit. This is a breaking change: clients who used to create a ThreadPool must now create a DefaultThreadPool instead.	2024-03-05 18:00:46 -08:00
Mehdi Amini	744616b3ae	Rename `ThreadPool::getThreadCount()` to `getMaxConcurrency()` (NFC) (#82296 ) This is addressing a long-time TODO to rename this misleading API. The old one is preserved for now but marked deprecated.	2024-02-19 18:07:12 -08:00
Amir Ayupov	52cf07116b	[BOLT][NFC] Log through JournalingStreams (#81524 ) Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:53:53 -08:00
Amir Ayupov	6735ce9d25	[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 ) Fix the bug where merge-fdata unconditionally outputs boltedcollection line, regardless of whether input files have it set. Test Plan: Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this fix.	2024-01-18 20:00:47 -08:00
Amir Ayupov	9fec33aadc	Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 )" This reverts commit 82bc33ea3f1a539be50ed46919dc53fc6b685da9. Accidentally pushed unrelated changes.	2024-01-18 19:59:09 -08:00
Amir Ayupov	82bc33ea3f	[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 ) Fix the bug where merge-fdata unconditionally outputs boltedcollection line, regardless of whether input files have it set. Test Plan: Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this fix.	2024-01-18 19:44:16 -08:00
Kazu Hirata	6da4a7a8e2	[BOLT] Use SmallString::operator std::string (NFC)	2024-01-15 21:59:06 -08:00
Kazu Hirata	ad8fd5b185	[BOLT] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 23:34:49 -08:00
Petr Hosek	f3269a94e7	[BOLT][CMake] Redo the build and install targets The existing BOLT install targets are broken on Windows because they don't properly handle the output extension. We cannot use the existing LLVM macros since those make assumptions that don't hold for BOLT. This change instead implements custom macros following the approach used by Clang and LLD. Differential Revision: https://reviews.llvm.org/D151595	2023-06-01 14:48:01 +00:00
Petr Hosek	1d6a2c5357	Revert "[BOLT][CMake] Redo the build and install targets" This reverts commit f99a7d3e38095cfdaf7e729289a8894dd31c7efa since it broke the bolt-aarch64-ubuntu-clang-shared bot.	2023-06-01 08:03:50 +00:00
Petr Hosek	f99a7d3e38	[BOLT][CMake] Redo the build and install targets The existing BOLT install targets are broken on Windows because they don't properly handle the output extension. We cannot use the existing LLVM macros since those make assumptions that don't hold for BOLT. This change instead implements custom macros following the approach used by Clang and LLD. Differential Revision: https://reviews.llvm.org/D151595	2023-06-01 06:01:39 +00:00
Petr Hosek	99a1aeefb3	Revert "[BOLT][CMake] Use LLVM macros for install targets" This reverts commit 627d5e16127bd8034b893e66ab0c86eacf2d939a.	2023-05-30 19:28:14 +00:00
Petr Hosek	627d5e1612	[BOLT][CMake] Use LLVM macros for install targets The existing BOLT install targets are broken on Windows because they don't properly handle the output extension. Rather than reimplementing this logic in BOLT, reuse the existing LLVM macros which already handle this aspect correctly. Differential Revision: https://reviews.llvm.org/D151595	2023-05-30 19:23:11 +00:00
Yi Kong	67cf01bd37	Reland^2 "[BOLT] Parallelize legacy profile merging" Resolved the issue where, when the number of tasks is fewer than the number of cores, we end up creating as many threads as there are cores, making the performance worse than the single-thread version.	2023-05-22 13:37:41 -07:00
Yi Kong	65404e51bf	Revert "Reland "[BOLT] Parallelize legacy profile merging"" This reverts commit 611fb179b19857ffb87df81c926902fc7e3412ab. Broken tests	2023-05-18 16:26:43 -07:00
Yi Kong	611fb179b1	Reland "[BOLT] Parallelize legacy profile merging" This reverts commit 78d8d016490909ac759c6f76c5f8679bc7a58b2e.	2023-05-18 16:06:46 -07:00
Yi Kong	78d8d01649	Revert "[BOLT] Parallelize legacy profile merging" This reverts commit 35af20d9e036deeed250b73fd3ae86d6455173c5. The patch caused a test failure.	2023-04-28 21:24:52 +09:00
Yi Kong	35af20d9e0	[BOLT] Parallelize legacy profile merging Merging profiles is quite expensive, but easily parallelizable. 8359 profiles on n2d-standard-128: single-thread: 808s multi-thread: 200s (~75% speed up) Differential Revision: https://reviews.llvm.org/D149014	2023-04-27 15:37:14 +09:00
Yi Kong	d788db3d19	[BOLT][NFC] Simplify code using std::optional Use std::optional instead of tracking if it is the first profile seen. Differential Revision: https://reviews.llvm.org/D147308	2023-04-01 13:47:36 +08:00
Amir Ayupov	16492a6143	[BOLT][NFC] Rename {MachO,}RewriteInstance::create methods Follow the code style of fallible constructors in [LLVM Programmer's Manual] (https://llvm.org/docs/ProgrammersManual.html#fallible-constructors) and rename `RewriteInstance::createRewriteInstance` to `RewriteInstance::create` Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D143119	2023-02-02 12:30:45 -08:00
Amir Ayupov	72e5b14fe7	[BOLT][NFC] Use llvm::make_second_range Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D143019	2023-02-02 12:02:31 -08:00
serge-sans-paille	984b800a03	Move from llvm::makeArrayRef to ArrayRef deduction guides - last part This is a follow-up to https://reviews.llvm.org/D140896, split into several parts as it touches a lot of files. Differential Revision: https://reviews.llvm.org/D141298	2023-01-10 11:47:43 +01:00
Amir Ayupov	be08bb7755	[BOLT][CMake] Add merge-fdata to bolt component Build and install `merge-fdata` tool as part of `bolt` component: ``` $ ninja bolt # builds llvm-bolt, perf2bolt and merge-fdata $ cmake --install . --component bolt --prefix $HOME/test-install-bolt -- Install configuration: "Release" -- Install configuration: "Release" -- Installing: /home/aaupov/test-install-bolt/lib/libbolt_rt_instr.a -- Installing: /home/aaupov/test-install-bolt/lib/libbolt_rt_hugify.a -- Installing: /home/aaupov/test-install-bolt/lib/libbolt_rt_instr_osx.a -- Installing: /home/aaupov/test-install-bolt/bin/llvm-bolt -- Installing: /home/aaupov/test-install-bolt/bin/perf2bolt -- Installing: /home/aaupov/test-install-bolt/bin/llvm-boltdiff -- Installing: /home/aaupov/test-install-bolt/bin/merge-fdata ``` Fixes #57249. Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D139972	2023-01-03 17:40:36 -08:00
serge-sans-paille	61cff9079c	[BOLT] Support building bolt when LLVM_LINK_LLVM_DYLIB is ON This does not link with libLLVM, but with static archives instead. Not super-great, but at least the build works, which is probably better than failing. Related to #57551 Differential Revision: https://reviews.llvm.org/D134434	2022-09-23 07:59:30 +02:00
serge-sans-paille	9029ed2e4b	[BOLT] Fix (part of) dylib compatibility Non-LLVM components should not be listed as part of LLVM_LINK_COMPONENTS. Differential Revision: https://reviews.llvm.org/D134278	2022-09-22 10:41:40 +02:00
serge-sans-paille	3ca61941c1	Revert "[bolt] Fix (part of) dylib compatibility" This reverts commit 34ad83d883cc4505412a7c3e1e3da74e5408aa82.	2022-09-22 10:41:21 +02:00
serge-sans-paille	34ad83d883	[bolt] Fix (part of) dylib compatibility Non-LLVM component should not be listed as part of LLVM_LINK_COMPONENTS Differential Revision: https://reviews.llvm.org/D134278	2022-09-22 10:32:40 +02:00
Nicolai Hähnle	f7872cdce1	CommandLine: add and use cl::SubCommand::get{All,TopLevel} Prefer using these accessors to access the special sub-commands corresponding to the top-level (no subcommand) and all sub-commands. This is a preparatory step towards removing the use of ManagedStatic: with a subsequent change, these global instances will be moved to be regular function-scope statics. It is split up to give downstream projects a (albeit short) window in which they can switch to using the accessors in a forward-compatible way. Differential Revision: https://reviews.llvm.org/D129118	2022-08-02 23:49:16 +02:00
Rafael Auler	fc0ced73dc	Add BAT testing framework This patch refactors BAT to be testable as a library, so we can have open-source tests on it. This further fixes an issue with basic blocks that lack a valid input offset, making BAT omit those when writing translation tables. Test Plan: new testcases added, new testing tool added (llvm-bat-dump) Differential Revision: https://reviews.llvm.org/D129382	2022-07-29 14:55:04 -07:00
John Ericson	07b749800c	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in d0e1c2a550ef348aae036d0fe78cab6f038c420c to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the exported variable and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-07-21 19:04:00 +00:00
Amir Ayupov	d2c8769936	[BOLT][NFC] Use range-based STL wrappers Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128154	2022-06-23 22:16:27 -07:00
John Ericson	0bb317b7bf	Revert "[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore" This reverts commit d5daa5c5b091cafb9b7ffd19b5dfa2daadef3229.	2022-06-10 19:26:12 +00:00
John Ericson	d5daa5c5b0	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in d0e1c2a550ef348aae036d0fe78cab6f038c420c to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the exported variable and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-06-10 14:35:18 +00:00
Yi Kong	716d428ab5	[BOLT] Add `-o` option to merge-fdata Differential Revision: https://reviews.llvm.org/D126788	2022-06-02 01:29:04 +08:00
Yi Kong	2a42f7f72a	[BOLT] Allow merge-fdata to take a directory as input and recursively merge all files under said directory. This is similar to `llvm-profdata merge`. Differential Revision: https://reviews.llvm.org/D126695	2022-06-01 03:01:14 +08:00
Yi Kong	97715104c5	[BOLT][NFC] Don't over-specify the size of SmallVector This is the recommended way, should make merging profiles ever so slightly faster.	2022-05-31 16:16:38 +08:00
Rafael Auler	2fdc5d336e	[BOLT] Fix merge-fdata handling of BAT profiles When a profile is collected in a BOLTed binary, the generated profile is tagged with a header string "boltedcollection" in the first line of the fdata file. Fix merge-fdata to recognize this header string and preserve it into the output. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D125591	2022-05-13 19:41:55 -07:00
Amir Ayupov	bdba3d091c	[BOLT][CMAKE] Fix DYLIB build Move BOLT libraries out of `LLVM_LINK_COMPONENTS` to `target_link_libraries`. Addresses issue #55432. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125568	2022-05-13 13:27:21 -07:00


68 Commits