llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-26 03:26:06 +00:00

Author	SHA1	Message	Date
Haohai Wen	3f7f446d38	[llvm-profgen] Remove temporary perf script files (#86668) The temporary perf script files converted from perf data occupy a lot of space for large projects. This patch removes them when llvm-profgen exits normally or receives signals.	2024-04-11 15:28:32 +08:00
Haohai Wen	8c03f400a8	[llvm-profgen] Support COFF binary (#83972) Intel VTune/SEP supports collecting LBR on Windows and generating a perf-script file in the same format as Linux perf script. This patch teaches llvm-profgen to disassemble COFF binaries so that we can do sampling-based PGO on Windows.	2024-03-15 09:02:26 +08:00
Matthias Braun	8466ab98ca	llvm-profgen: Fix race condition (#83489) Fix race condition when multiple instances of `llvm-profgen` read from the same inputs.	2024-02-29 14:53:11 -08:00
Lei Wang	24f025175d	[llvm-profgen] Filter out ambiguous cold profiles during profile generation (#81803) For the built-in local initialization functions (`__cxx_global_var_init` and `__tls_init` prefixes), there can be multiple versions of the function in the final binary. For example, `__cxx_global_var_init` is a wrapper for global variable ctors, and the compiler can assign suffixes like `__cxx_global_var_init.N` for different ctors. However, during profile generation we call `getCanonicalFnName` to canonicalize the names, which strips the suffixes. Therefore, samples from different functions query the same profile (only `__cxx_global_var_init`) and the counts are merged. As the functions are essentially different, entries of the merged profile are ambiguous. In sample loading, the IR of each version of this function would be attributed to the merged entries, which is inaccurate. This is especially harmful for fuzzy profile matching: it gets multiple callsites (from different functions) but uses them to match one callsite, which misleads the matching and reports a lot of false positives. Hence, we want to filter them out of the profile map at profile generation time. The profiles are all for cold functions, so this won't have a perf impact.	2024-02-16 14:29:24 -08:00
Nathan Lanza	7ff2dc3b49	[profgen] Use a 64bit integer for &'ing the loadable address (#79930) For the Linux kernel, the loadable segments start at 0xffff... and thus the 32-bit integer here was truncating all the meaningful bits. Grow it to 64 bits.	2024-01-30 13:10:22 -05:00
Benjamin Kramer	9423e45987	[ProfileData] Copy CallTargetMaps a bit less. NFCI	2023-12-24 17:48:18 +01:00
Kazu Hirata	586ecdf205	[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-11 21:01:36 -08:00
Kazu Hirata	92c2529ccd	[llvm] Stop including vector (NFC) Identified with clangd.	2023-12-03 22:32:21 -08:00
Kazu Hirata	a7627f721f	[llvm] Stop including list (NFC) Identified with clangd.	2023-12-02 10:47:26 -08:00
Hongtao Yu	7a3db658d9	[llvm-profgen] More tweaks to warnings (#68608) Tweaking warnings more to avoid flooding the user log.	2023-10-22 17:00:14 -07:00
William Junda Huang	ef0e0adccd	[llvm-profdata] Do not create numerical strings for MD5 function names read from a Sample Profile. (#66164) This is phase 2 of the MD5 refactoring on Sample Profile following https://reviews.llvm.org/D147740 In the previous implementation, when an MD5 Sample Profile is read, the reader first converts the MD5 values to strings and then creates a StringRef as if the numerical strings were regular function names; later on, IPO transformation passes perform string comparison over these numerical strings for profile matching. This is inefficient since it causes many small heap allocations. In this patch I created a class `ProfileFuncRef` that is similar to `StringRef` but can represent a hash value directly without any conversion, and it will be more efficient (I will attach some benchmark results later) when used in associative containers. ProfileFuncRef guarantees that the same function name in string form or in MD5 form has the same hash value, which also fixes a few issues in IPO passes where function matching/lookup only checks the function name string and returns a no-match if the profile is MD5. When testing on an internal large profile (> 1 GB, with more than 10 million functions), the full profile load time is reduced from 28 sec to 25 sec on average, and reading the function offset table from 0.78s to 0.7s.	2023-10-17 21:09:39 +00:00
Hongtao Yu	967767830d	[llvm-profgen] Print DWP related warnings under show-detailed-warning (#68019) Printing DWP related warnings under show-detailed-warning so that they won't flood the user log.	2023-10-03 22:23:12 -07:00
Tom Stellard	e7247f1010	[profiling] Move option declarations into headers This will make it possible to add visibility attributes to these variables. This also fixes some type mismatches between the declaration and the definition. Reviewed By: bogner, huangjd Differential Revision: https://reviews.llvm.org/D156599	2023-09-30 18:51:28 -07:00
Hongtao Yu	47669af47f	[llvm-profgen] Ignore inline frames with an empty function name (#66678) Broken debug information can give empty names for an inlined frame, e.g., ``` 0x1d605c68: ryKeyINS7_17SmartCounterTypesEEESt10shared_ptrINS7_15AsyncCacheValueIS9_EEESaIhESt6atomicEEE9fetch_subElSt12memory_order at Filename: edata.h Function start filename: edata.h Function start line: 266 Function start address: 0x1d605c68 Line: 267 Column: 0 (inlined by) at Filename: edata.h Function start filename: edata.h Function start line: 274 Function start address: 0x1d605c68 Line: 275 Column: 0 (inlined by) _EEEmmEv at Filename: arena.c Function start filename: arena.c Function start line: 1303 Line: 1308 Column: 0 ``` This patch avoids creating a sample context with an empty function name by stopping tracking at that frame. This prevents a hash failure that leads to an ICE, where an empty context serves as an empty key for the underlying MapVector `7624de5bea/llvm/lib/ProfileData/SampleProfWriter.cpp (L261)`	2023-09-18 12:40:06 -07:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Takuya Shimizu	01b88dd66d	[NFC] Remove unused variables declared in conditions D152495 makes clang warn on unused variables that are declared in conditions like `if (int var = init) {}` This patch is an NFC fix to suppress the new warning in llvm,clang,lld builds to pass CI in the above patch. Differential Revision: https://reviews.llvm.org/D158016	2023-08-30 10:05:06 +09:00
William Huang	7624de5bea	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use the MD5 hash code (instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduces the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles use MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already an MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 functions (without converting them to strings) and regular function names. (2) The SampleProfileMap is a wrapper around *map<uint64_t, FunctionSamples>, while providing an interface that allows using SampleContext as key, so that existing code still works. It will check for MD5 collision (unlikely but not too unlikely, since we only take the lower 64 bits) and handle it to at least guarantee compilation correctness (the conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5)) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensured an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile maps that are eventually used for output (as in llvm-profdata/llvm-profgen). Since the key became the MD5 hash, only the value keeps the context now. In several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context that is otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (once to index into FuncOffsetTable, once into SampleProfileMap, more if there are additional sections). In this case the SampleProfileMap is directly accessed with the MD5 value so that we don't recalculate it each time (expensive). Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740	2023-08-17 20:10:45 +00:00
Aaron Ballman	1a53b5c367	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 66ba71d913df7f7cd75e92c0c4265932b7c93292. Addressing issues found by: https://lab.llvm.org/buildbot/#/builders/245/builds/11732 https://lab.llvm.org/buildbot/#/builders/187/builds/12251 https://lab.llvm.org/buildbot/#/builders/186/builds/11099 https://lab.llvm.org/buildbot/#/builders/182/builds/6976	2023-07-28 09:41:38 -04:00
William Huang	66ba71d913	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map	2023-07-27 23:08:27 +00:00
Hongtao Yu	4bdc938ce3	[CSSPGO][Preinliner] Always inline zero-sized functions. Zero-sized functions should be cost-free in terms of size budget, so they should be considered during inlining even if we run out of size budget. This appears to give a 0.5% win for one of our internal services. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D153820	2023-06-27 17:06:24 -07:00
Haojian Wu	58056ae299	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 12e9c7aaa66b7624b5d7666ce2794d912bf9e4b7. The commit has broken the buildbot, see comment https://reviews.llvm.org/D147740#4451540	2023-06-27 15:19:35 +02:00
Hongtao Yu	5cbcaf1678	[CSSPGO][Preinliner] Bump up the threshold to favor previous compiler inline decision. The compiler has more insight and knowledge about functions based on their IR and attributes and should make a better inline decision than the offline preinliner does, which is purely based on callsite hotness and code size. Therefore I'm making changes to favor the previous compiler inline decision by bumping up the callsite allowance. This should improve the performance by more than 1% according to testing on Meta services. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D153797	2023-06-26 17:21:22 -07:00
William Huang	12e9c7aaa6	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map	2023-06-27 00:06:05 +00:00
Wenlei He	1a0d23efe1	[NFC] Generalize llvm-profgen message to cover both AutoFDO and CSSPGO Update llvm-profgen profile density message to cover both AutoFDO and CSSPGO. Differential Revision: https://reviews.llvm.org/D153730	2023-06-26 09:47:56 -07:00
Douglas Yung	c9a8a0e8a9	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 31af18bccea95fe1ae8aa2c51cf7c8e92a1c208e. This change is causing build failures on many Windows build bots: https://lab.llvm.org/buildbot/#/builders/216/builds/22833 https://lab.llvm.org/buildbot/#/builders/123/builds/19602 https://lab.llvm.org/buildbot/#/builders/172/builds/28315 https://lab.llvm.org/buildbot/#/builders/119/builds/13870 https://lab.llvm.org/buildbot/#/builders/233/builds/794 https://lab.llvm.org/buildbot/#/builders/235/builds/387 https://lab.llvm.org/buildbot/#/builders/13/builds/36921 https://lab.llvm.org/buildbot/#/builders/127/builds/50510	2023-06-23 17:58:22 -07:00
William Huang	31af18bcce	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map	2023-06-23 21:48:52 +00:00
Hongtao Yu	09742be818	[llvm-profgen] Remove target triple check to allow for more targets Llvm-profgen internally uses the llvm libraries and the MCDesc interface to do disassembling and symbolization and it never checks against target-specific instruction operators. This makes it quite transparent to targets, and a first attempt for an aarch64 binary just works. Therefore I'm removing the unnecessary triple check to unblock new targets. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D153449	2023-06-23 10:16:24 -07:00
Kazu Hirata	a3b9c1533e	[tools] Use llvm::is_contained (NFC)	2023-06-19 23:36:14 -07:00
Mark Santaniello	27c37327da	Avoid pointless canonicalize when using Dwarf names CPU profile indicated memcmp was hot due to the two rfind calls in getCanonicalFnName. If UseSymbolTable is false, we can avoid the cost entirely. For CSSPGO profiles I've measured ~5% speedup with this change. Profile similarity before/after matches 100%. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D151441	2023-05-25 08:14:11 -07:00
Hongtao Yu	345fd0c10e	[FS-AFDO] Generate pseudo-probe-based profiles with FS-discriminators. This change enables generating pseudo-probe-based FS-AFDO profiles. The change is straightforward, based on the previous change {D147651}, by just injecting FS-discriminators into various profile generation spots. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D147957	2023-05-10 11:28:54 -07:00
William Huang	d38d6ca179	[llvm-profdata] Deprecate Compact Binary Sample Profile Format Remove support for compact binary sample profile format Reviewed By: davidxl, wenlei Differential Revision: https://reviews.llvm.org/D149400	2023-05-01 17:10:08 +00:00
wlei	339b8a0019	[AutoFDO] Use flattened profiles for profile staleness metrics Previously, the profile staleness report only counted the top-level function samples in the nested profile; the samples in the inlinees were ignored. This could affect the quality of the metrics when there are heavily inlined functions. This change adds a feature to flatten the nested profile, and we're changing to use the flattened profile as the input for stale profile detection and matching. Example for profile flattening: ``` Original profile: _Z3bazi:20301:1000 1: 1000 3: 2000 5: inline1:1600 1: 600 3: inline2:500 1: 500 Flattened profile: _Z3bazi:18701:1000 1: 1000 3: 2000 5: 600 inline1:600 inline1:1100:600 1: 600 3: 500 inline2: 500 inline2:500:500 1: 500 ``` This feature could be useful for offline analysis, like understanding the hotness of each individual function. So I'm adding the support to `llvm-profdata merge` under `--gen-flattened-profile`. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D146452	2023-03-30 11:05:10 -07:00
Hongtao Yu	5c2ae37bbe	[CSSPGO][Preinliner] Trim cold call edges of the profiled call graph for a more stable profile generation. I've noticed that for some services the CSSPGO profile is less stable than the non-CS AutoFDO profile from profiling to profiling without source changes. This is manifested by comparing profile similarities. For example, in my experiments AutoFDO profiles are always 99+% similar over the same binary but different inputs (very close dynamic traffic), while CSSPGO profile similarity is around 90%. The main source of the profile instability is the top-down order computed on the profiled call graph in the llvm-profgen CS preinliner. The top-down order is used to guide the CS preinliner to pre-compute an inline decision that is later on fulfilled by the compiler. A subtle change in the top-down order from run to run could cause a different inline decision to be computed. A deeper look into the divergence of the top-down order revealed that: - The topological sorting inside one SCC isn't quite right. This is fixed by {D130717}. - The profiled call graphs of the two sides of the A/B run aren't 100% the same. The call edges in the two runs do not subsume each other, and edges that appear in both graphs may not have exactly the same weight. This is due to the dynamic nature of the graphs. However, I saw that the graphs can be made closer by removing the cold edges from them, and this bumped up the CSSPGO profile stability to the same level as the AutoFDO profile. Removing cold call edges from the dynamic call graph may have an impact on cold inlining, but so far I haven't seen any performance issues since the CS preinliner mainly targets hot callsites, and cold inlining can always be done by the compiler CGSCC inliner. Also fixing an issue where the largest weight instead of the accumulated weight for a call edge is used in the profiled call graph. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D147013	2023-03-28 16:24:38 -07:00
Fangrui Song	1e6921131a	Move global namespace cl::opt inside llvm::	2023-02-14 00:09:44 -08:00
Hongtao Yu	39eb1c6145	[CSSPGO][Preinliner] Set default value of sample-profile-inline-limit-max to 50000. The previous threshold of 3000 is too small to enable any inlining for giant functions that come in with a bigger size than that. In the real world, I've seen a big hot function with a disassembly size of 34000. Motivated by that, I'm changing the value to 50000. With the new value the allowed size growth should still be reasonable, as it is also bounded by another threshold, i.e., --sample-profile-inline-growth-limit, which defaults to 12. The new value should mostly only affect giant functions. I've seen that for several internal services the new threshold boosts performance, and it has a neutral impact on other services without hot giant functions. So far I haven't seen any performance regression with that. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D143696	2023-02-13 09:17:50 -08:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Steven Wu	516e301752	[NFC][Profile] Access profile through VirtualFileSystem Make the access to profile data go through the virtual file system so the inputs can be remapped. In the context of caching, it can make sure we capture the inputs and provide an immutable input as profile data. Reviewed By: akyrtzi, benlangmuir Differential Revision: https://reviews.llvm.org/D139052	2023-02-01 09:25:02 -08:00
Elena Lepilkina	537cdf92c4	[llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute Differential Revision: https://reviews.llvm.org/D139553	2023-01-16 16:57:55 +03:00
Archibald Elliott	f09cf34d00	[Support] Move TargetParsers to new component This is a fairly large changeset, but it can be broken into a few pieces: - `llvm/Support/TargetParser` are all moved from the LLVM Support component into a new LLVM Component called "TargetParser". This potentially enables using tablegen to maintain this information, as is shown in https://reviews.llvm.org/D137517. This cannot currently be done, as llvm-tblgen relies on LLVM's Support component. - This also moves two files from Support which use and depend on information in the TargetParser: - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting the current Host machine for info about it, primarily to support getting the host triple, but also for `-mcpu=native` support in e.g. Clang. This is fairly tightly intertwined with the information in `X86TargetParser.h`, so keeping them in the same component makes sense. - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains the target triple parser and representation. This is very intertwined with the Arm target parser, because the arm architecture version appears in canonical triples on arm platforms. - I moved the relevant unittests to their own directory. And so, we end up with a single component that has all the information about the following, which to me seems like a unified component: - Triples that LLVM Knows about - Architecture names and CPUs that LLVM knows about - CPU detection logic for LLVM Given this, I have also moved `RISCVISAInfo.h` into this component, as it seems to me to be part of that same set of functionality. If you get link errors in your components after this patch, you likely need to add TargetParser into LLVM_LINK_COMPONENTS in CMake. Differential Revision: https://reviews.llvm.org/D137838	2022-12-20 11:05:50 +00:00
David Blaikie	09e79659bf	llvm-profgen: Fix use of stats to be under LLVM_ENABLE_STATS This caused a -Wunused-variable warning in a without-assert+with-stats build (because the stats were included but their use was not). Stat use is meant to be gated by LLVM_ENABLE_STATS which can be set independently of assertions.	2022-12-18 17:46:01 +00:00
Fangrui Song	21c4dc7997	std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This fixes clang.	2022-12-17 00:42:05 +00:00
Florian Hahn	e2b9cd796b	[llvm-profgen] Fix build failure after 5d7950a403bec25e52. This fixes a build failure with libc++ (`error: no matching function for call to 'max'`)	2022-12-16 17:21:12 +00:00
Hongtao Yu	5d7950a403	[CSSPGO][llvm-profgen] Missing frame inference. This change introduces a missing frame inferrer aimed at fixing missing frames. It currently only handles missing frames due to the compiler's tail call elimination (TCE) but could also be extended to support other scenarios like frame pointer omission. When a tail called function is sampled, the caller frame will be missing from the call chain because the caller frame is reused for the callee frame. While TCE is beneficial to both perf and reducing stack overflow, a workaround made in this change aims to recover the missing frames as much as possible. The idea behind this work is to build a dynamic call graph that consists of only tail call edges constructed from LBR samples and DFS-search for a unique path for a given source frame and target frame on the graph. The unique path will be used to fill in the missing frames between the source and target. Note that only a unique path counts. Multiple paths are treated as unreachable since we don't want to overcount for any particular possible path. A switch --infer-missing-frame is introduced and defaults to on. Some testing results: - 0.4% perf win according to three internal benchmarks. - About 2/3 of the missing tail call frames can be recovered, according to an internal benchmark. - 10% more profile generation time. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D139367	2022-12-16 08:44:43 -08:00
Kazu Hirata	6eb0b0a045	Don't include Optional.h These files no longer use llvm::Optional.	2022-12-14 21:16:22 -08:00
Fangrui Song	da2f5d0a41	[tools] llvm::Optional => std::optional	2022-12-14 08:01:04 +00:00
Fangrui Song	89fab98e88	[DebugInfo] llvm::Optional => std::optional https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-05 00:09:22 +00:00
Kazu Hirata	b4482f7ca0	[tools] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 21:11:40 -08:00
Matt Arsenault	e748db0f7f	Support: Convert Program APIs to std::optional	2022-12-01 17:00:44 -05:00
Kazu Hirata	286223edc6	[llvm-profgen] Use std::optional in ProfiledBinary.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-26 19:01:24 -08:00
Fangrui Song	c2250d8bc0	[CSSPGO] Move cl::opt inside llvm:: after D100528 and D108342	2022-11-23 23:08:49 -08:00
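
The unique-path search described in 5d7950a403 (missing frame inference) can be sketched roughly as below. This is an illustration in Python, not llvm-profgen's code: the function name `find_unique_path` and the dictionary-based graph are hypothetical, and the sketch assumes the tail-call graph is given as a mapping from each caller to the functions it tail-calls. It returns the frames between source and target only when exactly one simple path exists, and treats zero or multiple paths as unreachable, as the commit message describes.

```python
from typing import Dict, List, Optional


def find_unique_path(
    tail_calls: Dict[str, List[str]], source: str, target: str
) -> Optional[List[str]]:
    """Return the frames strictly between source and target if exactly one
    simple path connects them in the tail-call graph; otherwise None."""
    found: List[List[str]] = []  # complete paths discovered so far (at most 2)

    def dfs(node: str, path: List[str], on_path: set) -> None:
        if len(found) > 1:
            return  # already ambiguous, no need to search further
        if node == target:
            found.append(list(path))
            return
        for callee in tail_calls.get(node, ()):
            if callee in on_path:
                continue  # skip cycles, only simple paths are considered
            on_path.add(callee)
            path.append(callee)
            dfs(callee, path, on_path)
            path.pop()
            on_path.remove(callee)

    dfs(source, [], {source})
    if len(found) != 1:
        return None  # unreachable or ambiguous
    return found[0][:-1]  # drop the target itself, keep the inferred frames


# Hypothetical graphs: one with a single chain, one with two competing paths.
chain = {"A": ["B"], "B": ["C"]}
diamond = {"A": ["B", "D"], "B": ["C"], "D": ["C"]}
print(find_unique_path(chain, "A", "C"))
print(find_unique_path(diamond, "A", "C"))
```

Rejecting ambiguous paths mirrors the stated goal of not overcounting samples for any one possible call chain.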


240 Commits
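
The truncation fixed in 7ff2dc3b49 can be illustrated with a small Python sketch. It assumes the mask in question is a page-alignment mask; the helper names, the 4096-byte page size, and the sample address are all hypothetical choices for illustration, not values taken from the patch. Python integers are unbounded, so the 32-bit and 64-bit widths are emulated with explicit masks.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
PAGE_SIZE = 4096  # assumed page size, for illustration only


def page_align_with_32bit_mask(addr: int) -> int:
    # ~(PAGE_SIZE - 1) evaluated in 32 bits is 0xFFFFF000. Widened to 64 bits
    # it is zero-extended, so the AND clears the upper 32 bits of the address.
    mask = ~(PAGE_SIZE - 1) & MASK32
    return addr & mask


def page_align_with_64bit_mask(addr: int) -> int:
    # Evaluated in 64 bits, the mask keeps every bit above the page offset.
    mask = ~(PAGE_SIZE - 1) & MASK64
    return addr & mask


# Hypothetical address in the upper part of the 64-bit address space.
kernel_addr = 0xFFFFFFFF81000123
print(hex(page_align_with_32bit_mask(kernel_addr)))  # 0x81000000
print(hex(page_align_with_64bit_mask(kernel_addr)))  # 0xffffffff81000000
```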