A std::vector causes a heap allocation and a memcpy of static
initialization data from the rodata section.
Use a static array instead so we can use the static data directly.
The performance of cold functions shouldn't matter too much, so if we
care about binary sizes, add an option to mark cold functions as
optsize/minsize for binary size, or optnone for compile times [1]. Clang
patch will be in a future patch.
This is intended to replace `shouldOptimizeForSize(Function&, ...)`.
We've seen multiple cases where calls to this expensive function, if not
careful, can blow up compile times. I will clean up users of that
function in a followup patch.
Initial version: https://reviews.llvm.org/D149800
[1]
https://discourse.llvm.org/t/rfc-new-feature-proposal-de-optimizing-cold-functions-using-pgo-info/56388
A colleague observes that switching the default value of
LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS to "On" hasn't flipped the value
in their CMakeCache.txt. This probably means that everyone with an
existing build tree is going to not have support built in, meaning
everyone in LLVM would need to clean+rebuild their worktree when we flip
the switch on... which doesn't sound good.
So instead, just delete the flag and everything it does, making everyone
build and run ~400 lit tests in RemoveDIs mode. None of the buildbots
have had trouble with this, so it Should Be Fine (TM).
(Sending for review as this is changing various comments, and touches
several different areas -- I don't want to get too punchy).
In Bazel, Clang current separates the clang executable into a
clang-driver library, and the actual clang executable. This allows
downstream users to make their own variations of clang, without having
to redo/maintain separate build pipelines.
This adds the same for opt for both CMake and Bazel.
This flag (--try-experimental-debuginfo-iterators) only exists for testing
purposes, to get some RUNlines running in new-debug-info mode before it's
properly supported. The flag isn't something that's going to be useful to
people using llvm 18, so hide it from the options list.
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #75380
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)"
Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #64560
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Our option to turn on the non-intrinsic form of debug-info
(`--experimental-debuginfo-iterators`) currently requires that LLVM is
built with the `LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS` cmake flag
enabled, so that some (slight) performance regressions aren't
on-by-default during the prototype/testing period. However, we still
want to be able to _optionally_ run tests, if support is built into
LLVM.
To allow optionally exercising the non-intrinsic debug-info code, this
patch adds `--try-experimental-debuginfo-iterators` to opt, which turns
the `--experimental-debuginfo-iterators` flag on if support is built in,
or leaves it off. This means we can run tests that:
* Use normal dbg.value intrinsics if there's no support, or
* Uses non-instruction DPValues if there is support.
Which means we can start getting test coverage of DPValues/RemoveDIs
behaviour, from in-tree tests, on our RemoveDIs buildbot. All the code
to do with automagically converting from one form to the other landed in
10a9e7442c4c4e.
There are many tests that specify a target triple/CPU flags but no
DataLayout which can lead to IR being generated that has unusual
behaviour. This commit attempts to use the default DataLayout based
on the relevant flags if there is no explicit override on the command
line or in the IR file.
One thing that is not currently possible to differentiate from a missing
datalayout `target datalayout = ""` in the IR file since the current
APIs don't allow detecting this case. If it is considered useful to
support this case (instead of passing "-data-layout=" on the command
line), I can change IR parsers to track whether they have seen such a
directive and change the callback type.
Differential Revision: https://reviews.llvm.org/D141060
In https://reviews.llvm.org/D125075, we switched to use
FastPreTileConfig in O0 and abandoned X86PreAMXConfigPass.
we can remove related code of X86PreAMXConfigPass safely.
This more closly matches the previous behaviour that silently ignored the
failure. We now print a warning instead of exiting.
Fixes: a8f8613dec435cfb78bf202c392f2acf150a5937
This creates a TargetMachine with the default options (from the command
line flags). This allows us to share a bit more code between tools.
Differential Revision: https://reviews.llvm.org/D141057
Implement a new pass to combine multiple image_load_2dmsaa and
2darraymsaa intrinsic calls into a single image_msaa_load if:
- they refer to the same vaddr except for sample_id,
- they use a constant sample_id and they fall into the same group,
- they have the same dmask and the number of instructions and the
number of vaddr/vdata dword transfers is reduced by the combine
This should be valid on all GFX11 but a hardware bug renders it
unworkable on GFX11.0.* so it is only enabled for GFX11.5.
Based on a patch by Rodrigo Dominguez!
Discussion about this approach: https://discourse.llvm.org/t/rfc-safer-whole-program-class-hierarchy-analysis/65144/18
When enabling WPD in an environment where native binaries are present, types we want to optimize can be derived from inside these native files and devirtualizing them can lead to correctness issues. RTTI can be used as a way to determine all such types in native files and exclude them from WPD providing a safe checked way to enable WPD.
The approach is:
1. In the linker, identify if RTTI is available for all native types. If not, under `--lto-validate-all-vtables-have-type-infos` `--lto-whole-program-visibility` is automatically disabled. This is done by examining all .symtab symbols in object files and .dynsym symbols in DSOs for vtable (_ZTV) and typeinfo (_ZTI) symbols and ensuring there's always a match for every vtable symbol.
2. During thinlink, if `--lto-validate-all-vtables-have-type-infos` is set and RTTI is available for all native types, identify all typename (_ZTS) symbols via their corresponding typeinfo (_ZTI) symbols that are used natively or outside of our summary and exclude them from WPD.
Testing:
ninja check-all
large Meta service that uses boost, glog and libstdc++.so runs successfully with WPD via --lto-whole-program-visibility. Previously, native types in boost caused incorrect devirtualization that led to crashes.
Reviewed By: MaskRay, tejohnson
Differential Revision: https://reviews.llvm.org/D155659
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted
in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.
Differential Revision: https://reviews.llvm.org/D154856
Previously the MemProf profile was expected to be in the same profile
file as a normal PGO profile, passed via the usual -fprofile-use=
option, and was matched in the same pass. To simplify profile
preparation, since the raw MemProf profile requires the binary for
symbolization and may be simpler to index separately from the raw PGO
profile, and also to enable providing a MemProf profile for a SamplePGO
build, separate out the MemProf feedback option and matching pass.
This patch adds the -fmemory-profile-use=${file} option, and the
provided file is passed down to LLVM and ultimately used in a new
MemProfUsePass which performs the matching of just the memory profile
contents of that file.
Note that a single profile file containing both normal PGO and MemProf
profile data is still supported, and the relevant profile data is
matched by the appropriate matching pass(es) based on which option(s)
the profile is provided with (the same profile file can be supplied to
both feedback options).
Differential Revision: https://reviews.llvm.org/D154856
Here's a high level summary of the changes in this patch. For more
information on rational, see the RFC.
(https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774).
- Add config parameter to LTO backend, specifying which LTO mode is
desired when using unified LTO.
- Add unified LTO flag to the summary index for efficiency. Unified
LTO modules can be detected without parsing the module.
- Make sure that the ModuleID is generated by incorporating more types
of symbols.
Differential Revision: https://reviews.llvm.org/D123803
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.
Note that I choose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.
This issues came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.
Reviewed By: MaskRay, arsenm
Differential Revision: https://reviews.llvm.org/D150549
Remove dead code related to "FPasses". This was a leftover from
commit 7a5332b9b5584ce.
Do not mention -enable-new-pm in error messages. The option does
not exist any longer.
Remove the addPass helper. Only one use remained, so we can just
"inline" it manually to keep the code related to legacy PM a bit
less spread out.
Differential Revision: https://reviews.llvm.org/D148082
This commit is removing the last pieces of AnalysisWrapper.cpp
(including the ExternalFunctionsPassedConstants pass, aka
print-externalfnconstants).
The pass only existed for the legacy PM, and it was not regression
tested. And since the pass did not force the use of the legacy pass
manager there was no simply way to run the pass nowadays, at least
not by using opt.
Differential Revision: https://reviews.llvm.org/D148081
This removed the option print-breakpoints-for-testing in opt, as well
as the related BreakpointPrinter pass.
The functionality only existed for the legacy PM, but was not verified
to be working by any test cases. And the named "llvm.dbg.sp" metadata
that the pass was looking for is not something that I really can find
any information about (unless perhaps if I dive really deep into the
commit history), so not sure exactly if this functionality has been
relevant for several years.
Differential Revision: https://reviews.llvm.org/D148080
-enable-new-pm is no longer necessary except for bugpoint. Make the name more clunky so it hopefully won't be used.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D146103
With the NPM, we're now defaulting to preserving LCSSA, so a couple
of tests have changed slightly.
Differential Revision: https://reviews.llvm.org/D140982
The forwarding header is left in place because of its use in
`polly/lib/External/isl/interface/extract_interface.cc`, but I have
added a GCC warning about the fact it is deprecated, because it is used
in `isl` from where it is included by Polly.
Make the access to profile data going through virtual file system so the
inputs can be remapped. In the context of the caching, it can make sure
we capture the inputs and provided an immutable input as profile data.
Reviewed By: akyrtzi, benlangmuir
Differential Revision: https://reviews.llvm.org/D139052
This is an issue reported inside the NewPMDriver module. Static analyzer reported that Null pointer 'P' may be dereferenced at line 371 and two more sites. Proposed change guards this use.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D142047
When opaque pointers are enabled and old IR with typed pointers is read,
the BitcodeReader automatically upgrades all typed pointers to opaque
pointers. This is a lossy conversion, i.e. when a function argument is a
pointer and unused, it’s impossible to reconstruct the original type
behind the pointer.
There are cases where the type information of pointers is needed. One is
reading DXIL, which is bitcode of old LLVM IR and makes a lot of use of
pointers in function signatures.
We’d like to keep using up-to-date llvm to read in and process DXIL, so
in the face of opaque pointers, we need some way to access the type
information of pointers from the read bitcode.
This patch allows extracting type information by supplying functions to
parseBitcodeFile that get called for each function signature or metadata
value. The function can access the type information via the reader’s
type IDs and the getTypeByID and getContainedTypeID functions.
The tests exemplarily shows how type info from pointers can be stored in
metadata for use after the BitcodeReader finished.
Differential Revision: https://reviews.llvm.org/D127728