Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
This diff splits out (from LLVMCore) IR printing passes into IRPrinter.
This structure is similar to what we already have for IRReader and
enables us to avoid circular dependencies between LLVMCore and Analysis
(this is a preparation for https://reviews.llvm.org/D137768).
The legacy interface is left unchanged, once the legacy pass manager
is removed (in the future) we will be able to clean it up further.
The bazel build configuration has been updated as well.
Test plan:
1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc
2/ bazel build --config=generic_clang @llvm-project//...
Differential revision: https://reviews.llvm.org/D138081
This reverts commit eb2a57ebc7aaad551af30462097a9e06c96db925.
llvm/include/llvm/Transforms/Instrumentation/KCFI.h including
llvm/CodeGen is a layering violation. We should use an approach where
Instrumementation/ doesn't need to include CodeGen/.
Sorry for not spotting this in the review.
The KCFI sanitizer emits "kcfi" operand bundles to indirect
call instructions, which the LLVM back-end lowers into an
architecture-specific type check with a known machine instruction
sequence. Currently, KCFI operand bundle lowering is supported only
on 64-bit X86 and AArch64 architectures.
As a lightweight forward-edge CFI implementation that doesn't
require LTO is also useful for non-Linux low-level targets on
other machine architectures, add a generic KCFI operand bundle
lowering pass that's only used when back-end lowering support is not
available and allows -fsanitize=kcfi to be enabled in Clang on all
architectures.
Reviewed By: nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D135411
The conditions for which Clang emits the `unsafe-fp-math` function
attribute has been modified as part of
`84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`.
In the backend code generators `"unsafe-fp-math"="true"` enable floating
point contraction for the whole function.
The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`
was to prevent backend code generators performing contractions when that
is not expected.
However the change is inaccurate and incomplete because it allows
`unsafe-fp-math` to be set also when only in-statement contraction is
allowed.
Consider the following example
```
float foo(float a, float b, float c) {
float tmp = a * b;
return tmp + c;
}
```
and compile it with the command line
```
clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
-O2 -mavx512f -S -o -
```
The resulting assembly has a `vfmadd213ss` instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.
The optimized IR is:
```
define float @test(float noundef %a, float noundef %b, float noundef %c) #0 {
%mul = fmul reassoc nsz arcp afn float %b, %a
%add = fadd reassoc nsz arcp afn float %mul, %c
ret float %add
}
attributes #0 = {
[...]
"no-signed-zeros-fp-math"="true"
"no-trapping-math"="true"
[...]
"unsafe-fp-math"="true"
}
```
The `"unsafe-fp-math"="true"` function attribute allows the backend code
generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`.
In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this for in-statement only contraction the generated IR
doesn't have instructions with the `contract` fast-math flag and
`llvm.fmuladd` is being used to represent contractions opportunities
that occur within a single statement.
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction
across statements is allowed.
Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see `CGFPOptionsRAII`).
For consistency `unsafe-fp-math` needs to be disabled if the contraction
mode for any scope/operation is not `fast`.
Similarly for consistency reason the initialization of `UnsafeFPMath` of
in `TargetOptions` for the backend code generation should take into
account the contraction mode as well.
Reviewed By: zahiraam
Differential Revision: https://reviews.llvm.org/D136786
This reverts commit bf8381a8bce28fc69857645cc7e84a72317e693e.
There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
This reverts commit ef9e624694c0f125c53f7d0d3472fd486bada57d
for further investigation offline.
It appears to break the buildbot
llvm-clang-x86_64-sie-ubuntu-fast.
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
Move these to the new PM if they're used there.
Part of removing the legacy pass manager for optimization pipeline.
Reland with UseNewGVN usage in clang removed.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D137915
Reverted in 98fa95492f3bbd5befdeb36c88a3ac5ef2740b4e.
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in
the previous patch in this set, into the optimisation pipeline from
clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the
main test for this patch.
Note: while clang (with the help of the declare-to-assign pass) can now emit
Assignment Tracking metadata, the llvm middle and back ends don't yet
understand it.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D132226
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in
the previous patch in this set, into the optimisation pipeline from
clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the
main test for this patch.
Note: while clang (with the help of the declare-to-assign pass) can now emit
Assignment Tracking metadata, the llvm middle and back ends don't yet
understand it.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D132226
Avoid calling getenv in the MC layer and let the clang driver do it so
that it is reflected in the command-line as an -mllvm option.
rdar://101558354
Differential Revision: https://reviews.llvm.org/D136888
Restore GlobalsAA if sanitizers inserted at early optimize callback.
The analysis can be useful for the following FunctionPassManager.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D133537
Set the EmulatedTLS option based on `Triple::hasDefaultEmulatedTLS()`
if the user didn't specify it; set `ExplicitEmulatedTLS` to true
in `llvm::TargetOptions` and set `EmulatedTLS` to Clang's
opinion of what the default or preference is.
This avoids any risk of deviance between the two.
This affects one check of `getCodeGenOpts().EmulatedTLS` in
`shouldAssumeDSOLocal` in CodeGenModule, but as that check only
is done for `TT.isWindowsGNUEnvironment()`, and
`hasDefaultEmulatedTLS()` returns false for such environments
it doesn't make any current testable difference - thus NFC.
Some mingw distributions carry a downstream patch, that enables
emulated TLS by default for mingw targets in `hasDefaultEmulatedTLS()`
- and for such cases, this patch does make a difference and fixes the
detection of emulated TLS, if it is implicitly enabled.
Differential Revision: https://reviews.llvm.org/D132916
Introduces the frontend flag -fexperimental-sanitize-metadata=, which
enables SanitizerBinaryMetadata instrumentation.
The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. The plan is to open source a stable and production
quality version of GWP-TSan. The development of which, however, requires
upstream compiler support.
[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
Until the tool has been open sourced, we mark this kind of
instrumentation as "experimental", and reserve the option to change
binary format, remove features, and similar.
Reviewed By: vitalybuka, MaskRay
Differential Revision: https://reviews.llvm.org/D130888
Debugify in OriginalDebugInfo mode, introduced with D82545,
runs only with legacy PassManager.
This patch enables this utility for the NewPM.
Differential Revision: https://reviews.llvm.org/D115351
Now that we have the sanitizer metadata that is actually on the global
variable, and now that we use debuginfo in order to do symbolization of
globals, we can delete the 'llvm.asan.globals' IR synthesis.
This patch deletes the 'location' part of the __asan_global that's
embedded in the binary as well, because it's unnecessary. This saves
about ~1.7% of the optimised non-debug with-asserts clang binary.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D127911
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
We use the `OffloadBinary` to create binary images of offloading files
and their corresonding metadata. This patch changes this to inherit from
the base `Binary` class. This allows us to create and insepect these
more generically. This patch includes all the necessary glue to
implement this as a new binary format, along with added the magic bytes
we use to distinguish the offloading binary to the `file_magic`
implementation.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D126812
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125165
Default behavior for .file directory was changed in D105856, but
ptxas (CUDA 11.5 release) refuses to parse it:
$ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll
$ ptxas debug-file-loc.s
ptxas debug-file-loc.s, line 42; fatal : Parsing error near
'"foo.h"': syntax error
Added a new field to MCAsmInfo to control default value of
UseDwarfDirectory. This value is used if -dwarf-directory command line
option is not specified.
Differential Revision: https://reviews.llvm.org/D121299
The legacy passes are deprecated now and would be removed in near
future. This patch tries to remove legacy passes in coroutines.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D123918
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
The previous patch introduced the offloading binary format so we can
store some metada along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.
Differential Revision: https://reviews.llvm.org/D122683
This removes the -flegacy-pass-manager and
-fno-experimental-new-pass-manager options, and the corresponding
support code in BackendUtil. The -fno-legacy-pass-manager and
-fexperimental-new-pass-manager options are retained as no-ops.
Differential Revision: https://reviews.llvm.org/D123609
The code to check if the regular LTO summary should be emitted and to
add the corresponding module flags was duplicated in the
'EmitAssemblyHelper::EmitAssemblyWithLegacyPassManager' and
'EmitAssemblyHelper::RunOptimizationPipeline' methods.
In order to eliminate these code duplications, the
'EmitAssemblyHelper::shouldEmitRegularLTOSummary' method has been
extracted. The method returns a bool value, the value is 'true' if the
module summary should be emitted. The patch keeps the setting of the
module flags inline.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D123026
Few times in different methods of the EmitAssemblyHelper class the following
code snippet is used to get the TargetTriple and then use it's single method
to check some conditions:
TargetTriple(TheModule->getTargetTriple())
The parsing of a target triple string is not a trivial operation and it takes
time to repeat the parsing many times in different methods of the class and
even numerous times in one method just to call a getter
(llvm::Triple(TheModule->getTargetTriple()).getVendor()), for example.
The patch extracts the TargetTriple member of the EmitAssemblyHelper class to
parse the triple only once in the class' constructor.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D122587
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.
Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.
Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be
removed in the future.
Differential Revision: https://reviews.llvm.org/D121736