812 Commits

Author SHA1 Message Date
Fangrui Song
36b4a9ccd9
[Driver,CodeGen] Support -mtls-dialect= (#79256)
GCC supports -mtls-dialect= for several architectures to select TLSDESC.
This patch supports the following values

* x86: "gnu". "gnu2" (TLSDESC) is not supported yet.
* RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915)

AArch64 toolchains seem to support TLSDESC from the beginning, and the
general dynamic model has poor support. Nobody seems to use the option
-mtls-dialect= at all, so we don't bother with it.
There also seems very little interest in AArch32's TLSDESC support.

TLSDESC does not change IR, but affects object file generation. Without
a backend option the option is a no-op for in-process ThinLTO.

There seems no motivation to have fine-grained control mixing trad/desc
for TLS, so we just pass -mllvm, and don't bother with a modules flag
metadata or function attribute.

Co-authored-by: Paul Kirth <paulkirth@google.com>
2024-01-26 09:25:38 -08:00
Paul Kirth
9d476e1e1a
[clang][FatLTO] Avoid UnifiedLTO until it can support WPD/CFI (#79061)
Currently, the UnifiedLTO pipeline seems to have trouble with several
LTO features, like SplitLTO units, which means we cannot use important
optimizations like Whole Program Devirtualization or security hardening
instrumentation like CFI.

This patch reverts FatLTO to using distinct pipelines for Full LTO and
ThinLTO. It still avoids module cloning, since that was error prone.
2024-01-23 14:04:52 -08:00
smanna12
bbe1b06fbb
[NFC][CLANG] Fix static analyzer bugs about unnecessary object copies with auto keyword (#75082)
Reported by Static Analyzer Tool:

In ​EmitAssemblyHelper::​RunOptimizationPipeline(): Using the auto
keyword without an & causes the copy of an object of type function.

 /// List of pass builder callbacks ("CodeGenOptions.h").
std::vector<std::function<void(llvm::PassBuilder &)>>
PassBuilderCallbacks;
2023-12-22 20:39:22 -06:00
Paul Kirth
d1e2b96b60
[clang][fatlto] Don't set ThinLTO module flag with FatLTO (#75079)
Since FatLTO now uses the UnifiedLTO pipeline, we should not set the
ThinLTO module flag to true, since it may cause an assertion failure.
See https://github.com/llvm/llvm-project/issues/70703 for context.
2023-12-18 13:03:13 -08:00
Zequan Wu
ab3430f891
[Profile] Add binary profile correlation for code coverage. (#69493)
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.

## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.

## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```

A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
2023-12-14 14:16:38 -05:00
Mircea Trofin
1d608fc755
[NFC][InstrProf] Refactor InstrProfiling lowering pass (#74970)
Akin other passes - refactored the name to `InstrProfilingLoweringPass` to better communicate what it does, and split the pass part and the transformation part to avoid needing to initialize object state during `::run`.

A subsequent PR will move `InstrLowering` to the .cpp file and rename it to `InstrLowerer`.
2023-12-10 18:03:08 -08:00
Paul Kirth
cfe1ece833
[clang][llvm][fatlto] Avoid cloning modules in FatLTO (#72180)
https://github.com/llvm/llvm-project/issues/70703 pointed out that
cloning LLVM modules could lead to miscompiles when using FatLTO.

This is due to an existing issue when cloning modules with labels (see
#55991 and #47769). Since this can lead to miscompilation, we can avoid
cloning the LLVM modules, which was desirable anyway.

This patch modifies the EmbedBitcodePass to no longer clone the module
or run an input pipeline over it. Further, it make FatLTO always perform
UnifiedLTO, so we can still defer the Thin/Full LTO decision to
link-time. Lastly, it removes dead/obsolete code related to now defunct
options that do not work with the EmbedBitcodePass implementation any
longer.
2023-11-30 17:09:34 -08:00
Jacob Lambert
8a4b90321f
[CodeGen] Add conditional to module cloning in bitcode linking (#72478)
Now that we have a commandline option dictating a second link step,
(-relink-builtin-bitcode-postop), we can condition the module creation
when linking in bitcode modules. This aims to improve performance by
avoiding unnecessary linking
2023-11-29 16:44:40 -08:00
Tom Eccles
a207e6307a
[flang] add fveclib flag (#71734)
-fveclib= allows users to choose a vectorized libm so that loops
containing math functions are vectorized.

This is implemented as much as possible in the same way as in clang. The
driver test in veclib.f90 is copied from the clang test.
2023-11-13 10:04:50 +00:00
Fangrui Song
557b064dbf Reorder HipStdPar.h after D155856. NFC 2023-11-09 23:10:49 -08:00
Jacob Lambert
c6cf329502
[CodeGen] Implement post-opt linking option for builtin bitocdes (#69371)
In this patch, we create a new ModulePass that mimics the LinkInModules
API from CodeGenAction.cpp, and a new command line option to enable the
pass. As part of the implementation, we needed to refactor the
BackendConsumer class definition into a new separate header (instead of
embedded in CodeGenAction.cpp). With this new pass, we can now re-link
bitcodes supplied via the -mlink-built-in bitcodes as part of the
RunOptimizationPipeline.

With the re-linking pass, we now handle cases where new device library
functions are introduced as part of the optimization pipeline.
Previously, these newly introduced functions (for example a fused sincos
call) would result in a linking error due to a missing function
definition. This new pass can be initiated via:

      -mllvm -relink-builtin-bitcode-postop

Also note we intentionally exclude bitcodes supplied via the
-mlink-bitcode-file option from the second linking step
2023-11-08 10:53:49 -08:00
William Moses
3c11ac5118
[Clang] Add codegen option to add passbuilder callback functions (#70171) 2023-11-06 22:26:38 -06:00
Stefan Pintilie
423ad04c67
[PowerPC] Add an alias for -mregnames so that full register names used in assembly. (#70255)
This option already exists on GCC and so it is being added to LLVM so
that we use the same option as them.
2023-11-06 12:30:19 -05:00
Zequan Wu
db7a1ed9a2 Revert "[Profile] Refactor profile correlation. (#70712)"
This reverts commit 4b383d0af93136b80841fc140da0823dfc441dd4.
2023-10-31 10:53:45 -04:00
Zequan Wu
4b383d0af9
[Profile] Refactor profile correlation. (#70712)
Refactor some code from https://github.com/llvm/llvm-project/pull/69493.

Rebase of https://github.com/llvm/llvm-project/pull/69656 on top of main
as it was messed up.
2023-10-31 10:41:01 -04:00
Alex Voicu
dd5d65adb6 [HIP][Clang][CodeGen] Add CodeGen support for hipstdpar
This patch adds the CodeGen changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. This change relaxes restrictions on what gets emitted on the device path, when compiling in `hipstdpar` mode:

1. Unless a function is explicitly marked `__host__`, it will get emitted, whereas before only `__device__` and `__global__` functions would be emitted;
2. Unsupported builtins are ignored as opposed to being marked as an error, as the decision on their validity is deferred to the `hipstdpar` specific code selection pass;
3. We add a `hipstdpar` specific pass to the opt pipeline, independent of optimisation level:
    - When compiling for the host, iff the user requested it via the `--hipstdpar-interpose-alloc` flag, we add a pass which replaces canonical allocation / deallocation functions with accelerator aware equivalents.

A test to validate that unannotated functions get correctly emitted is added as well.

Reviewed by: yaxunl, efriedma

Differential Revision: https://reviews.llvm.org/D155850
2023-10-17 11:41:36 +01:00
Matheus Izvekov
1ab68d708a
[clang][codegen] Add a verifier IR pass before any further passes. (#68015)
This helps check clang generated good IR in the first place, as
otherwise this can cause UB in subsequent passes, with the final
verification pass not catching any issues.

This for example would have helped catch
https://github.com/llvm/llvm-project/issues/67937 earlier.
2023-10-03 18:05:54 +02:00
Tom Stellard
e7247f1010 [profiling] Move option declarations into headers
This will make it possible to add visibility attributes to these
variables.  This also fixes some type mismatches between the
declaration and the definition.

Reviewed By: bogner, huangjd

Differential Revision: https://reviews.llvm.org/D156599
2023-09-30 18:51:28 -07:00
Arthur Eubanks
a42787d108
[clang] Add -mlarge-data-threshold for x86_64 medium code model (#66839)
Error if not used with x86_64.
Warn if not used with the medium code model (can update if other code
models end up using this).

Set TargetMachine option and add module flag.
2023-09-26 09:44:31 -07:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
Joshua Cranmer
bf49237103 [Clang] Enable -print-pipeline-passes in clang.
Reviewed By: arsenm, aeubanks

Differential Revision: https://reviews.llvm.org/D127221
2023-09-13 08:57:10 -07:00
Paul T Robinson
a4605af26f
[CodeGen][LTO] Rename some misleading variables (#65185)
Some flags named "IsLTO" and "IsThinLTO" implied they described
compilation modes, but with Unified LTO this is no longer true. Rename
these to "PrepForXXX" to be less confusing to readers. Also, deleted
"IsThinOrUnifiedLTO" because Unified implies PrepareForThinLTO.
2023-09-05 08:57:16 -07:00
Takuya Shimizu
01b88dd66d [NFC] Remove unused variables declared in conditions
D152495 makes clang warn on unused variables that are declared in conditions like `if (int var = init) {}`
This patch is an NFC fix to suppress the new warning in llvm,clang,lld builds to pass CI in the above patch.

Differential Revision: https://reviews.llvm.org/D158016
2023-08-30 10:05:06 +09:00
Zequan Wu
f52f8e817e [Profile] Allow online merging with debug info correlation.
When using debug info correlation, value profiling needs to be switched off.
So, we are only merging counter sections. In that case the existance of data
section is just used to provide an extra check in case of corrupted profile.

This patch performs counter merging by iterating the counter section by counter
size and add them together.

Reviewed By: ellis, MaskRay

Differential Revision: https://reviews.llvm.org/D157632
2023-08-28 16:35:41 -04:00
Vitaly Buka
cd591e02d4 Revert "Reland "[Profile] Allow online merging with debug info correlation.""
Breaks windows https://lab.llvm.org/buildbot/#/builders/127/builds/54239

This reverts commit d80992032fd0d7da70b2bd1a59066703c3c21fbb.
2023-08-28 13:01:25 -07:00
Zequan Wu
d80992032f Reland "[Profile] Allow online merging with debug info correlation."
Original patch: D157632.

Fixed the issue when data is 0 by relaxing the condition.

Reviewed By: ellis

Differential Revision: https://reviews.llvm.org/D159000
2023-08-28 13:32:55 -04:00
Arthur Eubanks
9b6b6bbc91 Revert "[Profile] Allow online merging with debug info correlation."
This reverts commit cf2cf195d5fb0a07c122c5c274bd6deb0790e015.
This breaks merging profiles when nothing is instrumented, see comments in https://reviews.llvm.org/D157632.

This also reverts follow-up commit bfc965c54fdbfe33a0785b45903b53fc11165f13.
2023-08-22 14:34:46 -07:00
Qiongsi Wu
611ce24114 [PGO] Enable -fprofile-update for -fprofile-generate
Currently, the `-fprofile-udpate` is ignored when `-fprofile-generate` is in effect. This patch enables `-fprofile-update` for `-fprofile-generate`. This patch continues the work from https://reviews.llvm.org/D87737, which added `-fprofile-update` in the first place.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D157280
2023-08-15 10:10:03 -04:00
Zequan Wu
cf2cf195d5 [Profile] Allow online merging with debug info correlation.
When using debug info correlation, value profiling needs to be switched off.
So, we are only merging counter sections. In that case the existance of data
section is just used to provide an extra check in case of corrupted profile.

This patch performs counter merging by iterating the counter section by counter
size and add them together.

Reviewed By: ellis, MaskRay

Differential Revision: https://reviews.llvm.org/D157632
2023-08-14 15:15:01 -04:00
Teresa Johnson
65e57bbed0 [FunctionImport] Reduce string duplication (NFC)
The import/export maps, and the ModuleToDefinedGVSummaries map, are all
indexed by module paths, which are StringRef obtained from the module
summary index, which already has a data structure than owns these
strings (the ModulePathStringTable). Because these other maps are also
StringMap, which makes a copy of the string key, we were keeping
multiple extra copies of the module paths, leading to memory overhead.

Change these to DenseMap keyed by StringRef, and document that the
strings are owned by the index.

The only exception is the llvm-link tool which synthesizes an import list
from command line options, and I have added a string cache to maintain
ownership there.

I measured around 5% memory reduction in the thin link of a large
binary.

Differential Revision: https://reviews.llvm.org/D156580
2023-08-04 14:43:11 -07:00
Paul Kirth
610fc5cbcc [clang] Preliminary fat-lto-object support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

This patch adds support for that flag in the driver, as well as setting the
necessary codegen options for the backend. Largely, this means we select
the newly added pass pipeline for generating fat objects.

Users are expected to pass -ffat-lto-objects to clang in addition to one
of the -flto variants. Without the -flto flag, -ffat-lto-objects has no
effect.

// Compile and link. Use the object code from the fat object w/o LTO.
clang -fno-lto -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Select full LTO at link time.
clang -flto -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Select ThinLTO at link time.
clang -flto=thin -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Use ThinLTO  with the UnifiedLTO pipeline.
clang -flto=thin -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c

// Compile and link. Use full LTO  with the UnifiedLTO pipeline.
clang -flto -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c

// Link separately, using ThinLTO.
clang -c -flto=thin -ffat-lto-objects foo.c
clang -flto=thin -fuse-ld=lld foo.o -ffat-lto-objects  # pass --lto=thin --fat-lto-objects to ld.lld

// Link separately, using full LTO.
clang -c -flto -ffat-lto-objects foo.c
clang -flto -fuse-ld=lld foo.o  # pass --lto=full --fat-lto-objects to ld.lld

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Depends on D146776

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D146777
2023-07-17 16:26:21 +00:00
Maciej Gabka
5b0e19a7ab [TLI][AArch64] Add mappings to vectorized functions from ArmPL
Arm Performance Libraries contain math library which provides
vectorized versions of common math functions.
This patch allows to use it with clang and llvm via -fveclib=ArmPL or
-vector-library=ArmPL, so loops with such calls can be vectorized.
The executable needs to be linked with the amath library.

Arm Performance Libraries are available at:
https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries

Reviewed by: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D154508
2023-07-12 12:53:18 +00:00
Matthew Voss
048a0c2469 [clang] Support Unified LTO Bitcode Frontend
The unified LTO pipeline creates a single LTO bitcode structure that can
be used by Thin or Full LTO. This means that the LTO mode can be chosen
at link time and that all LTO bitcode produced by the pipeline is
compatible, from an optimization perspective. This makes the behavior of
LTO a bit more predictable by normalizing the set of LTO features
supported by each LTO bitcode file.

Example usage:

  # Compile and link. Select regular LTO at link time.
  clang -flto -funified-lto -fuse-ld=lld foo.c

  # Compile and link. Select ThinLTO at link time.
  clang -flto=thin -funified-lto -fuse-ld=lld foo.c

  # Link separately, using ThinLTO.
  clang -c -flto -funified-lto foo.c  # -flto={full,thin} are identical in
  terms of compilation actions
  clang -flto=thin -fuse-ld=lld foo.o  # pass --lto=thin to ld.lld

  # Link separately, using regular LTO.
  clang -c -flto -funified-lto foo.c
  clang -flto -fuse-ld=lld foo.o  # pass --lto=full to ld.lld

The RFC discussing the details and rational for this change is here:
https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774
2023-07-11 15:13:57 -07:00
Teresa Johnson
546ec641b4 Restore "[MemProf] Use new option/pass for profile feedback and matching"
This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted
in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.

Differential Revision: https://reviews.llvm.org/D154856
2023-07-11 13:16:20 -07:00
JP Lehr
3ab7ef28ee Revert "[MemProf] Use new option/pass for profile feedback and matching"
This reverts commit b4a82b62258c5f650a1cccf5b179933e6bae4867.

Broke AMDGPU OpenMP Offload buildbot
2023-07-11 05:44:42 -04:00
Teresa Johnson
b4a82b6225 [MemProf] Use new option/pass for profile feedback and matching
Previously the MemProf profile was expected to be in the same profile
file as a normal PGO profile, passed via the usual -fprofile-use=
option, and was matched in the same pass. To simplify profile
preparation, since the raw MemProf profile requires the binary for
symbolization and may be simpler to index separately from the raw PGO
profile, and also to enable providing a MemProf profile for a SamplePGO
build, separate out the MemProf feedback option and matching pass.

This patch adds the -fmemory-profile-use=${file} option, and the
provided file is passed down to LLVM and ultimately used in a new
MemProfUsePass which performs the matching of just the memory profile
contents of that file.

Note that a single profile file containing both normal PGO and MemProf
profile data is still supported, and the relevant profile data is
matched by the appropriate matching pass(es) based on which option(s)
the profile is provided with (the same profile file can be supplied to
both feedback options).

Differential Revision: https://reviews.llvm.org/D154856
2023-07-10 16:42:56 -07:00
Job Noorman
8de9f2b558 Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.

Note that I choose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.

This issues came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D150549
2023-06-26 11:20:08 +02:00
Sami Tolvanen
83835e22c7 [RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.

This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.

`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information to the trap
instruction itself.

Relands commit 62fa708ceb027713b386c7e0efda994f8bdc27e2 with fixed
tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148385
2023-06-23 22:57:56 +00:00
Sami Tolvanen
e809ebeb6c Revert "[RISCV] Implement KCFI operand bundle lowering"
This reverts commit 62fa708ceb027713b386c7e0efda994f8bdc27e2.

Reverting to investigate -verify-machineinstrs errors in MIR tests.
2023-06-23 21:42:57 +00:00
Sami Tolvanen
62fa708ceb [RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.

This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.

`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information to the trap
instruction itself.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148385
2023-06-23 18:25:24 +00:00
Arthur Eubanks
da7f212f4a [clang][LTO] Add flag to run verifier after every pass
Helps with debugging issues caught by the verifier.

Plumbed through both normal clang compile and ThinLTO.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D153468
2023-06-22 08:51:29 -07:00
Fangrui Song
849f1dd15e [XRay] Rename XRayOmitFunctionIndex to XRayFunctionIndex
Apply my post-commit comment on D81995. The negative name misguided commit
d8a8e5d6240a1db809cd95106910358e69bbf299 (`[clang][cli] Remove marshalling from
Opt{In,Out}FFlag`) to:

* accidentally flip the option to not emit the xray_fn_idx section.
* change -fno-xray-function-index (instead of -fxray-function-index) to emit xray_fn_idx

This patch renames XRayOmitFunctionIndex and makes -fxray-function-index emit
xray_fn_idx, but the default remains -fno-xray-function-index .
2023-06-11 15:27:22 -07:00
Vy Nguyen
e60b30d5e3 Reland "D144999 [MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs."
Reasons for rolling forward:
    - the crash reported from Chromium was fixed in D151824 (not related to this patch at all)
    - since D152824 was committed, it should now be safe to roll this forward.

New change:
    - add an additional _ in name check

This reverts commit 4980eead4d0b4666d53dad07afb091375b3a13a0.
2023-06-07 10:03:50 -04:00
Teresa Johnson
f354e971b0 [MemProf] Clean up MemProf instrumentation pass invocation
First, removes the invocation of the memprof instrumentation passes from
the end of the module simplification pass builder, where it doesn't
really belong. However, it turns out that this was never being invoked,
as it is guarded by an internal option not used anywhere (even tests).

These passes are actually added via clang under the -fmemory-profile
option. Changed this to add via the EP callback interface, similar to
the sanitizer passes. They are added to the EP for the end of the
optimization pipeline, which is roughly where they were being added
already (end of the pre-LTO link pipelines and non-LTO optimization
pipeline).

Ideally we should plumb the output file through to LLVM and set it up
there, so I have added a TODO.

Differential Revision: https://reviews.llvm.org/D151593
2023-05-26 17:38:49 -07:00
Nico Weber
4980eead4d Revert "[RFC][MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs."
This reverts commit 09aaf53a05e3786eea374f3ce57574225036412d.
Causes toolchain asserts building libc++ for x86_64,
see https://reviews.llvm.org/D144999#4356215
2023-05-19 09:40:54 -04:00
Vy Nguyen
09aaf53a05 [RFC][MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs.
Details: https://github.com/rust-lang/rust/issues/102754

The MachO format uses 2 bits to encode these personality funtions, with 0 reserved for "no-personality".
This means we can only have up to 3 personality. There are already three popular personalities:  __gxx_personality_v0, __gcc_personality_v0, and __objc_personality_v0.
As a result, any system that needs custom-personality will run into a problem.

This patch implemented jyknight's proposal to simply force DWARFs for all non-canonical personality functions.

Differential Revision: https://reviews.llvm.org/D144999
2023-05-18 13:27:47 -04:00
Fangrui Song
71a35f7e3d [gcov] Simplify cc1 options and remove CodeGenOptions EmitCovNotes/EmitCovArcs
After a07b135ce0c0111bd83450b5dc29ef0381cdbc39, we always pass
-coverage-notes-file/-coverage-data-file for driver options
-ftest-coverage/-fprofile-arcs/--coverage. As a bonus, we can make the following
simplification to cc1 options:

* `-ftest-coverage -coverage-notes-file a.gcno` => `-coverage-notes-file a.gcno`
* `-fprofile-arcs -coverage-data-file a.gcda` => `-coverage-data-file a.gcda`

and remove EmitCovNotes/EmitCovArcs.
2023-05-17 16:09:12 -07:00
Qiongsi Wu
9715af4345 [AIX][clang] Storage Locations for Constant Pointers
This patch adds clang options `-mxcoff-roptr` and `-mno-xcoff-roptr` to specify storage locations for constant pointers on AIX.

When the `-mxcoff-roptr` option is in effect, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. When the `-mno-xcoff-roptr` option is in effect, pointers, virtual function tables, and virtual type tables are placed are placed in read/write storage.

This patch depends on https://reviews.llvm.org/D144189.

Reviewed By: hubert.reinterpretcast, stephenpeckham

Differential Revision: https://reviews.llvm.org/D144190
2023-05-15 11:31:00 -04:00
Zarko Todorovski
99fe4d3826 Set EnableAIXExtendedAltivecABI in BackendUtils from LangOpts
Fix a bug where after
github.com/llvm/llvm-project/commit/68dd51421f16f1e17cd453cb1730fcca99a6cfb7
refactor where we are not passing -mabi=vec-extabi to th backend.
2023-05-01 10:39:14 -04:00
Fangrui Song
0d333bf0e3 Remove ExplicitEmulatedTLS and simplify -femulated-tls handling
Currently clangDriver passes -femulated-tls and -fno-emulated-tls to cc1.
cc1 forwards the option to LLVMCodeGen and ExplicitEmulatedTLS is used
to decide the value. Simplify this by moving the Clang decision to
clangDriver and moving the LLVM decision to InitTargetOptionsFromCodeGenFlags.
2023-04-23 11:55:12 -07:00