This patch does a few things:
* Fix aggregate initialization.
When an aggregate has an initializer that is immediate-escalating,
the context in which it is used automatically becomes an immediate function.
The wording does that by rpretending an aggregate initialization is itself
an invocation which is not really how clang works, so my previous attempt
was... wrong.
* Fix initialization of defaulted constructors with immediate escalating
default member initializers.
The wording was silent about that case and I did not handled it fully
https://cplusplus.github.io/CWG/issues/2760.html
* Fix diagnostics
In some cases clang would produce additional and unhelpful
diagnostics by listing the invalid references to consteval
function that appear in immediate escalating functions
Fixes https://github.com/llvm/llvm-project/issues/63742
Reviewed By: aaron.ballman, #clang-language-wg, Fznamznon
Differential Revision: https://reviews.llvm.org/D155175
The conversion iterates over CodeGenModule::Replacements (a StringMap)
and replaces C2/D2 and moves C1/D1 (
commit 0196a1d98f8a206259a4b5ce93c21807243af92f in 2013, to make the
output look nicer). The iteration order is not guaranteed to be
deterministic, and may cause destructors.cpp to exhibit different
function orders. Use a MapVector instead.
While here, fix an IWYU issue by adding an explicit include, though
MapVector is already used in CodeGenModule.h.
- When the destination is a final class type that does not derive from
the source type, the cast always fails and is now emitted as a null
pointer or call to __cxa_bad_cast.
- When the destination is a final class type that does derive from the
source type, emit a direct comparison against the corresponding base
class vptr value(s). There may be more than one such value in the case
of multiple inheritance; check them all.
For now, this is supported only for the Itanium ABI. I expect the same thing is
possible for the MS ABI too, but I don't know what guarantees are made about
vfptr uniqueness.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D154658
All data structures and values associated with handling virtual functions / inheritance, as well as RTTI, are globals and thus can only reside in the global address space. This was not taken fully taken into account because for most targets, global & generic appear to coincide. However, on targets where global & generic ASes differ (e.g. AMDGPU), this was problematic, since it led to the generation of invalid bitcasts (which would trigger asserts in Debug) and less than optimal code. This patch does two things:
ensures that vtables, vptrs, vtts, typeinfo are generated in the right AS, and populated accordingly;
removes a bunch of bitcasts which look like left-overs from the typed ptr era.
Reviewed By: yxsamliu
Differential Revision: https://reviews.llvm.org/D153092
This patch adds a new option -fkeep-persistent-storage-variables to emit
all variables that have a persistent storage duration, including global,
static and thread-local variables. This could be useful in cases where
the presence of all these variables as symbols in the object file are
required, so that they can be directly addressed.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D150221
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
The attribute is a proper enum attribute, strictfp. We were getting
strictfp and "strictfp" set on every function with
-fexperimental-strict-floating-point.
https://reviews.llvm.org/D139629
With `-fsanitize=kcfi`, Clang currently won't emit type hashes for
C++ member functions, which leads to check failures if they are
indirectly called. As there's no reason to exclude member functions
in CodeGenModule::setKCFIType, emit type hashes also for them to fix
member function pointer calls with KCFI, and add a test to confirm
that types are emitted correctly.
The itanium ABI for certain platforms requires a minimum alignments for
member function pointers to reserve certain bits for distinguishing
virtual and non-virtual functions.
Our implementation of this however depends on the alignment of the
function involved, which may however not reflect the true alignment of
function pointers on certain targets for which the alignment is
independent of the function (e.g. AIX). Worse, the 2-byte alignment
we use may be less than the ABI minimum for the target, and in the case
we are using explicit sections will result in invalid codegen.
This patch attempts to correct this situation by considering the target
alignment of function pointers as part of making the decision about
whether we need to adjust the function alignment to conform to the ABI.
Targets which do not provide the function ptr alignment information
will return a value of 1 when queried and will conservatively retain
the old alignment.
Differential Revision: https://reviews.llvm.org/D147184
This refactor patch means to remove CPU_SPECIFIC* MACROs in X86TargetParser.def
and move those information into ProcInfo of X86TargetParser.cpp. Since these
two files both maintain a table with redundant info such as cpuname and its
features supported. CPU_SPECIFIC* MACROs define some different information. This
patch dealt with them in these ways when moving:
1.mangling
This is now moved to Mangling in ProcInfo and directly initialized at array of
Processors. CPUs don't support cpu_dispatch/specific are assigned '\0' as
mangling.
2.CPU alias
The alias cpu will also be initialized in array of Processors, its attributes
will be same as its alias target cpu. Same feature list, same mangling.
3.TUNE_NAME
Before my change, some cpu names support cpu_dispatch/specific are not
supported in X86.td, which means optimizer/backend doesn't recognize them. So
they use a different TUNE_NAME to generate in IR. In this patch, I added these
missing cpu support at X86.td by utilizing existing Features and XXXTunings, so
that each cpu name can directly use its own name as TUNE_NAME to be supported
by optimizer/backend.
4.Feature list
The feature list of one CPU maintained in X86TargetParser.def is not same as
the one in X86TargetParser.cpp. It only maintains part of features of one CPU
(features defined by X86_FEATURE_COMPAT). While X86TargetParser.cpp maintains
a complete one. This patch abandons the feature list maintained by CPU_SPECIFIC*
MACROs because assigning a CPU with a complete one doesn't affect the
functionality of cpu_dispatch/specific.
Except these four info, since some of CPUs supported by cpu_dispatch/specific
doesn's support clang options like -march, -mtune before, this patch also kept
this behavior still by adding another member OnlyForCPUDispatchSpecific in
ProcInfo.
Reviewed By: pengfei, RKSimon
Differential Revision: https://reviews.llvm.org/D151696
Move `AttributeMask` out of `llvm/IR/Attributes.h` to a new file
`llvm/IR/AttributeMask.h`. After doing this we can remove the
`#include <bitset>` and `#include <set>` directives from `Attributes.h`.
Since there are many headers including `Attributes.h`, but not needing
the definition of `AttributeMask`, this causes unnecessary bloating of
the translation units and slows down compilation.
This commit adds in the include directive for `llvm/IR/AttributeMask.h`
to the handful of source files that need to see the definition.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,917,509,187 to 1,902,982,273 - a
reduction of ~0.76%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D153728
This commit breaks up CodeGen/TargetInfo.cpp into a set of *.cpp files,
one file per target. There are no functional changes, mostly just code moving.
Non-code-moving changes are:
* A virtual destructor has been added to DefaultABIInfo to pin the vtable to a cpp file.
* A few methods of ABIInfo and DefaultABIInfo were split into declaration + definition
in order to reduce the number of transitive includes.
* Several functions that used to be static have been placed in clang::CodeGen
namespace so that they can be accessed from other cpp files.
RFC: https://discourse.llvm.org/t/rfc-splitting-clangs-targetinfo-cpp/69883
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148094
The last use was removed by:
commit 46f366494f3ca8cc98daa6fb4f29c7c446c176b6
Author: Fangrui Song <i@maskray.me>
Date: Sat May 20 08:24:20 2023 -0700
This patch also removes RTTIProxyMap, which becomes unused once I
remove GetOrCreateRTTIProxyGlobalVariable.
Differential Revision: https://reviews.llvm.org/D152782
HIP texture/image support is optional as some devices
do not have image instructions. A macro __HIP_NO_IMAGE_SUPPORT
is defined for device not supporting images (d0448aa4c4/docs/reference/kernel_language.md (L426) )
Currently the macro is defined by HIP header based on predefined macros
for GPU, e.g __gfx*__ , which is error prone. This patch let clang
emit the predefined macro.
Reviewed by: Matt Arsenault, Artem Belevich
Differential Revision: https://reviews.llvm.org/D151349
This is an alternative to currently existing hostcall implementation and uses printf buffer similar to OpenCL,
The data stored in the buffer (i.e the data frame) for each printf call are as follows,
1. Control DWord - contains info regarding stream, format string constness and size of data frame
2. Hash of the format string (if constant) else the format string itself
3. Printf arguments (each aligned to 8 byte boundary)
The format string Hash is generated using LLVM's MD5 Message-Digest Algorithm implementation and only low 64 bits are used.
The implementation still uses amdhsa metadata and hash is stored as part of format string itself to ensure
minimal changes in runtime.
Differential Revision: https://reviews.llvm.org/D150427
Device libs make use of patterns like this:
```
__attribute__((target("gfx11-insts")))
static unsigned do_intrin_stuff(void)
{
return __builtin_amdgcn_s_sendmsg_rtnl(0x0);
}
```
For functions that are assumed to be eliminated if the currennt GPU target doesn't support them.
At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function.
D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually.
This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again.
Fixes SWDEV-403642
Reviewed By: arsenm, yaxunl
Differential Revision: https://reviews.llvm.org/D152251
CUDA support can be enabled in clang-repl with --cuda flag.
Device code linking is not yet supported. inline must be used with all
__device__ functions.
Differential Revision: https://reviews.llvm.org/D146389
There are two motivations.
`-fno-pic -fstack-protector -mstack-protector-guard=global` created
`__stack_chk_guard` is referenced directly on all ELF OSes except FreeBSD.
This patch allows referencing the symbol indirectly with
-fno-direct-access-external-data.
Some Linux kernel folks want
`-fno-pic -fstack-protector -mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard`
created `__stack_chk_guard` to be referenced directly, avoiding
R_X86_64_REX_GOTPCRELX (even if the relocation may be optimized out by the linker).
https://github.com/llvm/llvm-project/issues/60116
Why they need this isn't so clear to me.
---
Add module flag "direct-access-external-data" and set the dso_local property of
the stack protector symbol. The module flag can benefit other LLVMCodeGen
synthesized symbols that are not represented in LLVM IR.
Nowadays, with `-fno-pic` being uncommon, ideally we should set
"direct-access-external-data" when it is true. However, doing so would require
~90 clang/test tests to be updated, which are too much.
As a compromise, we set "direct-access-external-data" only when it's different
from the implied default value.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D150841
CUDA support can be enabled in clang-repl with --cuda flag.
Device code linking is not yet supported. inline must be used with all
__device__ functions.
Differential Revision: https://reviews.llvm.org/D146389
After a07b135ce0c0111bd83450b5dc29ef0381cdbc39, we always pass
-coverage-notes-file/-coverage-data-file for driver options
-ftest-coverage/-fprofile-arcs/--coverage. As a bonus, we can make the following
simplification to cc1 options:
* `-ftest-coverage -coverage-notes-file a.gcno` => `-coverage-notes-file a.gcno`
* `-fprofile-arcs -coverage-data-file a.gcda` => `-coverage-data-file a.gcda`
and remove EmitCovNotes/EmitCovArcs.
The original name "ASTContext::getNamedModuleForCodeGen" is not properly
reflecting the usage of the interface. This interface can be used to
judge the current module unit in both sema analysis and code generation.
So the original name was not so correct.
Decl::isInCurrentModuleUnit
Refactor `Sema::isModuleUnitOfCurrentTU` to `Decl::isInCurrentModuleUnit`
to make code simpler a little bit. Note that although this patch
introduces a FIXME, this is an existing issue and this patch just tries
to describe it explicitly.
Non-incremental Clang can also exit with the WeakRefReferences not empty upon
such example. This patch makes clang-repl consistent to what Clang does.
Differential revision: https://reviews.llvm.org/D148435
Reported By Static Analyzer Tool, Coverity:
Big parameter passed by value
Copying large values is inefficient, consider passing by reference; Low, medium, and high size thresholds for detection can be adjusted.
1. Inside "CodeGenModule.cpp" file, in clang::CodeGen::CodeGenModule::EmitBackendOptionsMetadata(clang::CodeGenOptions): A very large function call parameter exceeding the high threshold is passed by value.
pass_by_value: Passing parameter CodeGenOpts of type clang::CodeGenOptions const (size 2168 bytes) by value, which exceeds the high threshold of 512 bytes.
2. Inside "SemaType.cpp" file, in IsNoDerefableChunk(clang::DeclaratorChunk): A large function call parameter exceeding the low threshold is passed by value.
pass_by_value: Passing parameter Chunk of type clang::DeclaratorChunk (size 176 bytes) by value, which exceeds the low threshold of 128 bytes.
3. Inside "CGNonTrivialStruct.cpp" file, in <unnamed>::getParamAddrs<1ull, <0ull...>>(std::integer_sequence<unsigned long long, T2...>, std::array<clang::CharUnits, T1>, clang::CodeGen::FunctionArgList, clang::CodeGen::CodeGenFunction *): A large function call parameter exceeding the low threshold is passed by value.
.i. pass_by_value: Passing parameter Args of type clang::CodeGen::FunctionArgList (size 144 bytes) by value, which exceeds the low threshold of 128 bytes.
4. Inside "CGGPUBuiltin.cpp" file, in <unnamed>::containsNonScalarVarargs(clang::CodeGen::CodeGenFunction *, clang::CodeGen::CallArgList): A very large function call parameter exceeding the high threshold is passed by value.
i. pass_by_value: Passing parameter Args of type clang::CodeGen::CallArgList (size 1176 bytes) by value, which exceeds the high threshold of 512 bytes.
Reviewed By: tahonermann
Differential Revision: https://reviews.llvm.org/D149163
In file `clang/lib/Basic/Module.cpp` the `Module` class had `submodule_begin()` and `submodule_end()` functions to retrieve corresponding iterators for private vector of Modules. This commit removes mentioned functions, and replaces all of theirs usages with `submodules()` function and range-based for-loops.
Differential Revision: https://reviews.llvm.org/D148954
This patch moves the Debug Options to llvm/Frontend so that it can be shared by Flang as well.
Reviewed By: kiranchandramohan, awarzynski
Differential Revision: https://reviews.llvm.org/D142347
We need to be able to distinguish individual TUs from the same module in cases
where TU-local entities either need to be hidden (or, for some cases of ADL in
template instantiation, need to be detected as exposures).
This creates a module type for the implementation which implicitly imports its
primary module interface per C++20:
[module.unit/8] 'A module-declaration that contains neither an export-keyword
nor a module-partition implicitly imports the primary module interface unit of
the module as if by a module-import-declaration.
Implementation modules are never serialized (-emit-module-interface for an
implementation unit is diagnosed and rejected).
Differential Revision: https://reviews.llvm.org/D126959
This reverts commit c6e9823724ef6bdfee262289ee34d162db436af0.
Reason: Broke the ASan buildbots, see https://reviews.llvm.org/D126959
(the original phabricator review) for more info.
We need to be able to distinguish individual TUs from the same module in cases
where TU-local entities either need to be hidden (or, for some cases of ADL in
template instantiation, need to be detected as exposures).
This creates a module type for the implementation which implicitly imports its
primary module interface per C++20:
[module.unit/8] 'A module-declaration that contains neither an export-keyword
nor a module-partition implicitly imports the primary module interface unit of
the module as if by a module-import-declaration.
Implementation modules are never serialized (-emit-module-interface for an
implementation unit is diagnosed and rejected).
Differential Revision: https://reviews.llvm.org/D126959
This follows 2b4fa53 which made Clang not emit destructor calls for such
objects. However, they would still not get emitted as constants since
CodeGenModule::isTypeConstant() returns false if the destructor is
constexpr. This change adds a param to make isTypeConstant() ignore the
dtor, allowing the caller to check it instead.
Fixes Issue #61212
Differential revision: https://reviews.llvm.org/D145369
When an alias or ifunc attribute refers to a function name that is
mangled, a diagnostic is emitted to suggest the mangled name as a
replacement for the given function name for every matching name in the
current TU.
Fixes#59164
Differential Revision: https://reviews.llvm.org/D143803
The asan functions are now attributed as used
in the device library, no need to keep the
declaration of asan device preserve function.
Patch by: Praveen Velliengiri
Reviewed by: Yaxun Liu
Differential Revision: https://reviews.llvm.org/D143495
Previously we'll set the named modules for ASTContext in ParseAST. But
this is not intuitive and we need comments to tell the intuition. This
patch moves the code the right the place, where the corrresponding
module is first created/loaded. Now it is more intuitive and we can use
the value in the earlier places.
This commit adds a new option (i.e.,
`-fsanitize-cfi-icall-normalize-integers`) for normalizing integer types
as vendor extended types for cross-language LLVM CFI/KCFI support with
other languages that can't represent and encode C/C++ integer types.
Specifically, integer types are encoded as their defined representations
(e.g., 8-bit signed integer, 16-bit signed integer, 32-bit signed
integer, ...) for compatibility with languages that define
explicitly-sized integer types (e.g., i8, i16, i32, ..., in Rust).
``-fsanitize-cfi-icall-normalize-integers`` is compatible with
``-fsanitize-cfi-icall-generalize-pointers``.
This helps with providing cross-language CFI support with the Rust
compiler and is an alternative solution for the issue described and
alternatives proposed in the RFC
https://github.com/rust-lang/rfcs/pull/3296.
For more information about LLVM CFI/KCFI and cross-language LLVM
CFI/KCFI support for the Rust compiler, see the design document in the
tracking issue https://github.com/rust-lang/rust/issues/89653.
Relands b1e9ab7438a098a18fecda88fc87ef4ccadfcf1e with fixes.
Reviewed By: pcc, samitolvanen
Differential Revision: https://reviews.llvm.org/D139395
Adding a module flag 'MaxTLSAlign' describing the maximum alignment a global TLS
variable can have. Optimizers are prevented from increasing the alignment of such
variables beyond this threshold.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D140123
Fix an issue about module linking with LTO.
When compiling with PIE, the small data limitation needs to be consistent with that in PIC, otherwise there will be linking errors due to conflicting values.
bar.c
```
int bar() { return 1; }
```
foo.c
```
int foo() { return 1; }
```
```
clang --target=riscv64-unknown-linux-gnu -flto -c foo.c -o foo.o -fPIE
clang --target=riscv64-unknown-linux-gnu -flto -c bar.c -o bar.o -fPIC
clang --target=riscv64-unknown-linux-gnu -flto foo.o bar.o -flto -nostdlib -v -fuse-ld=lld
```
```
ld.lld: error: linking module flags 'SmallDataLimit': IDs have conflicting values in 'bar.o' and 'ld-temp.o'
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
```
Use Min instead of Error for conflicting SmallDataLimit.
Authored by: @joshua-arch1
Signed-off-by: xiaojing.zhang <xiaojing.zhang@xcalibyte.com>
Signed-off-by: jianxin.lai <jianxin.lai@xcalibyte.com>
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D131230