Wrap calls to XXXTargetCodeGenInfo constructors into factory functions.
This allows moving implementations of TargetCodeGenInfo to dedicated cpp
files without a change.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D150215
Make `qualifyWindowsLibrary` and `addStackProbeTargetAttributes`
protected members of `TargetCodeGenInfo`.
These are helper functions used by `getDependentLibraryOption` and
`setTargetAttributes` methods when targeting Windows. The change will
allow these functions to be reused after splitting `TargetInfo.cpp`.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D150178
Remove `getABIInfo` overrides returning references to target-specific
implementations of `ABIInfo`.
The methods may be convenient, but they are only used in one place and
prevent from `ABIInfo` implementations from being put into anonymous
namespaces in different cpp files.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148092
The current implementation of -fsanitize=function places two words (the prolog
signature and the RTTI proxy) at the function entry, which makes the feature
incompatible with Intel Indirect Branch Tracking (IBT) that needs an ENDBR instruction
at the function entry. To allow the combination, move the two words before the
function entry, similar to -fsanitize=kcfi.
Armv8.5 Branch Target Identification (BTI) has a similar requirement.
Note: for IBT and BTI, whether a function gets a marker instruction at the entry
generally cannot be assumed (it can be disabled by a function attribute or
stronger LTO optimizations).
It is extremely unlikely for two words preceding a function entry to be
inaccessible. One way to achieve this is by ensuring that a function is
aligned at a page boundary and making the preceding page unmapped or
unreadable. This is not reasonable for application or library code.
(Think: the first text section has crt* code not instrumented by
-fsanitize=function.)
We use 0xc105cafe for all targets. .long 0xc105cafe disassembles to invalid
instructions on all architectures I have tested, except Power where it is
`lfs 8, -13570(5)` (Load Floating-Point with a weird offset, unlikely to be used in real code).
---
For the removed function in AsmPrinter.cpp, remove an assert: `mdconst::extract`
already asserts non-nullness.
For compiler-rt/test/ubsan/TestCases/TypeCheck/Function/function.cpp,
when the function doesn't have prolog/epilog (-O1 and above), after moving the two words,
the address of the function equals the address of ret instruction,
so symbolizing the function will additionally get a non-zero column number.
Adjust the test to allow an optional column number.
```
.long 3238382334
.long .L__llvm_rtti_proxy-_Z1fv
_Z1fv: // symbolizing here retrieves the line table entry from the second .loc
.file 0 ...
.loc 0 1 0
.cfi_startproc
.loc 0 2 1 prologue_end
retq
```
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D148665
Consider the following sturctures when targetting:
struct foo {
int space[4];
char a : 8;
char b : 8;
char x : 8;
char y : 8;
};
struct bar {
int space[4];
char a : 8;
char b : 8;
char : 0;
char x : 8;
char y : 8;
};
Even if both structs have the same layout in memory, they are handled
differenlty by the AMDGPU ABI.
With the following code:
// clang --target=amdgcn-amd-amdhsa -g -O1 example.c -S
char use_foo(struct foo f) { return f.y; }
char use_bar(struct bar b) { return b.y; }
For use_foo, the 'y' field is passed in v4
; v_ashrrev_i32_e32 v0, 24, v4
; s_setpc_b64 s[30:31]
For use_bar, the 'y' field is passed in v5
; v_bfe_i32 v0, v5, 8, 8
; s_setpc_b64 s[30:31]
To make this distinction, we record a single 0-size bitfield for every member that is preceded
by it.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D144870
This is the funcref counterpart to 890146b. We introduce a new attribute
that marks a function pointer as a funcref. It also implements builtin
__builtin_wasm_ref_null_func(), that returns a null funcref value.
Differential Revision: https://reviews.llvm.org/D128440
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122215
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.
Differential Revision: https://reviews.llvm.org/D122215
Swift calling conventions stands out in the way that they are lowered in
mostly target-independent manner, with very few customization points.
As such, swift-related methods of ABIInfo do not reference the rest of
ABIInfo and vice versa.
This change follows interface segregation principle; it removes
dependency of SwiftABIInfo on ABIInfo. Targets must now implement
SwiftABIInfo separately if they support Swift calling conventions.
Almost all targets implemented `shouldPassIndirectly` the same way. This
de-facto default implementation has been moved into the base class.
`isSwiftErrorInRegister` used to be virtual, now it is not. It didn't
accept any arguments which could have an effect on the returned value.
This is now a static property of the target ABI.
Reviewed By: rusyaev-roman, inclyc
Differential Revision: https://reviews.llvm.org/D130394
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.
Differential Revision: https://reviews.llvm.org/D94098
The recent commit 00a6254 "Stop traping on sNaN in builtin_isnan" changed the
lowering in constrained FP mode of builtin_isnan from an FP comparison to
integer operations to avoid trapping.
SystemZ has a special instruction "Test Data Class" which is the preferred
way to do this check. This patch adds a new target hook "testFPKind()" that
lets SystemZ emit the s390_tdc intrinsic instead.
testFPKind() takes the BuiltinID as an argument and is expected to soon
handle more opcodes than just 'builtin_isnan'.
Review: Thomas Preud'homme, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D96568
notail on x86-64
This is needed because the epilogue code inserted before tail calls on
x86-64 breaks the handshake between the caller and callee.
Calls to objc_retainAutoreleasedReturnValue used to have the same
problem, which was fixed in https://reviews.llvm.org/D59656.
rdar://problem/66029552
Differential Revision: https://reviews.llvm.org/D84540
The x86-64 "avx" feature changes how >128 bit vector types are passed,
instead of being passed in separate 128 bit registers, they can be
passed in 256 bit registers.
"avx512f" does the same thing, except it switches from 256 bit registers
to 512 bit registers.
The result of both of these is an ABI incompatibility between functions
compiled with and without these features.
This patch implements a warning/error pair upon an attempt to call a
function that would run afoul of this. First, if a function is called
that would have its ABI changed, we issue a warning.
Second, if said call is made in a situation where the caller and callee
are known to have different calling conventions (such as the case of
'target'), we instead issue an error.
Differential Revision: https://reviews.llvm.org/D82562
EmitTargetMetadata passed to emitTargetMD a null pointer as returned
from GetGlobalValue, for an unused inline function which has been
removed from the module at that point.
A FIXME in CodeGenModule.cpp commented that the calling code in
EmitTargetMetadata should be moved into the one target that needs it
(XCore). A review comment agreed. So the calling loop has been moved
into the XCore subclass. The check for null is done in that loop.
Differential Revision: https://reviews.llvm.org/D77068
Use unique_ptr to manage the lifetime of ABIInfo member inside TargetCodeGenInfo.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D79033
This reverts commit 6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6.
The patch breaks CUDA copmilation.
Differential Revision: https://reviews.llvm.org/D76365
Summary:
- Even though the bindless surface/texture interfaces are promoted,
there are still code using surface/texture references. For example,
[PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
compilation issue for code using `tex2D` with texture references. For
better compatibility, this patch proposes the support of
surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
`nvcc` does use builtins for texture support. From the limited NVVM
documentation[^nvvm] and NVPTX backend texture/surface related
tests[^test], it's believed that surface/texture references are
supported by replacing their reference types, which are annotated with
`device_builtin_surface_type`/`device_builtin_texture_type`, with the
corresponding handle-like object types, `cudaSurfaceObject_t` or
`cudaTextureObject_t`, in the device-side compilation. On the host
side, that global handle variables are registered and will be
established and updated later when corresponding binding/unbinding
APIs are called[^bind]. Surface/texture references are most like
device global variables but represented in different types on the host
and device sides.
- In this patch, the following changes are proposed to support that
behavior:
+ Refine `device_builtin_surface_type` and
`device_builtin_texture_type` attributes to be applied on `Type`
decl only to check whether a variable is of the surface/texture
reference type.
+ Add hooks in code generation to replace that reference types with
the correponding object types as well as all accesses to them. In
particular, `nvvm.texsurf.handle.internal` should be used to load
object handles from global reference variables[^texsurf] as well as
metadata annotations.
+ Generate host-side registration with proper template argument
parsing.
---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later.
Reviewers: tra, rjmccall, yaxunl, a.sidorin
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76365
Pass NULL to pointer arg of __cxa_atexit if addr space
is not matching with its param. This doesn't align yet
with how dtors are generated that should be changed too.
Differential Revision: https://reviews.llvm.org/D62413
llvm-svn: 366059
with notail on x86-64.
On x86-64, the epilogue code inserted before the tail jump blocks the
autoreleased return optimization.
rdar://problem/38675807
Differential Revision: https://reviews.llvm.org/D59656
llvm-svn: 356705
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Currently clang set kernel calling convention for CUDA/HIP after
arranging function, which causes incorrect kernel function type since
it depends on calling convention.
This patch moves setting kernel convention before arranging
function.
Differential Revision: https://reviews.llvm.org/D47733
llvm-svn: 334457
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.
It only affects amdgcn target.
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D45223
llvm-svn: 330447
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
This reverts r328795 which introduced an issue with referencing __global__
function templates. More details in the original review D44747.
llvm-svn: 329099
This patch sets target specific calling convention for CUDA kernels in IR.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44747
llvm-svn: 328795
I found this while looking at the ppc failures caused by the dso_local
change.
The issue was that the patch would produce the wrong answer for
available_externally. Having ForDefinition_t available in places where
the code can just check the linkage is a bit of a foot gun.
This patch removes the ForDefiniton_t argument in places where the
linkage is already know.
llvm-svn: 324499
Summary:
Convert clang::LangAS to a strongly typed enum
Currently both clang AST address spaces and target specific address spaces
are represented as unsigned which can lead to subtle errors if the wrong
type is passed. It is especially confusing in the CodeGen files as it is
not possible to see what kind of address space should be passed to a
function without looking at the implementation.
I originally made this change for our LLVM fork for the CHERI architecture
where we make extensive use of address spaces to differentiate between
capabilities and pointers. When merging the upstream changes I usually
run into some test failures or runtime crashes because the wrong kind of
address space is passed to a function. By converting the LangAS enum to a
C++11 we can catch these errors at compile time. Additionally, it is now
obvious from the function signature which kind of address space it expects.
I found the following errors while writing this patch:
- ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address
space to TargetInfo::getPointer{Width,Align}()
- TypePrinter::printAttributedAfter() prints the numeric value of the
clang AST address space instead of the target address space.
However, this code is not used so I kept the current behaviour
- initializeForBlockHeader() in CGBlocks.cpp was passing
LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}()
- CodeGenFunction::EmitBlockLiteral() was passing a AST address space to
TargetInfo::getPointerWidth()
- CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space
to Qualifiers::addAddressSpace()
- CGOpenMPRuntimeNVPTX::getParameterAddress() was using
llvm::Type::getPointerTo() with a AST address space
- clang_getAddressSpace() returns either a LangAS or a target address
space. As this is exposed to C I have kept the current behaviour and
added a comment stating that it is probably not correct.
Other than this the patch should not cause any functional changes.
Reviewers: yaxunl, pcc, bader
Reviewed By: yaxunl, bader
Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D38816
llvm-svn: 315871
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
llvm-svn: 315804
Currently block is translated to a structure equivalent to
struct Block {
void *isa;
int flags;
int reserved;
void *invoke;
void *descriptor;
};
Except invoke, which is the pointer to the block invoke function,
all other fields are useless for OpenCL, which clutter the IR and
also waste memory since the block struct is passed to the block
invoke function as argument.
On the other hand, the size and alignment of the block struct is
not stored in the struct, which causes difficulty to implement
__enqueue_kernel as library function, since the library function
needs to know the size and alignment of the argument which needs
to be passed to the kernel.
This patch removes the useless fields from the block struct and adds
size and align fields. The equivalent block struct will become
struct Block {
int size;
int align;
generic void *invoke;
/* custom fields */
};
It also changes the pointer to the invoke function to be
a generic pointer since the address space of a function
may not be private on certain targets.
Differential Revision: https://reviews.llvm.org/D37822
llvm-svn: 314932
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082
This patch adds support for the `long_call`, `far`, and `near` attributes
for MIPS targets. The `long_call` and `far` attributes are synonyms. All
these attributes override `-mlong-calls` / `-mno-long-calls` command
line options for particular function.
Differential revision: https://reviews.llvm.org/D35479
llvm-svn: 308667
Certain targets (e.g. amdgcn) require global variable to stay in global or constant address
space. In C or C++ global variables are emitted in the default (generic) address space.
This patch introduces virtual functions TargetCodeGenInfo::getGlobalVarAddressSpace
and TargetInfo::getConstantAddressSpace to handle this in a general approach.
It only affects IR generated for amdgcn target.
Differential Revision: https://reviews.llvm.org/D33842
llvm-svn: 307470
Alloca always returns a pointer in alloca address space, which may
be different from the type defined by the language. For example,
in C++ the auto variables are in the default address space. Therefore
cast alloca to the expected address space when necessary.
Differential Revision: https://reviews.llvm.org/D32248
llvm-svn: 303370
In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To workaround this issue, instead of emit null pointers in local and private address space, a null pointer in generic address space is emitted and casted to local and private address space.
Tentative definition of global variables with non-zero initializer will have weak linkage instead of common linkage since common linkage requires zero initializer and does not have explicit section to hold the non-zero value.
Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo which by default returns ConstantPointerNull and emitting addrspacecast instruction. A virtual member function getNullPointerValue is added to TargetInfo which by default returns 0. Each target can override these virtual functions to get target specific null pointer and the null pointer value for specific address space, and perform specific translations for addrspacecast.
Wrapper functions getNullPointer is added to CodegenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target specific null pointers and their values.
This change has no effect on other targets except amdgcn target. Other targets can provide support of non-zero null pointer in a similar way.
This change only provides support for non-zero null pointer for C and OpenCL. Supporting for other languages will be added later incrementally.
Differential Revision: https://reviews.llvm.org/D26196
llvm-svn: 289252
The size of image type is reported incorrectly as size of a pointer to address space 0, which causes error when casting image type to pointers by __builtin_astype.
The fix is to get image address space from TargetInfo then report the size accordingly.
Differential Revision: https://reviews.llvm.org/D22927
llvm-svn: 277647
Allows AMDGCN target to generate images (such as %opencl.image2d_t) in constant address space.
Images will still be generated in global address space by default.
Added tests to existing opencl-types.cl in test\CodeGenOpenCL.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22523
llvm-svn: 276161
Summary:
Summary:
Change Clang calling convention SpirKernel to OpenCLKernel.
Set calling convention OpenCLKernel for amdgcn as well.
Add virtual method .getOpenCLKernelCallingConv() to TargetCodeGenInfo
and use it to set target calling convention for AMDGPU and SPIR.
Update tests.
Reviewers: rsmith, tstellarAMD, Anastasia, yaxunl
Subscribers: kzhuravl, cfe-commits
Differential Revision: http://reviews.llvm.org/D21367
llvm-svn: 274220