Summary:
Port rL265324 to SystemZ to allow using the 'swiftcall' attribute on that architecture.
Depends on D19414.
Reviewers: kbarton, rjmccall, uweigand
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19432
llvm-svn: 267879
OpenMP 4.5 defines 'taskloop simd' directive, which is combined
directive for 'taskloop' and 'simd' directives. Patch adds initial
codegen support for this directive and its 2 basic clauses 'safelen' and
'simdlen'.
llvm-svn: 267872
It makes compiler-rt tests fail if the gold plugin is enabled.
Revert "Rework interface for bitset-using features to use a notion of LTO visibility."
Revert "Driver: only produce CFI -fvisibility= error when compiling."
Revert "clang/test/CodeGenCXX/cfi-blacklist.cpp: Exclude ms targets. They would be non-cfi."
llvm-svn: 267871
directive.
OpenMP 4.5 defines 'taskloop' directive and 2 additional clauses
'grainsize' and 'num_tasks' for this directive. Patch adds codegen for
these clauses.
These clauses are generated as arguments of the '__kmpc_taskloop'
libcall and are encoded the following way:
void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int sched, kmp_uint64 grainsize, void *task_dup);
If 'grainsize' is specified, 'sched' argument must be set to '1' and
'grainsize' argument must be set to the value of the 'grainsize' clause.
If 'num_tasks' is specified, 'sched' argument must be set to '2' and
'grainsize' argument must be set to the value of the 'num_tasks' clause.
It is possible because these 2 clauses are mutually exclusive and can't
be used at the same time on the same directive.
If none of these clauses is specified, 'sched' argument must be set to
'0'.
llvm-svn: 267862
Summary:
This patch adds support for the target exit data directive code generation.
Given that, apart from the employed runtime call, target exit data requires the same code generation pattern as target enter data, the OpenMP codegen entry point was renamed and reused for both.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17369
llvm-svn: 267814
Summary: This patch adds support for the target enter data directive code generation.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17368
llvm-svn: 267812
Summary:
This patch adds support for the target data directive code generation.
Part of the already existent functionality related with data maps is moved to a new function so that it could be reused.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17367
llvm-svn: 267811
Summary:
Implement codegen for the map clause. All the new list items in 4.5 specification are supported.
Fix bug in the generation of array sections that was exposed by some of the map clause tests: for pointer types the offsets have to be calculated from the pointee not the pointer.
Reviewers: hfinkel, kkwli0, carlo.bertolli, arpith-jacob, ABataev
Subscribers: ABataev, cfe-commits, caomhin, fraggamuffin
Differential Revision: http://reviews.llvm.org/D16749
llvm-svn: 267808
Bitsets, and the compiler features they rely on (vtable opt, CFI),
only have visibility within the LTO'd part of the linkage unit. Therefore,
only enable these features for classes with hidden LTO visibility. This
notion is based on object file visibility or (on Windows)
dllimport/dllexport attributes.
We provide the [[clang::lto_visibility_public]] attribute to override the
compiler's LTO visibility inference in cases where the class is defined
in the non-LTO'd part of the linkage unit, or where the ABI supports
calling classes derived from abstract base classes with hidden visibility
in other linkage units (e.g. COM on Windows).
If the cross-DSO CFI mode is enabled, bitset checks are emitted even for
classes with public LTO visibility, as that mode uses a separate mechanism
to cause bitsets to be exported.
This mechanism replaces the whole-program-vtables blacklist, so remove the
-fwhole-program-vtables-blacklist flag.
Because __declspec(uuid()) now implies [[clang::lto_visibility_public]], the
support for the special attr:uuid blacklist entry is removed.
Differential Revision: http://reviews.llvm.org/D18635
llvm-svn: 267784
Summary:
As of D18614, TargetMachine exposes a hook to add a set of passes that should
be run as early as possible. Invoke this hook from clang when setting up the
pass manager.
Reviewers: chandlerc
Subscribers: rnk, cfe-commits, tra
Differential Revision: http://reviews.llvm.org/D18617
llvm-svn: 267764
Make 'nodebug' on a global/static variable suppress all debug info
for the variable. Previously it would only suppress info for the
associated initializer function, if any.
Differential Revision: http://reviews.llvm.org/D19567
llvm-svn: 267746
declare reductions.
If reduction clause is applied to instance of class with user-defined
reduction operation without initialization clause, it may cause a crash.
Patch fixes this issue.
llvm-svn: 267695
Currently there is a problem with codegen of inlined directives inside
lambdas, it may cause a crash during codegen because of incorrect
capturing of variables. Patch fixes this problem.
llvm-svn: 267677
SPIR target can be used for C/C++ inputs too (i.e. in OpenCL compatible mode for the libs creation).
Patch by Neil Henning!
Review: http://reviews.llvm.org/D19478
llvm-svn: 267561
instantiation is in a module.
This patch fixes the condition for determining whether the debug info for a
template instantiation will exist in an imported clang module by:
- checking whether the ClassTemplateSpecializationDecl is complete and
- checking that the instantiation was in a module by looking at the first field.
I also added a negative check to make sure that a typedef to a forward-declared
template (with the definition outside of the module) is handled correctly.
http://reviews.llvm.org/D19443
rdar://problem/25553724
llvm-svn: 267464
The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed.
The next code will be generated for the taskloop directive:
#pragma omp taskloop num_tasks(N) lastprivate(j)
for( i=0; i<N*GRAIN*STRIDE-1; i+=STRIDE ) {
int th = omp_get_thread_num();
#pragma omp atomic
counter++;
#pragma omp atomic
th_counter[th]++;
j = i;
}
Generated code:
task = __kmpc_omp_task_alloc(NULL,gtid,1,sizeof(struct
task),sizeof(struct shar),&task_entry);
psh = task->shareds;
psh->pth_counter = &th_counter;
psh->pcounter = &counter;
psh->pj = &j;
task->lb = 0;
task->ub = N*GRAIN*STRIDE-2;
task->st = STRIDE;
__kmpc_taskloop(
NULL, // location
gtid, // gtid
task, // task structure
1, // if clause value
&task->lb, // lower bound
&task->ub, // upper bound
STRIDE, // loop increment
0, // 1 if nogroup specified
2, // schedule type: 0-none, 1-grainsize, 2-num_tasks
N, // schedule value (ignored for type 0)
(void*)&__task_dup_entry // tasks duplication routine
);
llvm-svn: 267395
LLVM really wants a debug location on every inlinable call in a function
with debug info, because it otherwise cannot set up inlining debug info.
This change applies an artificial line 0 debug location (which is how
DWARF marks automatically generated code that has no corresponding
source code) to the .__kmpc_global_dtor_. functions to avoid the
LLVM Verifier complaining.
llvm-svn: 267369
LLVM stopped using MDString-based type references, and DIBuilder no
longer fills 'retainedTypes:' with every DICompositeType that has an
'identifier:' field. There are just minor changes to keep the same
behaviour in CFE.
Leaving 'retainedTypes:' unfilled has a dramatic impact on the output
order of the IR though. There are a huge number of testcase changes,
which were unfortunately not really scriptable.
llvm-svn: 267297
Write out the PGOFuncName meta data if PGOFuncName is different from function's
raw name. This should only apply to internal linkage functions. This is to be
consumed by indirect-call promotion when called in LTO optimization pass.
Differential Revision: http://reviews.llvm.org/D18624
llvm-svn: 267224
For various reasons, involving dllexport and class linkage compuations,
we have to wait until after the semicolon after a class declaration to
emit inline methods. These are "deferred" decls. Before this change,
finishing the tag decl would trigger us to deserialize some PCH so that
we could make a "pretty" IR-level type. Deserializing the PCH triggered
calls to HandleTopLevelDecl, which, when done, checked the deferred decl
list, and emitted some dllexported decls that weren't ready.
Avoid this re-entrancy. Deferred decls should not get emitted when a tag
is finished, they should only be emitted after a real top level decl in
the main file.
llvm-svn: 267186
causes code generation failure.
The codegen part of firstprivate clause for member decls used type of
original variable without skipping reference type from
OMPCapturedExprDecl. Patch fixes this problem.
llvm-svn: 267125
If loop control variable for simd-based directives is explicitly marked
as linear/lastprivate in clauses, codegen for such construct would
crash. Patch fixes this problem.
llvm-svn: 267101
Summary:
Adds a framework to enable the instrumentation pass for the new
EfficiencySanitizer ("esan") family of tools. Adds a flag for esan's
cache fragmentation tool via -fsanitize=efficiency-cache-frag.
Adds appropriate tests for the new flag.
Reviewers: eugenis, vitalybuka, aizatsky, filcab
Subscribers: filcab, kubabrecka, llvm-commits, zhaoqin, kcc
Differential Revision: http://reviews.llvm.org/D19169
llvm-svn: 267059
in the compile unit that contains their implementation even if their
interface is declared in a module.
The private @implementation of an @interface may have additional
hidden ivars so we should not defer to the public version of the
type that is found in the module.
<rdar://problem/25541798>
llvm-svn: 266937
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266853
Summary:
This is a follow-on to apply Duncan's new DIType ODR uniquing from
r266549 and r266713 in more places.
When invoking ThinLTO backend compiles via clang (for a distributed
build), invoke enableDebugTypeODRUniquing() before parsing the module.
Reviewers: dexonsmith, joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19264
llvm-svn: 266852
The intrinsic is now called llvm.thread.pointer, not
llvm.aarch64.thread.pointer. Also, the code handling it in CGBuiltin.cpp
is dead - it's already covered by GCCBuiltin. Remove it.
Differential Revision: http://reviews.llvm.org/D19099
llvm-svn: 266817
r259537 added vfma/vfms to armv7, but the builtin was only lowered
on the AArch64 side. Instead of supporting it on ARM, get rid of it.
The vfms builtin lowered to:
%nb = fsub float -0.0, %b
%r = @llvm.fma.f32(%a, %nb, %c)
Instead, define the operation in terms of vfma, and swap the
multiplicands. It now lowers to:
%na = fsub float -0.0, %a
%r = @llvm.fma.f32(%na, %b, %c)
This matches the instruction more closely, and lets current LLVM
generate the "natural" operand ordering:
fmls.2s v0, v1, v2
instead of the crooked (but equivalent):
fmls.2s v0, v2, v1
Except for theses changes, assembly is identical.
LLVM accepts both commutations, and the LLVM tests in:
test/CodeGen/AArch64/arm64-fmadd.ll
test/CodeGen/AArch64/fp-dp3.ll
test/CodeGen/AArch64/neon-fma.ll
test/CodeGen/ARM/fusedMAC.ll
already check either the new one only, or both.
Also verified against the test-suite unittests.
llvm-svn: 266807
Currently, for the ppc64--gnu and aarch64 ABIs, we recognize:
typedef __attribute__((__ext_vector_type__(3))) float v3f32;
typedef __attribute__((__ext_vector_type__(16))) char v16i8;
struct HFA {
v3f32 a;
v16i8 b;
};
as an HFA. Since the first type encountered is used as the base type,
we pass the HFA as:
[2 x <3 x float>]
Which leads to incorrect IR (relying on padding values) when the
second field is used.
Instead, explicitly widen the vector (after size rounding) in
isHomogeneousAggregate.
Differential Revision: http://reviews.llvm.org/D18998
llvm-svn: 266784
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266754
If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks.
llvm-svn: 266722