226 Commits

Author SHA1 Message Date
Carlo Bertolli
c687225b43 [OPENMP] Codegen for teams directive for NVPTX
This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it:

Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region.
Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime.
Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region.

http://reviews.llvm.org/D17963

llvm-svn: 265304
2016-04-04 15:55:02 +00:00
Alexey Bataev
5a3af13d93 [OPENMP] Remove extra code transformation.
For better support of some specific GNU extensions some extra
transformation of AST nodes were introduced. These transformations are
very hard to handle. The code is improved in handling of these
extensions by using captured expressions construct.

llvm-svn: 264709
2016-03-29 08:58:54 +00:00
Alexey Bataev
14fa1c6b60 [OPENMP] Allow runtime insert its own code inside OpenMP regions.
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
 (required) code to support target specific codegen.

llvm-svn: 264700
2016-03-29 05:34:15 +00:00
Alexey Bataev
f539faa733 Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions."
Reverting because of failed tests.

llvm-svn: 264577
2016-03-28 12:58:34 +00:00
Alexey Bataev
424be92831 [OPENMP] Allow runtime insert its own code inside OpenMP regions.
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
 (required) code to support target specific codegen.

llvm-svn: 264576
2016-03-28 12:52:58 +00:00
Alexey Bataev
f662b5943c Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions."
This reverts commit 3ee791165100607178073f14531a0dc90c622b36.

llvm-svn: 264570
2016-03-28 10:12:03 +00:00
Alexey Bataev
b8c425c4f7 [OPENMP] Allow runtime insert its own code inside OpenMP regions.
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
  (required) code to support target specific codegen.

llvm-svn: 264569
2016-03-28 09:53:43 +00:00
Alexey Bataev
a839dddf92 [OPENMP 4.0] Use 'declare reduction' constructs in 'reduction' clauses.
OpenMP 4.0 allows to define custom reduction operations using '#pragma
omp declare reduction' construct. Patch allows to use this custom
defined reduction operations in 'reduction' clauses.

llvm-svn: 263701
2016-03-17 10:19:46 +00:00
John McCall
c56a8b3284 Preserve ExtParameterInfos into CGFunctionInfo.
As part of this, make the function-arrangement interfaces
a little simpler and more semantic.

NFC.

llvm-svn: 263191
2016-03-11 04:30:31 +00:00
Alexey Bataev
ef549a8955 [OPENMP 4.5] Codegen for data members in 'linear' clause
OpenMP 4.5 allows privatization of non-static data members in OpenMP
constructs. Patch adds proper codegen support for data members in
'linear' clause

llvm-svn: 263003
2016-03-09 09:49:09 +00:00
Alexey Bataev
78849fb464 [OPENMP 4.5] Codegen for data members in 'linear' clause.
OpenMP 4.5 allows to use data members in private clauses. Patch adds
codegen support for 'linear' clause.

llvm-svn: 263002
2016-03-09 09:49:00 +00:00
Carlo Bertolli
0ff587d4a4 [OPENMP] Codegen for distribute directive: fix bug in ordering of parameters.
llvm-svn: 262833
2016-03-07 16:19:13 +00:00
Carlo Bertolli
fc35ad2bbc Reapply r262741 [OPENMP] Codegen for distribute directive
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.

http://reviews.llvm.org/D17170

llvm-svn: 262832
2016-03-07 16:04:49 +00:00
Samuel Antao
bf4d18d3d2 Revert r262741 - [OPENMP] Codegen for distribute directive
Was causing a failure in one of the buildbot slaves.

llvm-svn: 262744
2016-03-04 21:02:14 +00:00
Carlo Bertolli
4a56e3831d [OPENMP] Codegen for distribute directive
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.

http://reviews.llvm.org/D17170

llvm-svn: 262741
2016-03-04 20:24:58 +00:00
Carlo Bertolli
6ad7b5aff2 [OPENMP] firstprivate and private clauses of teams, host codegeneration
Add code generation support for firstprivate and private clauses of teams on the host. Add extensive regression tests including lambda functions and vla testing.

http://reviews.llvm.org/D17582

llvm-svn: 262663
2016-03-03 22:09:40 +00:00
Carlo Bertolli
430d8ecc55 Add code generation for teams directive inside target region
llvm-svn: 262652
2016-03-03 20:34:23 +00:00
Samuel Antao
b68e2db8f9 [OpenMP] Code generation for teams - kernel launching
Summary:
This patch implements the launching of a target region in the presence of a nested teams region, i.e calls tgt_target_teams with the required arguments gathered from the enclosed teams directive.

The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch.

Reviewers: hfinkel, arpith-jacob, kkwli0, carlo.bertolli, ABataev

Subscribers: cfe-commits, caomhin, fraggamuffin

Differential Revision: http://reviews.llvm.org/D17019

llvm-svn: 262625
2016-03-03 16:20:23 +00:00
Alexey Bataev
2bbf7217ea [OPENMP 4.5] Initial support for data members in 'linear' clause.
OpenMP 4.5 allows to privatize data members of current class in member
functions. Patch adds initial support for privatization of data members
in 'linear' clause, no codegen support.

llvm-svn: 262578
2016-03-03 03:52:24 +00:00
Alexey Bataev
61205070c4 [OPENMP 4.5] Codegen for data members in 'reduction' clause.
OpenMP 4.5 allows to privatize non-static data members of current class
in non-static member functions. Patch supports codegen for non-static
data members in 'reduction' clauses.

llvm-svn: 262460
2016-03-02 04:57:40 +00:00
Alexey Bataev
005248ac8a [OPENMP 4.5] Codegen for member decls in 'lastprivate' clause.
OpenMP 4.5 allows to privatize non-static member decls in non-static
member functions. Patch captures such decls by reference in general (for
bitfields, by value) and then operates with this capture. For bitfields,
at the end of codegen for lastprivates original bitfield is updated with the value of captured copy.

llvm-svn: 261824
2016-02-25 05:25:57 +00:00
Richard Trieu
cc3949d99a Remove use of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261271
2016-02-18 22:34:54 +00:00
Alexey Bataev
8ffcc949b1 [OPENMP] Fix codegen for lastprivate loop counters.
Patch fixes bug with codegen for lastprivate loop counters. Also it may
improve performance for lastprivates calculations in some cases.

llvm-svn: 261209
2016-02-18 13:48:15 +00:00
Alexey Bataev
417089fc7e [OPENMP 4.5] Codegen support for data members in 'firstprivate' clause.
Added codegen for captured data members in non-static member functions.

llvm-svn: 261089
2016-02-17 13:19:37 +00:00
Alexey Bataev
3392d76081 [OPENMP] Improved handling of pseudo-captured expressions in OpenMP.
Expressions inside 'schedule'|'dist_schedule' clause must be captured in
combined directives to avoid possible crash during codegen. Patch
improves handling of such constructs

llvm-svn: 260954
2016-02-16 11:18:12 +00:00
Alexey Bataev
cd8b6a2cf1 [OPENMP] Remove extra sync barriers for 'firstprivate' clause.
Sync barrier will be emitted after generation of firstprivate variables
only if one of the firstprivate vars is used in lastprivate clause.

llvm-svn: 260877
2016-02-15 08:07:17 +00:00
Alexey Bataev
4244be25bd [OPENMP] Rename OMPCapturedFieldDecl to OMPCapturedExprDecl, NFC.
OMPCapturedExprDecl allows caopturing not only of fielddecls, but also
other expressions. It also allows to simplify codegen for several
clauses.

llvm-svn: 260492
2016-02-11 05:35:55 +00:00
Alexey Bataev
31300ed0a5 [OPENMP 4.0] Fixed support of array sections/array subscripts.
Codegen for array sections/array subscripts worked only for expressions with arrays as base. Patch fixes codegen for bases with pointer/reference types.

llvm-svn: 259776
2016-02-04 11:27:03 +00:00
Arpith Chacko Jacob
05bebb578a [OpenMP] Parsing + sema for target parallel for directive.
Summary:
This patch adds parsing + sema for the target parallel for directive along with testcases.

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D16759

llvm-svn: 259654
2016-02-03 15:46:42 +00:00
Arpith Chacko Jacob
e955b3d3fe [OpenMP] Parsing + sema for target parallel directive.
Summary:
This patch adds parsing + sema for the target parallel directive and its clauses along with testcases.

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D16553

Rebased to current trunk and updated test cases.

llvm-svn: 258832
2016-01-26 18:48:41 +00:00
Arpith Chacko Jacob
3cf89040b0 [OpenMP] Parsing + sema for defaultmap clause.
Summary:
This patch adds parsing + sema for the defaultmap clause associated with the target directive (among others).

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D16527

llvm-svn: 258817
2016-01-26 16:37:23 +00:00
Alexey Bataev
1189bd0205 [OPENMP 4.5] Allow arrays in 'reduction' clause.
OpenMP 4.5, alogn with array sections, allows to use variables of array type in reductions.

llvm-svn: 258804
2016-01-26 12:20:39 +00:00
Alexey Bataev
3015bcc62a [OPENMP] Generalize codegen for 'sections'-based directive.
If 'sections' directive has only one sub-section, the code for 'single'-based directive was emitted. Removed this codegen, because it causes crashes in different cases.

llvm-svn: 258495
2016-01-22 08:56:50 +00:00
Alexey Bataev
8524d15954 [OPENMP] Fix crash on reduction for complex variables.
reworked codegen for reduction operation for complex types to avoid crash

llvm-svn: 258394
2016-01-21 12:35:58 +00:00
Alexey Bataev
9619f04c0e [OPENMP 4.0] Fix for codegen of 'cancel' directive within 'sections' directive.
Allow to emit code for 'cancel' directive within 'sections' directive with single sub-section.

llvm-svn: 258307
2016-01-20 12:29:47 +00:00
Samuel Antao
7259076032 [OpenMP] Parsing + sema for "target exit data" directive.
Patch by Arpith Jacob. Thanks!

llvm-svn: 258177
2016-01-19 20:04:50 +00:00
Samuel Antao
df67fc468e [OpenMP] Parsing + sema for "target enter data" directive.
Patch by Arpith Jacob. Thanks!

llvm-svn: 258165
2016-01-19 19:15:56 +00:00
Carlo Bertolli
b4adf55e0f Add OpenMP dist_schedule clause to distribute directive and related regression tests.
llvm-svn: 257917
2016-01-15 18:50:31 +00:00
Samuel Antao
ee8fb302f5 [OpenMP] Reapply rL256842: [OpenMP] Offloading descriptor registration and device codegen.
This patch attempts to fix the regressions identified when the patch was committed initially. 

Thanks to Michael Liao for identifying the fix in the offloading metadata generation 
related with side effects in evaluation of function arguments. 
 

llvm-svn: 256933
2016-01-06 13:42:12 +00:00
Samuel Antao
7d5de9a1ee [OpenMP] Revert rL256842: [OpenMP] Offloading descriptor registration and device codegen.
It was causing two regression, so I'm reverting until the cause is found.

llvm-svn: 256858
2016-01-05 19:16:13 +00:00
Samuel Antao
4d5f0bbea1 [OpenMP] Offloading descriptor registration and device codegen.
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.

This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.

About offloading descriptor:

The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
 void *addr;
 char *name;
 int64_t size;
};
```  
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.

The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.

The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.


About target codegen:

The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}

!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives. 
Entry 3) a unique ID of the file where the input source file that contain the target region lives. 
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.

Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )

This patch replaces http://reviews.llvm.org/D12306.

Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao

Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits

Differential Revision: http://reviews.llvm.org/D12614

llvm-svn: 256842
2016-01-05 16:23:04 +00:00
Alexey Bataev
a6f2a14b94 [OPENMP 4.5] Codegen for 'schedule' clause with monotonic/nonmonotonic modifiers.
OpenMP 4.5 adds support for monotonic/nonmonotonic modifiers in 'schedule' clause. Add codegen for these modifiers.

llvm-svn: 256666
2015-12-31 06:52:34 +00:00
Alexey Bataev
6f531ec0a2 [OPENMP] Remove explicit call for implicit barrier
#pragma omp parallel needs an implicit barrier that is currently done by an explicit call to __kmpc_barrier. However, the runtime already ensures a barrier in __kmpc_fork_call which currently leads to two barriers per region per thread.
Differential Revision: http://reviews.llvm.org/D15561

llvm-svn: 255992
2015-12-18 10:24:53 +00:00
Alexey Bataev
8ef3141127 [OPENMP] Fix for http://llvm.org/PR25878: Error compiling an OpenMP program
OpenMP codegen tried to emit the code for its constructs even if it was detected as a dead-code. Added checks to ensure that the code is emitted if the code is not dead.

llvm-svn: 255990
2015-12-18 07:58:25 +00:00
Alexey Bataev
fc57d1601d [OPENMP 4.5] Codegen for 'hint' clause of 'critical' directive
OpenMP 4.5 defines 'hint' clause for 'critical' directive. Patch adds codegen for this clause.

llvm-svn: 255639
2015-12-15 10:55:09 +00:00
Alexey Bataev
28c75417b2 [OPENMP 4.5] Parsing/sema for 'hint' clause of 'critical' directive.
OpenMP 4.5 adds 'hint' clause to critical directive. Patch adds parsing/semantic analysis for this clause.

llvm-svn: 255625
2015-12-15 08:19:24 +00:00
Carlo Bertolli
6200a3d0f3 Add parse and sema of OpenMP distribute directive with all clauses except dist_schedule
llvm-svn: 255498
2015-12-14 14:51:25 +00:00
Alexey Bataev
33c56402d8 [OPENMP] Fix debug info for 'atomic' construct.
Debug info for statement under 'atomic' construct must point exactly to that statement, not the directive itself.

llvm-svn: 255487
2015-12-14 09:26:19 +00:00
NAKAMURA Takumi
2d5c6ddf74 Revert r255001, "Add parse and sema for OpenMP distribute directive and all its clauses excluding dist_schedule."
It causes memory leak. Some tests in test/OpenMP would fail.

llvm-svn: 255094
2015-12-09 04:35:57 +00:00
Alexey Bataev
382967a2e4 [OPENMP 4.5] Parsing/sema for 'num_tasks' clause.
OpenMP 4.5 adds directives 'taskloop' and 'taskloop simd'. These directives support clause 'num_tasks'. Patch adds parsing/semantic analysis for this clause.

llvm-svn: 255008
2015-12-08 12:06:20 +00:00