Summary:
According to OpenMP 5.0, nonmonotonic modifier can be used with all
schedule kinds, not only dynamic and guided as in OpenMP 4.5.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82026
Summary:
The OpenMP loops are normalized and transformed into the loops from 0 to
max number of iterations. In some cases, original scheme may lead to
overflow during calculation of number of iterations. If it is unknown,
if we can end up with overflow or not (the bounds are not constant and
we cannot define if there is an overflow), cast original type to the
unsigned.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, openmp-commits, cfe-commits, caomhin
Tags: #clang, #openmp
Differential Revision: https://reviews.llvm.org/D81881
Summary:
According to OpenMP, During execution of an iteration of a worksharing-loop or a loop nest within a worksharing-loop, simd, or worksharing-loop SIMD region, a thread must not execute more than one ordered region corresponding to an ordered construct without a depend clause.
Need to report an error in this case.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81951
Summary:
Added codegen for use_device_addr clause. The components of the list
items are mapped as a kind of RETURN components and then the returned
base address is used instead of the real address of the base declaration
used in the use_device_addr expressions.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80730
Summary:
If the array subscript expression is type depent, its analysis must be
delayed before its instantiation.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, caomhin, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78637
Summary:
Diagnostic is emitted if some declaration of unsupported type
declaration is used inside device code.
Memcpy operations for structs containing member with unsupported type
are allowed. Fixed crash on attempt to emit diagnostic outside of the
functions.
The approach is generalized between SYCL and OpenMP.
CUDA/OMP deferred diagnostic interface is going to be used for SYCL device.
Reviewers: rsmith, rjmccall, ABataev, erichkeane, bader, jdoerfert, aaron.ballman
Reviewed By: jdoerfert
Subscribers: guansong, sstefan1, yaxunl, mgorny, bader, ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74387
Summary:
With recovery expr, it is possible that we have a value-dependent expr
within non-dependent context.
Reviewers: sammccall, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80200
Summary:
The BFloat IR type is introduced to provide support for, initially, the BFloat16
datatype introduced with the Armv8.6 architecture (optional from Armv8.2
onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE
754 floating point IR type.
This is part of a patch series upstreaming Armv8.6 features. Subsequent patches
will upstream intrinsics support and C-lang support for BFloat.
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau
Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78190
Summary:
Predefined allocators should not be mapped at all (they are just enumeric
constants). FOr user-defined allocators need to map the traits only as
firstprivates, the allocator itself is private.
At the beginning of the target region the user-defined allocatores must
be created and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```
Reviewers: jdoerfert, aaron.ballman
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79257
Summary:
omp.h header file defines omp_null_allocator as a predefined allocator,
need to consider it also as a predefined allocator.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79186
Summary:
Added basic support for 'task' modifier in the reduction clauses in
non-simd parallel and worksharing constructs.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78738
Summary:
The global variable should be captured in the region only if it was
privitized in the region or in any of the outer regions. Otherwise, it
should not be captured.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77731
Summary:
According to the standard, variable-category is the optional part of the
defaultmap clause while the compiler always requires it. Turned it into
optional part.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77751
By default, all traits in the OpenMP context selector have to match for
it to be acceptable. Though, we sometimes want a single property out of
multiple to match (=any) or no match at all (=none). We offer these
choices as extensions via
`implementation={extension(match_{all,any,none})}`
to the user. The choice will affect the entire context selector not only
the traits following the match property.
The first user will be D75788. There we can replace
```
#pragma omp begin declare variant match(device={arch(nvptx64)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
// TODO: Hack until we support an extension to the match clause that allows "or".
#undef __CLANG_CUDA_CMATH_H__
#undef __CUDA__
#pragma omp end declare variant
#pragma omp begin declare variant match(device={arch(nvptx)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
with the much simpler
```
#pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D77414
If we have a function definition in `omp begin/end declare variant` it
is a specialization of a base function with the same name and
"compatible" type. Before, we just created a declaration for the base.
With this patch we try to find an existing declaration first and only
create a new one if we did not find any with a compatible type. This is
preferable as we can tolerate slight mismatches, especially if the
specialized version is "more constrained", e.g., constexpr.
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D77252
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D77112
See rational here: https://reviews.llvm.org/D76173#1922916
Time to compile Attr.h in isolation goes from 2.6s to 1.8s.
Original patch by Johannes, plus some additions from Reid to fix some
clang tooling targets.
Effect on transitive includes is marginal, though:
$ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \
| grep '^[-+] ' | sort | uniq -c | sort -nr
104 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/AST/OpenMPClause.h
87 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Frontend/OpenMP/OMPContext.h
19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SmallSet.h
19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SetVector.h
14 - /usr/include/c++/9/set
...
Differential Revision: https://reviews.llvm.org/D76184
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Differential Revision: https://reviews.llvm.org/D77112
Summary:
Added basic representation and parsing/sema handling of array-shaping
operations. Array shaping expression is an expression of form ([s0]..[sn])base,
where s0, ..., sn must be a positive integer, base - a pointer. This
expression is a kind of cast operation that converts pointer expression
into an array-like kind of expression.
Reviewers: rjmccall, rsmith, jdoerfert
Subscribers: guansong, arphaman, cfe-commits, caomhin, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74144
This is the second part loosely extracted from D71179 and cleaned up.
This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.
A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].
The logic is as follows:
If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
1) Create a function declaration for the definition we were about to generate.
2) Create a function definition but with a mangled name (according to
`<SELECTOR>`).
3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
one used already for `omp declare variant`, using and the mangled
function definition as specialization for the context defined by
`<SELECTOR>`.
When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman
Subscribers: bollu, guansong, openmp-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75779
This is the first part extracted from D71179 and cleaned up.
This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].
A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D74941
region.
According to OpenMP 5.0, exactly one scan directive must appear in the loop body of an enclosing worksharing-loop, worksharing-loop SIMD, or simd construct on which a reduction clause with the inscan modifier is present.