These three clauses are all quite trivial, as they take no parameters.
They are mutually exclusive, and 'seq' has some other exclusives that
are implemented here.
The ONE thing that isn't implemented is 2.9's restriction (line 2010):
'A loop associated with a 'loop' construct that does not have a 'seq'
clause must be written to meet all the following conditions'.
Future clauses will require similar work, so it'll be done as a
followup.
This patch implements the 'loop' construct AST, as well as the basic
appertainment rule. Additionally, it sets up the 'parent' compute
construct, which is necessary for codegen/other diagnostics.
A 'loop' can apply to a for or range-for loop; beyond that, the construct
itself has no other restrictions (though some of its clauses do).
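As a rough illustration of the appertainment rule, the sketch below puts a 'loop' construct on an ordinary for loop inside a 'parallel' region. The function name `saxpy` and its parameters are made up for this example. A compiler built without OpenACC support ignores the pragmas and runs the loop serially.

```cpp
#include <cassert>

// Hypothetical helper for illustration: y[i] += a * x[i].
void saxpy(float a, const float* x, float* y, int n) {
#pragma acc parallel copyin(x[0:n]) copy(y[0:n])
  {
    // 'loop' has to be followed by a for (or range-for) loop; putting it in
    // front of any other statement is what the appertainment rule rejects.
#pragma acc loop
    for (int i = 0; i < n; ++i)
      y[i] += a * x[i];
  }
}
```

The enclosing 'parallel' is the 'parent' compute construct of this 'loop'.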
This patch improves the preservation of qualifiers and reduces the loss
of type sugar in TemplateNames.
This problem is analogous to https://reviews.llvm.org/D112374 and this
patch takes a very similar approach to that patch, except the impact
here is much smaller.
When a TemplateName was written bare, without qualifications, we
wouldn't produce a QualifiedTemplate which could be used to disambiguate
it from a Canonical TemplateName. This had effects in the TemplateName
printer, which had workarounds to deal with this, and wouldn't print the
TemplateName as-written in most situations.
There are also some related fixes to help preserve this type sugar along
the way into diagnostics, so that this patch can be properly tested.
- Fix dropping the template keyword.
- Fix type deduction to preserve sugar in TST TemplateNames.
This improves and unifies our approach to printing all template
arguments.
The same approach to printing types is extended to all
TemplateArguments: A sugared version is printed in quotes, followed by
printing the canonical form, unless they would print the same.
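The printing rule can be sketched in a few lines. This is only a model: it assumes both spellings are already available as strings, and the `(aka ...)` shape is borrowed from how types are shown in diagnostics. The name `printWithAka` is hypothetical and is not part of Clang.

```cpp
#include <cassert>
#include <string>

// Illustrative model: quote the sugared spelling, then append the canonical
// spelling only when it would print differently.
std::string printWithAka(const std::string& sugared,
                         const std::string& canonical) {
  std::string out = "'" + sugared + "'";
  if (sugared != canonical)
    out += " (aka '" + canonical + "')";
  return out;
}
```

With this model, an alias and its underlying type print as two forms, while a spelling with no sugar prints once.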
Special improvements are done to add more detail to template template
arguments.
It's planned in a future patch to use this improved TemplateName printer
for other places besides TemplateArguments.
Note: The sugared/desugared printing does not show up for TemplateNames
in tests yet, because we do a poor job of preserving their type sugar.
This will be improved in a future patch.
'reduction' has a few restrictions over normal 'var-list' clauses:
1- On parallel, a num_gangs can only have 1 argument when combined with
reduction. The two cannot be combined on any of the other compute
constructs, however.
2- The vars must all be 'numerical data types' of some sort, or a
'composite of numerical data types'. A list of types is given in the
standard as a minimum, so we choose 'isScalar', which covers all of
these types and keeps types that are actually numeric. Other compilers
don't seem to implement the 'composite of numerical data types', though
we do.
3- Because of the above restrictions, member-of-composite is not
allowed, so any access via a memberexpr is disallowed. Array-element and
sub-arrays (aka array sections) are both permitted, so long as they meet
the requirements of #2.
This patch implements all of these for compute constructs.
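A small sketch of a use that satisfies the three rules above. The function `sumArray` is invented for this example. Without OpenACC enabled the pragma is ignored and the loop simply runs on the host.

```cpp
#include <cassert>

// Hypothetical example: sum an array with a 'reduction' on 'parallel'.
double sumArray(const double* data, int n) {
  double sum = 0.0;
  // Rule 1: num_gangs has a single argument because 'reduction' is present.
  // Rule 2: 'sum' is a scalar numeric variable.
  // Rule 3: something like reduction(+ : stats.total) would be rejected,
  //         since it names a member of a composite.
#pragma acc parallel num_gangs(1) reduction(+ : sum) copyin(data[0:n])
  {
    for (int i = 0; i < n; ++i)
      sum += data[i];
  }
  return sum;
}
```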
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to. As we don't have a source for the valid
architectures yet, this patch just accepts all.
Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.
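For illustration, a sketch of how a clause placed after device_type is scoped to the named architecture. The architecture name `nvidia` and the function `scaleInPlace` are assumptions made for this example; as noted above, any identifier is accepted for now.

```cpp
#include <cassert>

// Hypothetical example: multiply every element of x by factor.
void scaleInPlace(float* x, int n, float factor) {
  // 'copy' comes before device_type, so it applies everywhere.
  // 'vector_length' comes after it, so it applies only to 'nvidia'.
  // A data clause such as 'copy' would not be allowed after device_type.
#pragma acc parallel copy(x[0:n]) device_type(nvidia) vector_length(128)
  {
#pragma acc loop
    for (int i = 0; i < n; ++i)
      x[i] *= factor;
  }
}
```

The same line could be written with the shorter `dtype(nvidia)` spelling.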
This reverts commit 06f04b2e27f2586d3db2204ed4e54f8b78fea74e.
This reapplies commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.
The build failures were caused by the patch depending on the order of
evaluation of arguments to a function. This reapplication separates out
the capture of one of the values.
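The class of bug can be shown with a generic sketch (this is not the code from the patch, and `Counter`, `makePair`, `fragile` and `robust` are names invented here). The order in which function arguments are evaluated is unspecified in C++, so two side-effecting arguments can run in either order depending on the compiler.

```cpp
#include <cassert>
#include <utility>

struct Counter {
  int value = 0;
  int next() { return ++value; }
};

std::pair<int, int> makePair(int first, int second) {
  return {first, second};
}

// Fragile: the two next() calls may be evaluated in either order, so the
// result differs between compilers.
std::pair<int, int> fragile(Counter& c) {
  return makePair(c.next(), c.next());
}

// Robust: capture one value in a local first, which fixes the order.
std::pair<int, int> robust(Counter& c) {
  int first = c.next();
  return makePair(first, c.next());
}
```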
This reverts commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.
This and the followup patch keep hitting an assert I wrote on the build
bots in a way that isn't clear. Reverting so I can fix it without a
rush.
'wait' takes a few int-exprs (well, a series of async-arguments, but
those are effectively just an int-expr), plus a pair of tags. This
patch adds the support for this to the AST, and does the appropriate
semantic analysis for them.
This is a pretty simple clause: it takes an 'async-argument', which
effectively needs to be parsed as just an 'int' argument, since it can
be an arbitrary integer at runtime (and negative values are legal for
implementation-defined values).
This patch also cleans up the async-argument parsing, so 'wait' got some
minor quality-of-life improvements for parsing (both clause and
construct).
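A sketch of how the two clauses fit together, assuming queue number 1 is free to use. The function `doubleThenIncrement` is invented for this example. Without OpenACC the pragmas are ignored and both loops run in order on the host.

```cpp
#include <cassert>

// Hypothetical example: x[i] = 2 * x[i] + 1, split over two regions.
void doubleThenIncrement(int* x, int n) {
  // The async-argument is an ordinary integer expression naming a queue.
#pragma acc parallel copy(x[0:n]) async(1)
  {
#pragma acc loop
    for (int i = 0; i < n; ++i)
      x[i] *= 2;
  }
  // 'wait' holds this region back until queue 1 has drained. The tagged
  // form would look like wait(devnum: 0 : queues: 1). There is no 'async'
  // here, so the host blocks until the region completes.
#pragma acc parallel copy(x[0:n]) wait(1)
  {
#pragma acc loop
    for (int i = 0; i < n; ++i)
      x[i] += 1;
  }
}
```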
These two are very similar to the other 'var-list' variants, except they
require that the type of the variable be a pointer. This patch
implements that restriction.
Like 'copy', these also have alternate names, so this implements that as
well. Additionally, these have an optional tag of either 'readonly' or
'zero' depending on the clause.
Otherwise, this is a pretty rote implementation of the clause, as there
aren't any special rules for it.
Like present, no_create, and firstprivate, copy is a clause that takes
just a var-list, and follows the same rules as the others.
The one unique part of this clause is that it ALSO supports two
deprecated/backwards-compatibility spellings, so this patch adds them
and implements them.
The private clause is the first that takes a 'var-list', thus this has a
lot of additional work to enable the var-list type. A 'var' is a
traditional variable reference, subscript, member-expression, or
array-section, so checking of these is pretty minor.
Note: This ran into some issues with array-sections (aka sub-arrays)
that will be fixed in a follow-up patch.
num_gangs takes an 'int-expr-list', for 'parallel', and an 'int-expr'
for 'kernels'. This patch changes the parsing to always parse it as an
'int-expr-list', then correct the expression count during Sema. It also
implements the rest of the semantic analysis changes for this clause.
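The "parse a list, then check the count" approach can be modelled as below. This is a sketch and not Clang's code: the names are hypothetical, and the limit of three arguments on 'parallel' is an assumption taken from recent OpenACC revisions rather than from this message.

```cpp
#include <cassert>
#include <cstddef>

enum class Construct { Parallel, Kernels };

// Hypothetical Sema-style check, run after the clause has been parsed as an
// int-expr-list of argCount expressions.
bool numGangsCountIsValid(Construct construct, std::size_t argCount) {
  if (argCount == 0)
    return false;  // at least one int-expr is required
  // Assumption: 'parallel' allows up to 3 arguments, 'kernels' exactly 1.
  const std::size_t maxArgs = (construct == Construct::Parallel) ? 3 : 1;
  return argCount <= maxArgs;
}
```

Parsing uniformly and validating later keeps the parser simple and lets the diagnostic mention the construct the clause was written on.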
The 'vector_length' clause is semantically identical to the
'num_workers' clause, in that it takes a mandatory single int-expr. This
is implemented identically to it.
`self` clauses on compute constructs take an optional condition
expression. We again limit the implementation to ONLY compute constructs
to ensure we get all the rules correct for others. However, this one
will be particularly complicated, as it takes a `var-list` for `update`,
so when we get to that construct/clause combination, we need to do that
as well.
This patch also furthers uses of the `OpenACCClauses.def` as it became
useful while implementing this (as well as some other minor refactors as
I went through).
Finally, `self` and `if` clauses interact with each other: if
an `if` clause evaluates to `true`, the `self` clause has no effect.
While this is intended and can be used 'meaningfully', we are warning on
this with a very granular warning, so that this edge case will be
noticed by newer users, but can be disabled trivially.
Like with the 'default' clause, this is being applied to only Compute
Constructs for now. The 'if' clause takes a condition expression which
is used as a runtime value.
This is not a particularly complex semantic implementation, as there
isn't much to this clause, other than its interactions with 'self',
which will be managed in the patch to implement that.
As a followup to my previous commits, this is an implementation of a
single clause, in this case the 'default' clause. This implements all
semantic analysis for it on compute constructs, and continues to leave it
rejected for all others (some as 'doesn't appertain', others as 'not
implemented' as appropriate).
This also implements and tests the TreeTransform as requested in the
previous patch.
This fixes some problems wrt dependence of captures in lambdas with
an explicit object parameter.
[temp.dep.expr] states that
> An id-expression is type-dependent if [...] its terminal name is
> - associated by name lookup with an entity captured by copy
> ([expr.prim.lambda.capture]) in a lambda-expression that has
> an explicit object parameter whose type is dependent [dcl.fct].
There were several issues with our implementation of this:
1. we were treating by-reference captures as dependent rather than
by-value captures;
2. tree transform wasn't checking whether referring to such a
by-value capture should make a DRE dependent;
3. when checking whether a DRE refers to such a by-value capture, we
were only looking at the immediately enclosing lambda, and not
at any parent lambdas;
4. we also forgot to check for implicit by-value captures;
5. lastly, we were attempting to determine whether a lambda has an
explicit object parameter by checking the `LambdaScopeInfo`'s
`ExplicitObjectParameter`, but it seems that that simply wasn't
set (yet) by the time we got to the check.
All of these should be fixed now.
This fixes #70604, #79754, #84163, #84425, #86054, #86398, and #86399.
As a first step in adding clause support for OpenACC to Semantic
Analysis, this patch adds the 'base' AST nodes required for clauses.
This patch has no functional effect at the moment, but followup patches
will add the semantic analysis of clauses (plus individual clauses).
'serial', 'parallel', and 'kernels' constructs are all considered
'Compute' constructs. This patch creates the AST type, plus the required
infrastructure for such a type, plus some base types that will be useful
in the future for breaking this up.
The only difference between the three is the 'kind' (plus some minor
clause legalization rules, but those can be differentiated easily
enough), so rather than representing them as separate AST nodes, it
seems to make sense to make them the same.
Additionally, no clause AST functionality is being implemented yet, as
that fits better in a separate patch, and this is enough to get the
'naked' constructs implemented.
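The design choice of one node type with a 'kind' can be sketched as follows. This is a simplified model with hypothetical names (`ComputeKind`, `ComputeConstruct`), not the actual AST class.

```cpp
#include <cassert>
#include <string>

enum class ComputeKind { Parallel, Serial, Kernels };

// One node type for all three constructs; only the kind differs.
class ComputeConstruct {
public:
  explicit ComputeConstruct(ComputeKind kind) : kind_(kind) {}

  ComputeKind kind() const { return kind_; }

  // Spelling of the directive as it appears after '#pragma acc'.
  std::string spelling() const {
    switch (kind_) {
    case ComputeKind::Parallel: return "parallel";
    case ComputeKind::Serial:   return "serial";
    case ComputeKind::Kernels:  return "kernels";
    }
    return "";
  }

private:
  ComputeKind kind_;
};
```

Code that needs per-construct behaviour, such as clause legalization, can branch on `kind()` instead of on the node's dynamic type.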
This is otherwise an 'NFC' patch, as it doesn't alter execution at all,
so there aren't any tests. I did this to break up the review workload
and to get feedback on the layout.
The ability to dump AST nodes is important to ad-hoc debugging, and
the fact this doesn't work with TypeLoc nodes is an obvious missing
feature in e.g. clang-query (`set output dump` simply does nothing).
Having TypeLoc::dump(), and enabling DynTypedNode::dump() for such nodes
seems like a clear win.
It looks like this:
```
int main(int argc, char **argv);
FunctionProtoTypeLoc <test.cc:3:1, col:31> 'int (int, char **)' cdecl
|-ParmVarDecl 0x30071a8 <col:10, col:14> col:14 argc 'int'
| `-BuiltinTypeLoc <col:10> 'int'
|-ParmVarDecl 0x3007250 <col:20, col:27> col:27 argv 'char **'
| `-PointerTypeLoc <col:20, col:26> 'char **'
| `-PointerTypeLoc <col:20, col:25> 'char *'
| `-BuiltinTypeLoc <col:20> 'char'
`-BuiltinTypeLoc <col:1> 'int'
```
It dumps the lexically nested tree of type locs.
This often looks similar to how types are dumped, but unlike types
we don't look at desugaring e.g. typedefs, as their underlying types
are not lexically spelled here.
---
Less clear is exactly when to include these nodes in existing text AST
dumps rooted at (TranslationUnit)Decls.
These already omit supported nodes sometimes, e.g. NestedNameSpecifiers
are often mentioned but not recursively dumped.
TypeLocs are a more extreme case: they're ~always more verbose
than the current AST dump.
So this patch punts on that, TypeLocs are only ever printed recursively
as part of a TypeLoc::dump() call.
It would also be nice to be able to invoke `clang` to dump a typeloc
somehow, like `clang -cc1 -ast-dump`. But I don't know exactly what the
best version of that is, so this patch doesn't do it.
---
There are similar (less critical!) nodes: TemplateArgumentLoc etc,
these also don't have dump() functions today and are obvious extensions.
I suspect that we should add these, and Loc nodes should dump each other
(e.g. the ElaboratedTypeLoc `vector<int>::iterator` should dump
the NestedNameSpecifierLoc `vector<int>::`, which dumps the
TemplateSpecializationTypeLoc `vector<int>` etc).
Maybe this generalizes further to a "full syntactic dump" mode, where
even Decls and Stmts would print the TypeLocs they lexically contain.
But this may be more complex than useful.
---
While here, ConceptReference JSON dumping must be implemented. It's not
totally clear to me why this implementation wasn't required before but
is now...
This patch dumps the rewritten sub-expressions in `CXXDefaultArgExpr` and
`CXXDefaultInitExpr`.
This machinery is useful for checking whether the materialized
temporaries are lifetime-extended in the sub-AST of `CXXDefaultArgExpr`
(`CXXDefaultInitExpr` is not lifetime-extended yet).
Signed-off-by: yronglin <yronglin777@gmail.com>
Test updated to expect i8 gep.
Original message:
This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.
A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can
only be used with a vector length that guarantees at least 8 bits.
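The arithmetic behind that restriction can be sketched as below, assuming a vboolN_t holds VLEN / N mask bits for a fixed vector length VLEN. The helper names are hypothetical.

```cpp
#include <cassert>

// Assumption: vboolN_t carries vlenBits / n one-bit mask elements.
constexpr unsigned maskBits(unsigned vlenBits, unsigned n) {
  return vlenBits / n;
}

// The char-vector representation needs at least one whole byte.
constexpr bool isRepresentable(unsigned vlenBits, unsigned n) {
  return maskBits(vlenBits, n) >= 8;
}

// Number of char elements: 1/8 of the number of mask bits.
constexpr unsigned storageBytes(unsigned vlenBits, unsigned n) {
  return maskBits(vlenBits, n) / 8;
}
```

Under this model vbool64_t at VLEN=128 has only 2 mask bits and cannot be represented, while at VLEN=512 it has 8 bits and occupies one byte.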
This patch converts `InlineCommandComment::RenderKind` to a scoped enum at namespace scope, making it eligible for forward declaring. This is useful for e.g. annotating bit-fields with `preferred_type`.
This patch converts `LinkageSpecDecl::LanguageIDs` into a scoped enum, and moves it to namespace scope, so that it can be forward-declared where required.
This patch moves `OMPDeclareReductionDecl::InitKind` to DeclBase.h, so that it's complete at the point where the corresponding bit-field is declared. This patch also converts it to a scoped enum named `OMPDeclareReductionInitKind`.
This patch moves `ArraySizeModifier` before the `Type` declaration so that it's complete at the `ArrayTypeBitfields` declaration. It's also converted to a scoped enum along the way.
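The reason these conversions help can be shown with a small standalone sketch. The names `RenderKind` and `CommentBits` and the enumerator list are illustrative only. A scoped enum at namespace scope has a fixed underlying type, so it can be declared before its enumerators are known.

```cpp
#include <cassert>

// Opaque declaration: legal because a scoped enum's underlying type is
// fixed (int by default). A nested unscoped enum could not be declared here.
enum class RenderKind;

struct CommentBits {
  // Raw bit-field storage. In Clang a bit-field like this could also be
  // annotated with preferred_type naming the enum.
  unsigned Kind : 3;

  void setKind(RenderKind kind);
  RenderKind getKind() const;
};

// The full definition can live in a different header.
enum class RenderKind { Normal, Bold, Monospaced, Emphasized, Anchor };

inline void CommentBits::setKind(RenderKind kind) {
  Kind = static_cast<unsigned>(kind);
}

inline RenderKind CommentBits::getKind() const {
  return static_cast<RenderKind>(Kind);
}
```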
These are an artifact of how types are structured but serve little
purpose, merely showing that the type is sugared in some way. For
example, ElaboratedType's existence means struct S gets printed as
'struct S':'struct S' in the AST, which is unnecessary visual clutter.
Note that skipping the second print when the types have the same string
matches what we do for diagnostics, where the aka will be skipped.
The dump() is not actually included recursively in any other nodes' dump,
as this is too verbose (similar to NNS) but useful in its own right.
It's unfortunate to not have the actual tests yet, but the DynTypedNode
tests are matcher-based and adding matchers is a larger task than
DynTypedNode support (but can't be done first).
(I've got a clangd change stacked on this that uses DynTypedNode and
dump(), and both work. I'll send a change for matchers next).
Differential Revision: https://reviews.llvm.org/D159300