Similar to #135415.
The spec has not strict restriction to allow a single `if_present`
clause on the host_data and update directives. Allowing this clause
multiple times does not change the semantic of it. This patch relax the
rules in ACC.td since there is no restriction in the standard.
The OpenACC dialect represents the `if_present` clause with a `UnitAttr`
so the attribute will be set if the is one or more `if_present` clause.
The current implementation of OpenACC lowering includes explicit
expansion of following cases:
- Creation of `acc.bounds` operations for all arrays, including those
whose dimensions are captured in the type (eg `!fir.array<100xf32>`)
- Expansion of box types by only putting the box's address in the data
clause. The address was extracted with a `fir.box_addr` operation and
the bounds were filled with `fir.box_dims` operation.
However, with the creation of the new type interface `MappableType`, the
idea is that specific type-based semantics can now be used. This also
really simplifies representation in the IR. Consider the following
example:
```
subroutine sub(arr)
real :: arr(:)
!$acc enter data copyin(arr)
end subroutine
```
Before the current PR, the relevant acc dialect IR looked like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
...
%1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%2:3 = fir.box_dims %1#0, %c0 : (!fir.box<!fir.array<?xf32>>, index)
-> (index, index, index)
%c0_0 = arith.constant 0 : index
%3 = arith.subi %2#1, %c1 : index
%4 = acc.bounds lowerbound(%c0_0 : index) upperbound(%3 : index)
extent(%2#1 : index) stride(%2#2 : index) startIdx(%c1 : index)
{strideInBytes = true}
%5 = fir.box_addr %1#0 : (!fir.box<!fir.array<?xf32>>) ->
!fir.ref<!fir.array<?xf32>>
%6 = acc.copyin varPtr(%5 : !fir.ref<!fir.array<?xf32>>) bounds(%4) ->
!fir.ref<!fir.array<?xf32>> {name = "arr", structured = false}
acc.enter_data dataOperands(%6 : !fir.ref<!fir.array<?xf32>>)
```
After the current change, it looks like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
...
%1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
%2 = acc.copyin var(%1#0 : !fir.box<!fir.array<?xf32>>) ->
!fir.box<!fir.array<?xf32>> {name = "arr", structured = false}
acc.enter_data dataOperands(%2 : !fir.box<!fir.array<?xf32>>)
```
Restoring the old behavior can be done with following command line
options:
`--openacc-unwrap-fir-box=true --openacc-generate-default-bounds=true`
As long as the data clause operations are not tightly
"associated" with the compute/data operations (e.g.
they can be optimized as SSA producers and made block
arguments), the information about the original async()
clause should be attached to the data clause operations
to make it easier to generate proper runtime actions
for them. This change propagates the async() information
from the OpenACC data/compute constructs to the data clause
operations. This change also adds the CurrentDeviceIdResource
to guarantee proper ordering of the operations that read
and write the current device identifier.
When the new `acc.loop` design was introduced some of the loop
information like `gang`/`vector`/`worker` were also updated to support
`device_type`.
With a conflict in parsing/printing, the keyword only value for
`async`/`gang`/`vector`/`worker` were printed/parsed with an empty set
of parenthesis `()`. To make the IR clearer to read and similar across
the operations, the loop control part of is now prefixed by `control`
and this allow to remove the need of the empty `()`.
`getDataOperandBaseAddr` retrieve the address of a value when we need to
generate bound operations. When switching to HLFIR, we did not really
handle the fact that this value was then pointing to the result of a
hlfir.declare. Because of that the `#1` value was being used. `#0` value
is carrying the correct information about lowerbounds and should be
used. This patch updates the `getDataOperandBaseAddr` function to use
the correct result value from hlfir.declare.
- Support wait(devnum: ) with device_type support on all operations that
require it
- devnum value is stored as the first value of waitOperands in its
device_type sub-segment. The hasWaitDevnum attribute inform which
sub-segment has a wait(devnum) value.
- Make async/wait information homogenous on compute ops, data and update
op.
- Unify operands/attributes names across operations and use the same
custom parser/printer
Patch 2/3 of the transition step 1 described in
https://discourse.llvm.org/t/rfc-enabling-the-hlfir-lowering-by-default/72778/7.
All the modified tests are still here since coverage for the direct
lowering to FIR was still needed while it was default. Some already have
an HLFIR version, some have not and will need to be ported in step 2
described in the RFC.
Note that another 147 lit tests use -emit-fir/-emit-llvm outputs but do
not need a flag since the HLFIR/no HLFIR output is the same for what is
being tested.
Update Fortran OpenACC tests to use the updated data clause
attribute format for better readability.
Depends on D155605
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D155603
Self clause is the same same as the host clause. Lower it
in a simmilar way.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D150174
The `if_present` clause is modeled as an attribute on the
acc.update operation. The lowering was not adding correctly the attribute
when the clause was present. This patch update the lowering to add
the attribute when needed.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D150171
Data operands associated with acc.update should comes
from acc data entry/exit operations or acc.getdeviceptr.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D149990
Update OpenACC update construct lowering to create
the data operand operations for host and device clauses.
Depends on D149909
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D149910
This patch adds lowering for the `!$acc update`
from the PFT to OpenACC dialect.
This patch is part of the upstreaming effort from fir-dev branch.
Depends on D122387
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D122396