The new version of the OpenACC specification will allow the if clause on
the atomic directives. Allow it in `ACC.td` and update the parse node
and parser in flang to support it.
OpenACC dialect will need to be updated to support it as well.
In `makeMinMaxInitValGenerator` it explicitly checks for only
`FloatType` and `IntegerType`, so we shouldn't match if we don't have
either of those types.
Fix for #134308
* Enable lowering and conversion patterns to pass volatility information
from higher level operations to lower level ones.
* Enable codegen to pass volatility to LLVM dialect ops by setting an
attribute on loads, stores, and memory intrinsics.
* Add utilities for passing along the volatility from an input type to
an output type.
To introduce volatile types into the IR, entities with the volatile
attribute will be given a volatile type in the bridge; this is not
enabled in this patch. User code should not result in IR with volatile
types yet, so this patch contains no tests with Fortran source, only IR
that already contains volatile types.
Part 3 of #132486.
The OpenMP runtime needs the base address of the array section to
identify the dependency.
If we just put the vector subscript through the usual HLFIR expression
lowering, that would generate a new contiguous array representing the
values of the elements in the array which was sectioned. We cannot use
addresses from this array because these addresses would not match
dependencies on the original array. For example
```
integer :: array(1024)
integer :: indices(2)
indices(1) = 1
indices(2) = 100
!$omp task depend(out: array(1:512))
!$omp end task
!$omp task depend(in: array(indices))
!$omp end task
```
This requires taking the lowering path previously only used for ordered
assignments to get the address of the elements in the original array
which were indexed. This is done using `hlfir.elemental_addr`. e.g.
```
array(indices) = 2
```
`hlfir.elemental_addr` is awkward to use because it (by design) doesn't
return something like `!hlfir.expr<>` (like `hlfir.elemental`) and so it
can't have a generic lowering: each place it is used has to carefully
inline the contents of the operation and extract the needed address.
For this reason, `hlfir.elemental_addr` is not allowed outside of these
ordered assignments. In this commit I ignore this restriction so that I
can use `hlfir.elemental_addr` to lower the OpenMP DEPEND clause (this
works because the operation is inlined and removed before the verifier
runs).
One alternative solution would have been to provide my own more limited
re-implementation of `HlfirDesignatorBuilder` which skipped
`hlfir::elemental_addr`, instead inlining its body directly at the
current insertion point applying indices only for the first element.
This would have been difficult to maintain because designation in
Fortran is complex.
OpenACC declare statements are restricted from having having clauses
that reference assumed size arrays. It should be the case that we can
implement `deviceptr` and `present` clauses for assumed-size arrays.
This is a first step towards relaxing this restriction.
Note running flang on the following example results in an error in
lowering.
```
$ cat t.f90
subroutine vadd (a, b, c, n)
real(8) :: a(*), b(*), c(*)
!$acc declare deviceptr(a, b, c)
!$acc parallel loop
do i = 1,n
c(i) = a(i) + b(i)
enddo
end subroutine
$ flang -fopenacc -c t.f90
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":3:7): expect declare attribute on variable in declare operation
error: Lowering to LLVM IR failed
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): unsupported OpenACC operation: acc.private.recipe
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): LLVM Translation failed for operation: acc.private.recipe
error: failed to create the LLVM module
```
I would like to to share this code, because others are currently working
on the implementation of `deviceptr`, but it is obviously not running
end-to-end. I think the cleanest approach to this would be to put this
exception to the rule behind some feature flag, but I am not certain
what the precedence for that is.
Nearly, but not all, other compilers have a blanket prohibition against
the use of an INTENT(OUT) dummy argument in a specification expression.
Some compilers, however, permit an INTENT(OUT) dummy argument to appear
in a specification expression in a BLOCK construct or inner procedure
via host association.
The argument some have put forth to accept this usage comes from a
reading of 10.1.11 (specification expressions) in Fortran 2023 that, if
followed consistently, would also require host-associated OPTIONAL dummy
argument to be allowed. That would be dangerous for reasons that should
be obvious.
However, I can agree that a non-OPTIONAL dummy argument can't be assumed
to remain undefined on entry to a BLOCK construct or inner procedure, so
we can accept host-associated INTENT(OUT) in specification expressions
with a portability warning.
The logic for fixed form compiler directive line continuation has a hole
that can apply continuation for !$ even if the next line does not begin
with a fixed form comment character. Rearrange the nested if statements
to enforce that requirement for all compiler directives.
Recent work to better handle macro replacement in literal constant kind
suffixes isn't handling fixed form well, leading to a crash in Fujitsu
test 0113/0113_0073.F. The look-ahead needs to be done with the
higher-level prescanner functions that skip over fixed form comment
fields after column 72. Rework.
Currently we don't check for the presence of descriptor/BoxTypes before
emitting stores which lower to memcpys, the issue with this is that
users can have optional arguments, where they don't provide an input,
making the argument effectively null. This can still be mapped and this
causes issues at the moment as we'll emit a memcpy for function
arguments to store to a local variable for certain edge cases, when we
perform this memcpy on a null input, we cause a segfault at runtime.
The fix to this is to simply create a branch around the store that
checks if the data we're copying from is actually present. If it is, we
proceed with the store, if it isn't we skip it.
Fix Flang invocation in `Driver/do_concurrent_to_omp_cli.f90` test to
run compilation step only, to fix testing when building with
`-DFLANG_INCLUDE_RUNTIME=OFF`. The test is only concerned with warning
being emitted by the compiler, so there is no need to link the resulting
executable.
The PR is to generalize the re-use of the `compilerRT` code of adding
the path of `libflang_rt.runtime.a (so)` from AIX and LoP only to all
platforms via a new function `addFlangRTLibPath`.
It also added `-static-libflangrt` and `-shared-libflangrt` compiler
options to allow users choosing which `flang-rt` to link to. It defaults
to shared `flang-rt`, which is consistent with the linker behavior,
except on AIX, it defaults to static.
Also, PR #134320 exposed an issue in PR #131041 that the the overriding
`addFortranRuntimeLibs` is missing the link to `libquadmath`. This PR
also fixed that and restored the test case that PR #131041 broke.
This reverts commit 028429ac452acde227ae0bfafbfe8579c127e1ea and
1004fae222efeee215780c4bb4e64eb82b07fb4f.
These really need to be part of the compiler distribution. Bots are
relying on a nearly year old version to provide bitcode.
Similar to #135415.
The spec has not strict restriction to allow a single `if_present`
clause on the host_data and update directives. Allowing this clause
multiple times does not change the semantic of it. This patch relax the
rules in ACC.td since there is no restriction in the standard.
The OpenACC dialect represents the `if_present` clause with a `UnitAttr`
so the attribute will be set if the is one or more `if_present` clause.
The spec has not strict restriction to allow a single `finalize` clause
on the `exit data` directive. Allowing this clause multiple times does
not change the semantic of it. This patch relax the rules in `ACC.td`
since there is no restriction in the standard.
The OpenACC dialect represent the finalize clause with a UnitAttr so the
attribute will be set if the is one or more `finalize` clause.
Currently rocm-device-lib-path is not enabled for Flang, so when the
compiler warns / requests a user to provide this option in cases where
it can't find rocm a user cannot actually set the device libraries using
rocm-device-lib-path. The alternative rocm_path that's also mentioned
via the warning can be used, but we should enable both mentioned options
to not confuse users (and myself).
This patch defines `fir::SafeTempArrayCopyAttrInterface` and the
corresponding
OpenACC/OpenMP related attributes in FIR dialect. The actual
implementations
are just placeholders right now, and array repacking becomes a no-op
if `-fopenacc/-fopenmp` is used for the compilation.
See test case. When Fortran line continuation has been used, don't
insert spaces in -E formatted output to put things into the right
column, as this can break up a token.
Fixes https://github.com/llvm/llvm-project/issues/134986.
This patch switches the --experimental-debuginfo-iterators flag to be
stored to an otherwise unused cl-opt. This is a deliberate attempt to
break downstream tests that are relying on the use of debug intrinsics,
because they're imminently going away! If this commit breaks your tests,
please just revert the commit upstream, and then make contact with us
here:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
So that we can work out whether there's any further transition work
needed to support the move away from using debug intrinsics.
A recent patch that obviated the need to use -fopenmp when using the
compiler to preprocess in -E mode broke a case of Fortran line
continuation when using OpenMP conditional compilation lines (!$) when
*not* in -E mode. Fix.
Part two of merging #132486. Support volatility in fir ops.
* Introduce a new operation fir.volatile_cast, whose only purpose is to
add or take away the volatility of an SSA value's type. The types must
be otherwise identical, and any other type conversions must be handled
by fir.convert. fir.convert will give an error if the volatility of the
inputs does not match, such that all changes to volatility must be
handled explicitly through fir.volatile_cast.
* Add memory effects to ops that read from or write to memory. The
precedent for this comes from the LLVM dialect (feb7beaf70) where
llvm.load/store ops with the volatile attribute report read/write
effects to a generic memory resource. This change is similar in spirit
but different in two ways: the volatility of an operation is determined
by the type of its memref, not an attribute on the op, and the memory
effects of a load- or store-like operation on a volatile reference type
are reported against a particular memory resource,
`VolatileMemoryResource`. This is so MLIR optimizations are able to
reorder operations that are not volatile around operations that are,
which we believe more precisely models LLVM's volatile memory semantics.
@vzakhari suggested this in #132486 citing LangRef. See
https://llvm.org/docs/LangRef.html#volatile-memory-accesses
Changes needed to generate IR with volatile types are not included in
this change, so it should be non-functional, containing only the changes
to Fir ops and op utilities that will be needed once we enable lowering
to generate volatile types.
This patch fixes UnicodeDecodeError on Windows in test_errors.py. This
issue was observed on the flang-arm64-windows-msvc buildbot.
Semantics/OpenMP/interop-construct.f90 was crashing due to Python
defaulting to the cp1252 codec on Windows.
I have fixed this by explicitly setting encoding="utf-8" when reading
source files and invoking subprocess.run() in test_errors.py
flang-arm64-windows-msvc was running on stagging master which resulted
in this issue not being fixed earlier.
https://lab.llvm.org/staging/#/builders/206
The statement context is used for lowering clauses for openmp operations
using generalised helpers from flang lowering. The statement context
stores closures which generate code for cleaning up temporary values
generated by the lowering helper. These closures are run when the
statement construct is destroyed. Keeping the statement context local to
the clause or operation being lowered without any special handling was
not correct because any cleanup code would be generated at the insertion
point when that statement context went out of scope (which would in
general be inside of the newly created container operation). It would be
better to generate the cleanup code after the newly created operation
(clause processing is synchronous even for deferred tasks).
Currently supported clauses are mostly populated with simple scalar
values that require no cleanup. Even the simple array sections added by
#132994 needed no cleanup because indexing the right values of the array
did not create any temporaries. Supporting array sections with vector
indexing will generate hlfir.destroy operations for cleanup. This patch
fixes where those will be created. Those hlfir.destroy operations don't
generate any FIR (or LLVM) code, but the issue still exists
theoretically.
I wasn't able to find any clauses which have any cleanup to use to test
this PR. It is probably NFC for the current lowering. This will be
tested in [the PR adding vector subscripting of array
sections](https://github.com/llvm/llvm-project/pull/133892).
When a host associated `threadprivate` variable was used in a parallel
region with `default(none)` in an internal subroutine was failing,
because the compiler did not properly determine that the variable was
pre-determined `threadprivate` and thus should not have been reported as
missing a DSA.
Part one of merging #132486. Add support for representing volatility in
the type system for reference, box, and class types. Don't do anything
with volatile just yet, only support and test their representation and
utility functions.
The naming convention is a little goofy - `fir::isa_volatile_type` and
`fir::updateTypeWithVolatility` use different capitalization, but I put
them near similar functions and tried to match the surrounding
conventions and [the
docs](https://github.com/llvm/llvm-project/blob/main/flang/docs/C%2B%2Bstyle.md#naming)
best I could.
The test is checking output from MLIR debug prints. MLIR passes can be
executed in parallel, for example a pass on func.func might schedule
different func.func operations in different threads. This led to
intermittent test failures where debug output from different threads
became mixed up.
Fix by disabling mlir multithreading for this test.
Delete duplicated creation of hlfir.declare op of do concurrent
induction variables when inside cuf kernel directive.
Obtain the correct hlfir.declare op generated from bindSymbol, and add
it to ivValues.
Allocatable or pointer module variables with the CUDA managed attribute
are defined with a double descriptor. One on the host and one on the
device. Only the data pointed to by the descriptor will be allocated in
managed memory.
Allow the registration of any allocatable or pointer module variables
like device or constant.
Implement extended intrinsic PUTENV, both function and subroutine forms.
Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and
semantic unit tests.