A recent patch that obviated the need to use -fopenmp when using the
compiler to preprocess in -E mode broke a case of Fortran line
continuation when using OpenMP conditional compilation lines (!$) when
*not* in -E mode. Fix.
Part two of merging #132486. Support volatility in fir ops.
* Introduce a new operation fir.volatile_cast, whose only purpose is to
add or take away the volatility of an SSA value's type. The types must
be otherwise identical, and any other type conversions must be handled
by fir.convert. fir.convert will give an error if the volatility of the
inputs does not match, such that all changes to volatility must be
handled explicitly through fir.volatile_cast.
* Add memory effects to ops that read from or write to memory. The
precedent for this comes from the LLVM dialect (feb7beaf70) where
llvm.load/store ops with the volatile attribute report read/write
effects to a generic memory resource. This change is similar in spirit
but different in two ways: the volatility of an operation is determined
by the type of its memref, not an attribute on the op, and the memory
effects of a load- or store-like operation on a volatile reference type
are reported against a particular memory resource,
`VolatileMemoryResource`. This is so MLIR optimizations are able to
reorder operations that are not volatile around operations that are,
which we believe more precisely models LLVM's volatile memory semantics.
@vzakhari suggested this in #132486 citing LangRef. See
https://llvm.org/docs/LangRef.html#volatile-memory-accesses
Changes needed to generate IR with volatile types are not included in
this change, so it should be non-functional, containing only the changes
to Fir ops and op utilities that will be needed once we enable lowering
to generate volatile types.
The statement context is used for lowering clauses for openmp operations
using generalised helpers from flang lowering. The statement context
stores closures which generate code for cleaning up temporary values
generated by the lowering helper. These closures are run when the
statement construct is destroyed. Keeping the statement context local to
the clause or operation being lowered without any special handling was
not correct because any cleanup code would be generated at the insertion
point when that statement context went out of scope (which would in
general be inside of the newly created container operation). It would be
better to generate the cleanup code after the newly created operation
(clause processing is synchronous even for deferred tasks).
Currently supported clauses are mostly populated with simple scalar
values that require no cleanup. Even the simple array sections added by
#132994 needed no cleanup because indexing the right values of the array
did not create any temporaries. Supporting array sections with vector
indexing will generate hlfir.destroy operations for cleanup. This patch
fixes where those will be created. Those hlfir.destroy operations don't
generate any FIR (or LLVM) code, but the issue still exists
theoretically.
I wasn't able to find any clauses which have any cleanup to use to test
this PR. It is probably NFC for the current lowering. This will be
tested in [the PR adding vector subscripting of array
sections](https://github.com/llvm/llvm-project/pull/133892).
When a host associated `threadprivate` variable was used in a parallel
region with `default(none)` in an internal subroutine was failing,
because the compiler did not properly determine that the variable was
pre-determined `threadprivate` and thus should not have been reported as
missing a DSA.
Part one of merging #132486. Add support for representing volatility in
the type system for reference, box, and class types. Don't do anything
with volatile just yet, only support and test their representation and
utility functions.
The naming convention is a little goofy - `fir::isa_volatile_type` and
`fir::updateTypeWithVolatility` use different capitalization, but I put
them near similar functions and tried to match the surrounding
conventions and [the
docs](https://github.com/llvm/llvm-project/blob/main/flang/docs/C%2B%2Bstyle.md#naming)
best I could.
Delete duplicated creation of hlfir.declare op of do concurrent
induction variables when inside cuf kernel directive.
Obtain the correct hlfir.declare op generated from bindSymbol, and add
it to ivValues.
Allocatable or pointer module variables with the CUDA managed attribute
are defined with a double descriptor. One on the host and one on the
device. Only the data pointed to by the descriptor will be allocated in
managed memory.
Allow the registration of any allocatable or pointer module variables
like device or constant.
Implement extended intrinsic PUTENV, both function and subroutine forms.
Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and
semantic unit tests.
No longer require -fopenmp or -fopenacc with -E, unless specific version
number options are also required for predefined macros. This means that
most source can be preprocessed with -E and then later compiled with
-fopenmp, -fopenacc, or neither.
This means that OpenMP conditional compilation lines (!$) are also
passed through to -E output. The tricky part of this patch was dealing
with the fact that those conditional lines can also contain regular
Fortran line continuation, and that now has to be deferred when !$ lines
are interspersed.
The preprocessor can perform macro replacement within identifiers when
they are split up with Fortran line continuation, but is failing to do
macro replacement on a continued identifier when none of its parts are
replaced.
The optional second argument to IEEE_SUPPORT_FLAG (and related functions
from the intrinsic IEEE_ARITHMETIC module) is needed only for its type,
not its value. Restrictions on local objects as arguments to function
references in specification expressions shouldn't apply to it.
Define a new attribute for dummy data object characteristics to
distinguish such arguments, set it for the appropriate intrinsic
function references, and test it during specification expression
validation.
Fortran::runtime::Descriptor::BytesFor() only works for Fortran
intrinsic types for which a C++ type counterpart exists, so it crashes
on some types that are legitimate Fortran types like REAL(2). Move some
logic from Evaluate into a new header in flang/Common, then use it to
avoid this needless dependence on C++.
A function or subroutine can allow an object of the same name to appear
in its scope, so long as the name is not used. This is similar to the
case of a name being imported from multiple distinct modules, and
implemented by the same representation.
It's not clear whether this is conforming behavior or a common
extension.
Add function and subroutine forms of FSEEK and FTELL as intrinsic
procedures. Accept common aliases from legacy compilers as well.
A separate patch to llvm-test-suite will enable tests for these
procedures once this patch has merged.
Depends on https://github.com/llvm/llvm-project/pull/132423; CI builds
will likely fail until that patch is merged and this PR is rebased.
Follow up to #134170.
We should be using the LLVM intrinsics instead of plain fir.calls when
we can. Existing code creates a declaration for the llvm intrinsic and
a regular fir.call, which makes it hard for consumers of the IR to find
all the intrinsic calls.
This patch updates flang to follow clang's behavior when processing the
`-mcode-object-version` option.
It is now used to populate an LLVM module flag called
`amdhsa_code_object_version` expected by the backend and also updates
the driver to add the `--amdhsa-code-object-version` option to the
frontend invocation for device compilation of AMDGPU targets.
Follow-up to #132003, in particular, see
https://github.com/llvm/llvm-project/pull/132003#issuecomment-2739701936.
This PR extends reduction support for `loop` directives. Consider the
following scenario:
```fortran
subroutine bar
implicit none
integer :: x, i
!$omp teams loop reduction(+: x)
DO i = 1, 5
call foo()
END DO
end subroutine
```
Note the following:
* According to the spec, the `reduction` clause will be attached to
`loop` during earlier stages in the compiler.
* Additionally, `loop` cannot be mapped to `distribute parallel for` due
to the call to a foreign function inside the loop's body.
* Therefore, `loop` must be mapped to `distribute`.
* However, `distribute` does not have `reduction` clauses.
* As a result, we have to move the `reduction`s from the `loop` to its
parent `teams` directive, which is what is done by this PR.
This PR implements the nonstandard intrinsic time.
In addition to running the unit tests, I also double checked that the
example code works by manually compiling and running it.
This PR adds the intrinsic `unlink` to flang.
## Test plan
- Added two codegen unit tests and ensured flang-check continues to
pass.
- Manually compiled and ran the example from the documentation.
The string used for intrinsic was not the correct one
"llvm.nvvm.match.any.sync.i32p". There was an extra `p` at the end.
Use the NVVM operation instead so we don't duplicate it.
Flang uses `fir.call <llvm intrinsic>` in a few places. This means
consumers of the IR need to strcmp every fir.call if they want to find a
particular LLVM intrinsic.
Emit LLVM memcpy intrinsics instead.
This patch updates Flang lowering and kernel flags identification in
MLIR so that loop bounds on `target teams loop` constructs are evaluated
on the host, making the trip count available to the corresponding
`__tgt_target_kernel` call emitted for the target region.
This is necessary in order to properly execute these constructs as
`target teams distribute parallel do`.
Co-authored-by: Kareem Ergawy <kareem.ergawy@amd.com>
Consider:
```
function foo()
!$omp declare target(foo) ! This `foo` was a function-result symbol
...
end
```
When resolving symbols, for this case use the symbol corresponding to
the function instead of the symbol corresponding to the function result.
Currently, this will result in an error:
```
error: A variable that appears in a DECLARE TARGET directive must be
declared in the scope of a module or have the SAVE attribute, either
explicitly or implicitly
```
Like other target statements, the statement associated with the label in
a legacy ASSIGN statement could be inside a construct. Constructs
containing such a target must therefore be marked as unstructured,
fairly similar to how targets are processed in `markBranchTarget`.
Hi,
This patch implements support for the following directives :
- `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO
LOOP.
- `!DIR$ NOUNROLL` to disable unrolling on a DO LOOP.
- `!DIR$ NOVECTOR` to disable vectorization on a DO LOOP.