1856 Commits

Author SHA1 Message Date
Peter Klausler
262b3f7615
[flang] Remove runtime dependence on C++ support for types (#134164)
Fortran::runtime::Descriptor::BytesFor() only works for Fortran
intrinsic types for which a C++ type counterpart exists, so it crashes
on some types that are legitimate Fortran types like REAL(2). Move some
logic from Evaluate into a new header in flang/Common, then use it to
avoid this needless dependence on C++.
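
As a rough illustration of the idea (not the header added by this patch), a size lookup can be keyed on the Fortran type category and kind number alone, so no C++ counterpart type is needed. The names `TypeCategory`, `realBytes` and `bytesFor` are hypothetical, and the per-kind sizes are assumptions: kind 3 is taken to be bfloat16, and kind 10 is taken to be padded to 16 bytes as on x86-64.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names; this is a sketch, not flang's actual header.
enum class TypeCategory { Integer, Real, Complex, Character, Logical };

// Storage bytes for a REAL kind, looked up by kind number alone.
// Assumed sizes: kind 3 is bfloat16 (2 bytes); kind 10 is x87 extended
// precision padded to 16 bytes. Unknown kinds yield 0.
inline std::size_t realBytes(int kind) {
  switch (kind) {
  case 2: return 2;
  case 3: return 2;
  case 4: return 4;
  case 8: return 8;
  case 10: return 16;
  case 16: return 16;
  default: return 0;
  }
}

// Element size from (category, kind) without naming any C++ type.
// Integer, logical and character kinds are assumed to equal their
// byte width; the kind value is not validated beyond being positive.
inline std::size_t bytesFor(TypeCategory category, int kind) {
  assert(kind > 0 && "kind must be positive");
  switch (category) {
  case TypeCategory::Integer:
  case TypeCategory::Logical:
  case TypeCategory::Character:
    return static_cast<std::size_t>(kind);
  case TypeCategory::Real:
    return realBytes(kind);
  case TypeCategory::Complex:
    return 2 * realBytes(kind);
  }
  return 0;
}
```

With a table like this, `bytesFor(TypeCategory::Real, 2)` has an answer even when the host C++ compiler offers no 16-bit floating-point type.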
2025-04-04 08:42:38 -07:00
Slava Zakharin
3f6ae3f0a8
[flang] Added driver options for arrays repacking. (#134002)
Added options:
  * -f[no-]repack-arrays
  * -f[no-]stack-repack-arrays
  * -frepack-arrays-contiguity=whole/innermost
2025-04-03 10:43:28 -07:00
Asher Mancinelli
d7d91500b6
[flang][nfc] Initial changes needed to use llvm intrinsics instead of regular calls (#134170)
Flang uses `fir.call <llvm intrinsic>` in a few places. This means
consumers of the IR need to strcmp every fir.call if they want to find a
particular LLVM intrinsic.
Emit LLVM memcpy intrinsics instead.
2025-04-03 08:37:40 -07:00
Sergio Afonso
18dd299fb1
[Flang][MLIR][OpenMP] Host-evaluation of omp.loop bounds (#133908)
This patch updates Flang lowering and kernel flags identification in
MLIR so that loop bounds on `target teams loop` constructs are evaluated
on the host, making the trip count available to the corresponding
`__tgt_target_kernel` call emitted for the target region.

This is necessary in order to properly execute these constructs as
`target teams distribute parallel do`.
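
For context, the trip count that the host needs can be derived from the loop bounds alone. The sketch below uses the Fortran DO loop iteration count formula; `tripCount` is a hypothetical helper and is not code from this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Iteration count of "do i = lb, ub, step":
//   max((ub - lb + step) / step, 0)
// Integer division truncates toward zero, matching Fortran's INT().
inline std::int64_t tripCount(std::int64_t lb, std::int64_t ub,
                              std::int64_t step) {
  assert(step != 0 && "a DO loop step must be nonzero");
  return std::max<std::int64_t>((ub - lb + step) / step, 0);
}
```

Because the formula depends only on `lb`, `ub` and `step`, evaluating those three values on the host is enough to know the trip count before the kernel is launched.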

Co-authored-by: Kareem Ergawy <kareem.ergawy@amd.com>
2025-04-03 15:06:19 +01:00
vdonaldson
8a0f694381
[flang] Legacy ASSIGN statement target processing (#133737)
Like other target statements, the statement associated with the label in
a legacy ASSIGN statement could be inside a construct. Constructs
containing such a target must therefore be marked as unstructured,
fairly similar to how targets are processed in `markBranchTarget`.
2025-04-02 09:52:13 -04:00
Jean-Didier PAILLEUX
c309abd925
[flang] Implement !DIR$ NOVECTOR and !DIR$ NOUNROLL[_AND_JAM] (#133885)
Hi,
This patch implements support for the following directives:
- `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO
  loop.
- `!DIR$ NOUNROLL` to disable unrolling on a DO loop.
- `!DIR$ NOVECTOR` to disable vectorization on a DO loop.
2025-04-02 14:30:01 +02:00
Tom Eccles
e17d864f55
[flang][OpenMP][Lower] lower array subscripts for task depend (#132994)
The OpenMP standard says that all dependencies in the same set of
inter-dependent tasks must be non-overlapping. This simplification means
that the OpenMP runtime only needs to keep track of the base addresses of
dependency variables. This can be seen in kmp_taskdeps.cpp, which stores
task dependency information in a hash table, using the base address as a
key.
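
A minimal sketch of that idea, assuming a hypothetical `DependTable` class (this is not the data structure in kmp_taskdeps.cpp): dependencies are looked up by base address only, so an array section is identified by the address of its first element.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical tracker keyed by the base address of a dependency.
class DependTable {
public:
  // Records that task `taskId` writes the object starting at `base`.
  // Returns the id of the previous writer, or -1 if there was none.
  int recordWriter(const void *base, int taskId) {
    assert(base != nullptr && "a dependency needs a base address");
    const auto key = reinterpret_cast<std::uintptr_t>(base);
    auto it = lastWriter_.find(key);
    if (it == lastWriter_.end()) {
      lastWriter_.emplace(key, taskId);
      return -1;
    }
    const int previous = it->second;
    it->second = taskId;
    return previous;
  }

private:
  std::unordered_map<std::uintptr_t, int> lastWriter_;
};
```

In this model, a dependency on a section that starts at the fifth element of an array is keyed by the address of that element, not by the address of the whole array, which is why the sliced base address has to be computed.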

This patch generates a rebox operation to slice boxed arrays, but only
the box data address is used for the task dependency. The extra box is
optimized away by LLVM at O3.

Vector subscripts are TODO (I will address in my next patch).

This also fixes a bug for ordinary subscripts when the symbol was mapped
to a box:

Fixes #132647
2025-04-01 10:26:14 +01:00
Thirumalai Shaktivel
091dcb8fc2
[Flang] Make a private copy for the common block variables in copyin clause (#111359)
Fixes: https://github.com/llvm/llvm-project/issues/82949
2025-04-01 11:35:44 +05:30
Thirumalai Shaktivel
374a5bea52
[Flang][OpenMP] Add PointerAssociateScalar to Cray Pointer used in the DSA (#133232)
Issue: Cray Pointer is not associated to Cray Pointee, leading to
Segmentation fault

Fix: Use GetUltimate on the base symbol in the current scope; it gets
past all the references and returns the original symbol

---------

Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2025-03-29 15:39:12 +01:00
swatheesh-mcw
fe30cf18ab
Revert "Revert "[flang][openmp] Adds Parser and Semantic Support for Interop Construct, and Init and Use Clauses."" (#132343)
Reverts llvm/llvm-project#132005
2025-03-28 15:21:52 +00:00
Peter Klausler
4ea5aa09de
[flang][NFC] Restore I/O runtime API header name (#132423)
flang/include/flang/Runtime/io-api.h was changed into io-api-consts.h,
then wrapped into a new io-api.h that includes io-api-consts.h, does
some redundant includes and declarations, and then declares the
prototype of one function, InquiryKeywordHashDecode.

Make that function static in io-stmt.cpp prior to its sole call site,
then undo the renaming, to reduce confusion and redundancy.
2025-03-26 12:09:16 -07:00
Krzysztof Parzyszek
c221d64206
[flang] Remove mentions of evaluate::Variable<T> (#132805)
The template itself was not defined anywhere. The closest thing was a
forward declaration in flang/include/flang/Evaluate/variable.h.
2025-03-24 18:26:57 -05:00
Leandro Lupori
ef56f4b5a0
[flang][OpenMP] Fix reduction of arrays with non-default lower bounds (#132228)
Using LoopNest's indices with ShapeShifts that have non-default
lower bounds results in accesses to incorrect array elements.
To avoid having to adjust each index, a ShapeShift with default
lower bounds can be used instead.
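
The mismatch can be modelled outside the compiler. In the sketch below, `ShiftedArray` and `sumWithDefaultBounds` are hypothetical names: loop indices always run from 1 to the extent, so the view they index must use the default lower bound of 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of an array with a Fortran-style lower bound.
struct ShiftedArray {
  std::vector<int> data;
  long lowerBound; // Fortran index that maps to data[0]

  int &at(long index) {
    const long offset = index - lowerBound;
    assert(offset >= 0 && static_cast<std::size_t>(offset) < data.size());
    return data[static_cast<std::size_t>(offset)];
  }
};

// Sums the elements with loop indices 1..extent. The view is rebased
// to lower bound 1 so that those indices need no per-access adjustment.
inline int sumWithDefaultBounds(const std::vector<int> &values) {
  ShiftedArray view{values, 1};
  int total = 0;
  for (long i = 1; i <= static_cast<long>(values.size()); ++i)
    total += view.at(i);
  return total;
}
```

Had the view kept a lower bound of, say, 5, the same loop indices would have produced negative offsets, which corresponds to the incorrect element accesses described above.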

Fixes #131751
2025-03-24 09:48:41 -03:00
Michael Kruse
123eb75cd4
[Flang] Do not emit numeric_storage_size into object file (#131463)
The value of numeric_storage_size depends on compilation options and
therefore its value is not yet known when building the builtins runtime.
Instead, the parameter is defined by a __numeric_storage_size() expression
that is only folded once the module is loaded into the user program. For the iso_fortran_env object
file, omit the symbol as it is never used.

Similar tests that ensure that __numeric_storage_size() is not folded
until compiling the actual user program exist in FortranEvaluate:

1e6ba3cd2f/flang/lib/Evaluate/check-expression.cpp (L487-L492)

1e6ba3cd2f/flang/lib/Evaluate/fold-integer.cpp (L1457-L1460)

Required for using CMake to compile the builtin module files. See RFC at
https://discourse.llvm.org/t/rfc-building-flangs-builtin-mod-files/84626
2025-03-21 12:32:54 +01:00
Sergio Afonso
b231f6f862
[MLIR][OpenMP] Improve omp.map.info verification (#132066)
This patch makes the `map_type` and `map_capture_type` arguments of the
`omp.map.info` operation required, which was already an invariant being
verified by its users via `verifyMapClause()`. This makes it clearer, as
getters no longer return misleading `std::optional` values.

Checks for the `mapper_id` argument are moved to a verifier for the
operation, rather than being checked by users.

Functionally NFC, but not marked as such due to a reordering of
arguments in the assembly format of `omp.map.info`.
2025-03-20 15:48:45 +00:00
Krzysztof Parzyszek
68180d8d16
[flang][OpenMP] Use OmpDirectiveSpecification in standalone directives (#131163)
This uses OmpDirectiveSpecification in the rest of the standalone
directives.
2025-03-20 06:50:43 -05:00
Slava Zakharin
2c91f10362
[flang] Fixed repacking for TARGET and INTENT(OUT) (#131972)
TARGET dummy arrays can be accessed indirectly, so it is unsafe
to repack them.
INTENT(OUT) dummy arrays that require finalization on entry
to their subroutine must be copied-in by `fir.pack_arrays`.

In addition, based on my testing results, I think it will be useful
to document that `LOC` and `IS_CONTIGUOUS` will have different values
for the repacked arrays. I still need to decide where to document
this, so just added a note in the design doc for the time being.
2025-03-19 17:12:32 -07:00
Sergio Afonso
ac9e4e9b33
[Flang][OpenMP] Simplify entry block creation for BlockArgOpenMPOpInterface ops, NFC (#132036)
This patch adds the `OpWithBodyGenInfo::blockArgs` field and updates
`createBodyOfOp()` to prevent the need for `BlockArgOpenMPOpInterface`
operations to pass the same callback, minimizing chances of introducing
inconsistent behavior.
2025-03-19 17:29:40 +00:00
Krzysztof Parzyszek
cd26dd5595
[flang][OpenMP] Use OmpDirectiveSpecification in simple directives (#131162)
The `OmpDirectiveSpecification` contains directive name, the list of
arguments, and the list of clauses. It was introduced to store the
directive specification in METADIRECTIVE, and could be reused everywhere
a directive representation is needed.
In the long term this would unify the handling of common directive
properties, as well as creating actual constructs from METADIRECTIVE by
linking the contained directive specification with any associated user
code.
2025-03-19 11:34:40 -05:00
Kiran Chandramohan
96b112fb61
Revert "[flang][openmp] Adds Parser and Semantic Support for Interop Construct, and Init and Use Clauses." (#132005)
Reverts llvm/llvm-project#120584

Reverting due to CI failure
https://lab.llvm.org/buildbot/#/builders/157/builds/22946
2025-03-19 11:13:52 +00:00
swatheesh-mcw
ee8a759bfb
[flang][openmp] Adds Parser and Semantic Support for Interop Construct, and Init and Use Clauses. (#120584)
Adds Parser and Semantic Support for the below construct and clauses:
- Interop Construct
- Init Clause
- Use Clause

Note:
The other clauses supported by Interop Construct such as Destroy, Use,
Depend and Device are added already.
2025-03-19 10:49:17 +00:00
Tom Eccles
e7c6e3557b
[flang][OpenMP] Fix threadprivate pointer variable in common block (#131888)
Fixes #112538

The problem was that the host associated symbol for the threadprivate
variable doesn't have all of the symbol attributes (e.g. POINTER). This
caused the lowering code to generate the wrong type, eventually hitting
an assertion.
2025-03-19 10:12:52 +00:00
Slava Zakharin
fd0e20a64b
[flang] Generate fir.pack/unpack_array in Lowering. (#131704)
Basic generation of array repacking operations in Lowering.
2025-03-18 21:26:33 -07:00
Kelvin Li
6c7c660afe
[flang] Use C-style casts to silence message (NFC) (#131796)
2025-03-18 13:02:18 -04:00
Akash Banerjee
cbc5c11fec
[MLIR][OpenMP] Add Lowering support for implicitly linking to default declare mappers (#131006)
2025-03-18 13:17:10 +00:00
Kareem Ergawy
83658ddb1b
[flang][OpenMP] Enable delayed privatization by default for omp.distribute (#131574)
Switches delayed privatization for `omp.distribute` to be on by default:
it is now controlled by `-openmp-enable-delayed-privatization` instead of
`-openmp-enable-delayed-privatization-staging`.

### GFortran & Fujitsu test suite results:

#### gfortran test-suite (this PR):
```
Testing Time: 34.51s
  Passed: 6569
```

#### Fujitsu without changes (commit: 0813c5cf5f52):
```
Testing Time: 155.39s
  Passed            : 88325
  Failed            :   156
  Executable Missing:   408
```

#### Fujitsu with changes (this PR):
```
Testing Time: 158.54s
  Passed            : 88325
  Failed            :   156
  Executable Missing:   408
```
2025-03-18 14:07:41 +01:00
Valentin Clement (バレンタイン クレメン)
4fde8c341f
[flang][cuda] Lower CUDA shared variable with cuf.shared_memory op (#131399)
Use `cuf.shared_memory` operation instead of `cuf.alloc` for CUDA shared
variable. These variables do not need free operations.
2025-03-16 17:44:56 -07:00
jeanPerier
3ff3b29dd6
[flang] lower remaining cases of pointer assignments inside forall (#130772)
Implement handling of `NULL()` RHS, polymorphic pointers, as well as
lower bounds or bounds remapping in pointer assignment inside FORALL.

In the end, these cases do not require updating hlfir.region_assign,
lowering can simply prepare the new descriptor for the LHS inside the
RHS region.

Looking more closely at the polymorphic cases, there is no need to call
the runtime: fir.rebox and fir.embox do handle the dynamic type setting
correctly.

After this patch, the last remaining TODO is the allocatable assignment
inside FORALL, which, like some cases here, is more likely an accidental
feature given FORALL was deprecated in F2003 at the same time as
allocatable components were added.
2025-03-14 10:51:46 +01:00
Kelvin Li
c2b66ce655
[flang][OpenMP] Silence unused-but-set-variable message (NFC) (#130979)
2025-03-13 14:09:47 -04:00
Krzysztof Parzyszek
f4fc2d731c
[flang][OpenMP] Map ByRef if size/alignment exceed that of a pointer (#130832)
Improve the check for whether a type can be passed by copy. Currently,
passing by copy is done via the OMP_MAP_LITERAL mapping, which can only
transfer as much data as can be contained in a pointer representation.
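
The check can be pictured as follows. `fitsInPointer` and `canMapByCopy` are hypothetical names used for illustration only, and the sketch reasons about host C++ types rather than the types the compiler actually inspects.

```cpp
#include <cassert>
#include <cstddef>

// A value can travel in a pointer-sized slot only if neither its size
// nor its alignment exceeds that of a pointer.
inline bool fitsInPointer(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && "alignment must be nonzero");
  return size <= sizeof(void *) && alignment <= alignof(void *);
}

// Convenience wrapper for a concrete C++ type.
template <typename T> bool canMapByCopy() {
  return fitsInPointer(sizeof(T), alignof(T));
}
```

Anything that fails this test would be mapped by reference instead of being passed by copy.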
2025-03-12 19:41:11 -05:00
Sergio Afonso
cf68c9378b
[Flang][OpenMP] Move declare mapper sym creation outside loop, NFC (#130794)
This patch simplifies the definition of
`ClauseProcessor::processMapObjects` by hoisting the creation of the
MLIR symbol associated to an existing `omp.declare_mapper` operation
outside of the loop processing all mapped objects.

That change removes some inter-iteration dependencies that made the
implementation more difficult to follow.
2025-03-12 11:54:29 +00:00
jeanPerier
356bf3fa2d
Reland " [flang] Rely on global initialization for simpler derived types" (#130290)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.

Note: this relands #114002 with the fix for the LLVM timeout regressions that have been seen. The fix is to use the added fir.copy to avoid aggregate load/store.

Co-authored-by: NimishMishra <42909663+NimishMishra@users.noreply.github.com>
2025-03-11 15:19:43 +01:00
Leandro Lupori
29f5d5bea9
[flang][OpenMP] Fix privatization of procedure pointers (#130336)
Fixes #121720
2025-03-11 09:38:40 -03:00
Ritanya-B-Bharadwaj
63635c1746
[clang] [OpenMP] New OpenMP 6.0 self_maps clause (#129888)
Initial parsing/sema support for self maps in map and requirement clause
[Sections 7.9.6 and 10.5.1.6 in OpenMP 6.0 spec]
2025-03-11 16:31:42 +05:30
Tom Eccles
a542a08309
[flang][OpenMP] Support reduction of variables in EQUIVALENCE (#130607)
These previously crashed the compiler because !fir.ptr (not wrapped
inside of a box) was not supported.

Real POINTER variables are supported as !fir.box<!fir.ptr<>>. The
version for EQUIVALENCE doesn't need to do anything different to
!fir.ref<>.
2025-03-11 10:16:48 +00:00
Krzysztof Parzyszek
d67947162f
[flang][OpenMP] Implement HAS_DEVICE_ADDR clause (#128568)
The HAS_DEVICE_ADDR indicates that the object(s) listed exists at an
address that is a valid device address. Specifically,
`has_device_addr(x)` means that (in C/C++ terms) `&x` is a device
address.

When entering a target region, `x` does not need to be allocated on the
device, or have its contents copied over (in the absence of additional
mapping clauses). Passing its address verbatim to the region for use is
sufficient, and is the intended goal of the clause.

Some Fortran objects use descriptors in their in-memory representation.
If `x` had a descriptor, both the descriptor and the contents of `x`
would be located in the device memory. However, the descriptors are
managed by the compiler, and can be regenerated at various points as
needed. The address of the effective descriptor may change, hence it's
not safe to pass the address of the descriptor to the target region.
Instead, the descriptor itself is always copied, but for objects like
`x`, no further mapping takes place (as this keeps the storage pointer
in the descriptor unchanged).

---------

Co-authored-by: Sergio Afonso <safonsof@amd.com>
2025-03-10 08:11:01 -05:00
agozillon
f1178815d2
[Flang][OpenMP][MLIR] Implement close, present and ompx_hold modifiers for Flang maps (#129586)
This PR adds an initial implementation of the map modifiers close,
present and ompx_hold. This primarily required adding the appropriate
map type flags to the map type bits; ompx_hold also required adding the
map type to the OpenMP dialect. Close has a bit of a problem when used
with the ALWAYS map type on descriptors. If we later apply always to
descriptors to facilitate movement of descriptor information to the
device for consistency, we will likely have to make sure that close and
always are not applied to the descriptor simultaneously, although
further investigation may turn up an alternative. For the moment, it is
a TODO/note to keep track of.
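
As a rough picture of what "adding flags to the map type bits" means, the sketch below ORs one bit per modifier into a base map type. All names are hypothetical, and the numeric bit values are assumptions for illustration; the offload runtime headers are the authority on the real values.

```cpp
#include <cassert>
#include <cstdint>

// Bit values are assumptions for illustration only.
enum MapFlag : std::uint64_t {
  MapTo = 0x001,
  MapFrom = 0x002,
  MapAlways = 0x004,
  MapClose = 0x400,
  MapPresent = 0x1000,
  MapOmpxHold = 0x2000,
};

// Which map modifiers appeared on the clause.
struct MapModifiers {
  bool always = false;
  bool close = false;
  bool present = false;
  bool ompxHold = false;
};

// Combines the direction bits in `base` with one bit per modifier.
inline std::uint64_t buildMapType(std::uint64_t base,
                                  const MapModifiers &mods) {
  assert((base & ~(MapTo | MapFrom)) == 0 &&
         "base is expected to carry only direction bits");
  std::uint64_t bits = base;
  if (mods.always) bits |= MapAlways;
  if (mods.close) bits |= MapClose;
  if (mods.present) bits |= MapPresent;
  if (mods.ompxHold) bits |= MapOmpxHold;
  return bits;
}
```

In this model, the descriptor concern mentioned above corresponds to a map type in which both the `MapAlways` and `MapClose` bits end up set.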
2025-03-07 22:22:30 +01:00
Tom Eccles
d31a7dde48
Revert " [flang] Rely on global initialization for simpler derived types" (#130278)
Reverts llvm/llvm-project#114002

This causes a regression building cam4_r from spec2017
2025-03-07 13:59:29 +00:00
jeanPerier
40e245a9aa
[flang] add support for procedure pointer assignment inside FORALL (#130114)
Very similar to object pointer assignment, the difference is the SSA
types of the LHS (!fir.ref<!fir.boxproc<()->()>>) and RHS
(!fir.boxproc<()->()>).

The RHS must be saved as a simple address, not a descriptor (it is not
possible to make a CFI descriptor out of a procedure entity).
2025-03-07 10:28:02 +01:00
Kareem Ergawy
9543e9e927
[flang][OpenMP] Handle pre-determined lastprivate for simd (#129507)
This PR tries to fix `lastprivate` update issues in composite
constructs. In particular, pre-determined `lastprivate` symbols are
attached to the wrong leaf of the composite construct (the outermost
one). When using delayed privatization (should be the default mode in
the future), this results in trying to update the `lastprivate` symbol
in the wrong construct (outside the `omp.loop_nest` op).

For example, given the following input:
```fortran
!$omp target teams distribute parallel do simd collapse(2) private(y_max)
  do i=x_min,x_max
    do j=y_min,y_max
    enddo
  enddo
```

Without the fixes introduced in this PR, the `DataSharingProcessor`
tries to generate the `lastprivate` update ops in the `parallel` op
since this is the op for which the DSP instance is created.

The fix consists of 2 main parts:
1. Instead of creating a single DSP instance, one instance is created
for the leaf constructs that might need privatization (whether for
explicit, implicit, or pre-determined symbols).
2. When generating the `lastprivate` comparison ops, we don't directly
use the SSA values of the UBs and steps. Instead, we regenerate these
SSA values from the original loop bounds' expressions. We have to do
this to avoid using `host_eval` values in the `lastprivate` comparison
logic which is illegal.
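
For intuition, one way to phrase a last-iteration test in terms of the upper bound and step is sketched below. `isLastIteration` is a hypothetical helper; the comparison ops that the compiler actually emits may differ in detail.

```cpp
#include <cassert>

// True when `iv` is the final value the loop variable takes inside
// "do iv = lb, ub, step": advancing by one more step would leave the
// iteration range.
inline bool isLastIteration(long iv, long ub, long step) {
  assert(step != 0 && "a DO loop step must be nonzero");
  const long next = iv + step;
  return step < 0 ? next < ub : next > ub;
}
```

The test needs `ub` and `step` inside the loop body, which is why it matters where those values come from.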
2025-03-07 05:44:39 +01:00
Valentin Clement (バレンタイン クレメン)
478e516140
[flang][cuda] Sync double descriptor after c_f_pointer call (#130194)
After a global device pointer is set through `c_f_pointer`, we need to
sync the double descriptor so the version on the device is also up to
date.
2025-03-06 19:19:51 -08:00
Kiran Chandramohan
e2911aa2c2
[Flang][OpenMP] Fix crash when loop index var is pointer or allocatable (#129717)
Use hlfir dereferencing for pointers and allocatables and use hlfir
assign. Also, change the code updating IV in lastprivate.

Note: This is a small change. Modifications in existing tests are
changes from fir.store to hlfir.assign.

Fixes #121290
2025-03-06 12:19:34 +00:00
Valentin Clement (バレンタイン クレメン)
2130285564
[flang][cuda] Make sure allocator id is set for pointer allocate (#129950)
2025-03-05 17:29:09 -08:00
Zhen Wang
d1abbb4dc5
[flang][cuda] Change induction variable from i32 to index for doconcurrent inside cuf kernel directive (#129924)
Use `index` instead of `i32` for induction variables for doconcurrent
inside cuf kernel directive. Regular do loop inside cuf kernel directive
also uses `index`:
```
cuf.kernel<<<*, *>>> (%arg0 : index) = ...
```
2025-03-05 14:50:42 -08:00
Mats Petersson
9925359fee
[flang][llvm][openmp]Add Initializer clause to OMP.td (#129540)
Then use this in the Flang compiler for parsing the OpenMP declare
reduction.

This has no real functional change to the existing code, it's only
moving the declaration itself around.

A few tests have been updated, to reflect the new type names.
2025-03-05 15:41:24 +00:00
NimishMishra
0ae1f0a310
[flang] Rely on global initialization for simpler derived types (#114002)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
2025-03-05 05:44:51 -08:00
jeanPerier
7302e1b94e
[flang] implement simple pointer assignments inside FORALL (#129522)
The semantics of pointer assignments inside FORALL require evaluating
the targets (RHS) and pointer variables (LHS) of all iterations before
evaluating the assignments.

In practice, if the compiler can prove that the RHS and LHS evaluations
are not impacted by the assignments, the evaluation of the FORALL
assignment statement can be done in a single loop. However, if the
compiler cannot prove this, it needs to "save" the addresses of the
targets and/or the pointer descriptors of each iteration before doing
the assignments.
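
The difference between the two strategies can be shown with plain C++ pointers standing in for Fortran pointers. The function names are hypothetical, and the sketch models `FORALL (i=1:n) p(i)%ptr => p(n+1-i)%ptr`, where each RHS reads a pointer that another iteration assigns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two-phase evaluation: save every target first, then assign.
inline void reversePointersTwoPhase(std::vector<int *> &ptrs) {
  const std::size_t n = ptrs.size();
  std::vector<int *> savedTargets(n);
  for (std::size_t i = 0; i < n; ++i) {
    // This sketch only models the case where the RHS is not NULL.
    assert(ptrs[n - 1 - i] != nullptr && "pointer must be associated");
    savedTargets[i] = ptrs[n - 1 - i]; // evaluate all RHS first
  }
  for (std::size_t i = 0; i < n; ++i)
    ptrs[i] = savedTargets[i]; // then perform the assignments
}

// Single-loop evaluation: only valid when no RHS reads a pointer that
// an earlier iteration has already overwritten. That does not hold
// here, so the second half of the result is wrong.
inline void reversePointersSingleLoop(std::vector<int *> &ptrs) {
  const std::size_t n = ptrs.size();
  for (std::size_t i = 0; i < n; ++i)
    ptrs[i] = ptrs[n - 1 - i];
}
```

With three pointers, the two-phase version yields the reversed order, while the single loop leaves the last pointer aimed at the same target as the first.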

This patch implements the most common cases where there is no lower bound
spec, no bounds remapping, the LHS is not polymorphic, and the RHS is
not NULL.

The HLFIR operation used to represent assignments inside FORALL can be
used for pointer assignments too (the only difference being that the LHS
is a descriptor address).

The analysis for intrinsic assignment can be reused, with the
distinction that the RHS data is not read during the assignment.

The logic that is used to save LHS in intrinsic assignments inside
FORALL is extracted to be used for the RHS of pointer assignments when
needed (saving a descriptor value).
Pointer assignment LHS are just descriptor addresses and are saved as
int_ptr values.
2025-03-05 11:24:04 +01:00
Peter Klausler
79a25e11fe
[flang] Further work on NULL(MOLD=allocatable) (#129345)
Refine handling of NULL(...) in semantics to properly distinguish
NULL(), NULL(objectPointer), NULL(procPointer), and NULL(allocatable)
from each other in relevant contexts.

Add IsNullAllocatable() and IsNullPointerOrAllocatable() utility
functions. IsNullAllocatable() is true only for NULL(allocatable); it is
false for a bare NULL(), which can be detected independently with
IsBareNullPointer().

IsNullPointer() now returns false for NULL(allocatable).

ALLOCATED(NULL(allocatable)) now works, and folds to .FALSE.

These utilities were modified to accept const pointer arguments rather
than const references; I usually prefer this style when the result
should clearly be false for a null argument (in the C sense), and it
helped me find all of their use sites in the code.
2025-03-03 14:46:35 -08:00
Krzysztof Parzyszek
8f971ca1d9
[flang] Move DumpEvaluateExpr from Lower to Semantics (#128723)
Since evaluate::Expr can show up in the parse tree in the semantic
analysis step, make it possible to dump its structure in the Semantics
module.

The Lower module depends on Semantics, so the code is still accessible
in it.
2025-03-03 15:38:42 -06:00
Krzysztof Parzyszek
9573c62114
[flang][OpenMP] Accept modern syntax of FLUSH construct (#128975)
The syntax with the object list following the memory-order clause has
been removed in OpenMP 5.2. Still, accept that syntax with versions >=
5.2, but treat it as deprecated (and emit a warning).
2025-03-03 07:59:19 -06:00