9900 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
dcda314b6c
[flang][cuda] Fix atmoicxor lowering to accept arrays (#130331)
The first agrument can be an address of a scalare, an array element or
even just the address of the first element of an array. Update lowering
to not trigger elemental lowering.
2025-03-07 13:05:42 -08:00
Valentin Clement (バレンタイン クレメン)
5668c7bb90
[flang][cuda] Add more interfaces for __ldca, __ldcs, __ldlu and __ldcv (#130218) 2025-03-07 10:19:20 -08:00
Renaud Kauffmann
d4754db15d
Test fix: Adding REQUIRES: asserts (#130314) 2025-03-07 09:49:47 -08:00
Renaud Kauffmann
718c4ed8a0
[flang] [NFCI] Using getSource instead of getOriginalDef (#128984)
As discussed in past MRs, this change removes the use of getOriginalDef
to use getSource instead to gather data from an indirection.
2025-03-07 08:47:42 -08:00
Krzysztof Parzyszek
b47dac609b [flang][OpenMP] Remove pessimizing move introduced in 90f45a15ab
This unbreaks the builders that diagnose this situation:

  error: moving a temporary object prevents copy elision [-Werror,-Wpess
imizing-move]
    341 |   static OmpClauseList empty{std::move(decltype(OmpClauseList:
:v){})};
        |                              ^
2025-03-07 08:38:15 -06:00
Tom Eccles
d31a7dde48
Revert " [flang] Rely on global initialization for simpler derived types" (#130278)
Reverts llvm/llvm-project#114002

This causes a regression building cam4_r from spec2017
2025-03-07 13:59:29 +00:00
Krzysztof Parzyszek
90f45a15ab
[flang][OpenMP] Implement OmpDirectiveName, use in OmpDirectiveSpecif… (#130121)
…ication

The `OmpDirectiveName` class has a source in addition to wrapping the
llvm::omp::Directive.
2025-03-07 07:56:40 -06:00
Tom Eccles
f7daa9d302
[mlir][OpenMP] fix crash outlining infinite loop (#129872)
Previously an extra block was created by splitting the previous exit
block. This produced incorrect results when the outlined region
statically never terminated because then there wouldn't be a valid exit
block for the outlined region, this caused this newly added block to
have an incoming edge from outside of the outlining region, which caused
outlining to fail.

So far as I can tell this extra block no longer serves any purpose. The
comment says it is supposed to collate multiple control flow edges into
one place, but the code as it is now does not achieve this. In fact, as
can be seen from the changes to lit tests, this block was not actually
outlined in the end. This is because there are actually two code
extractors: one in the callback for creating a parallel op which is used
to find what the input/output variables are (which does have this block
added to it), and another one which actually does the outlining (which
this block was not added to).

Tested with the gfortran and fujitsu test suites.

Fixes #112884
2025-03-07 11:02:52 +00:00
jeanPerier
40e245a9aa
[flang] add support for procedure pointer assignment inside FORALL (#130114)
Very similar to object pointer assignment, the difference is the SSA
types of the LHS (!fir.ref<!fir.boxproc<()->()>> and RHS
(!fir.boxproc<()->()).

The RHS must be saved as simple address, not descriptors (it is not
possible to make CFI descriptor out of procedure entity).
2025-03-07 10:28:02 +01:00
Kiran Kumar T P
c02019141c
[LLVM-FLANG] [OpenMP] [Taskloop] - Add test case with cancel construct inside taskloop (#129862)
Added a test case with cancel construct inside taskloop. Currently
taskloop lowering is not supported so below error is issued: "not yet
implemented: Taskloop construct"
Once the lowering patch is merged, todo error should be issued for
cancel construct. "not yet implemented: OpenMPCancelConstruct"
2025-03-07 11:15:22 +05:30
Kareem Ergawy
9543e9e927
[flang][OpenMP] Handle pre-detemined lastprivate for simd (#129507)
This PR tries to fix `lastprivate` update issues in composite
constructs. In particular, pre-determined `lastprivate` symbols are
attached to the wrong leaf of the composite construct (the outermost
one). When using delayed privatization (should be the default mode in
the future), this results in trying to update the `lastprivate` symbol
in the wrong construct (outside the `omp.loop_nest` op).

For example, given the following input:
```fortran
!$omp target teams distribute parallel do simd collapse(2) private(y_max)
  do i=x_min,x_max
    do j=y_min,y_max
    enddo
  enddo
```

Without the fixes introduced in this PR, the `DataSharingProcessor`
tries to generate the `lastprivate` update ops in the `parallel` op
since this is the op for which the DSP instance is created.

The fix consists of 2 main parts:
1. Instead of creating a single DSP instance, one instance is created
for the leaf constructs that might need privatization (whether for
explicit, implicit, or pre-determined symbols).
2. When generating the `lastprivate` comparison ops, we don't directly
use the SSA values of the UBs and steps. Instead, we regenerated these
SSA values from the original loop bounds' expressions. We have to do
this to avoid using `host_eval` values in the `lastprivate` comparison
logic which is illegal.
2025-03-07 05:44:39 +01:00
Thirumalai Shaktivel
e15545cad8
[Flang][OpenMP] Allow copyprivate and nowait on the directive clauses (#127769)
Issue:
- Single construct used to throw a semantic error for copyprivate and
  nowait clause when used in the single directive.
- Also, the copyprivate with nowait restriction has been removed from
  OpenMP 6.0

Fix:
- Allow copyprivate and nowait on both single and end single directive
- Allow at most one nowait clause
- Throw a warning when the same list item is used in the copyprivate clause
  on the end single directive

From Reference guide (OpenMP 5.2, 2.10.2):
```
!$omp single [clause[ [,]clause] ... ]
loosely-structured-block
!$omp end single [end-clause[ [,]end-clause] ...]

clause:
  copyprivate (list)
  nowait
  [...]

end-clause:
  copyprivate (list)
  nowait
```

Towards: https://github.com/llvm/llvm-project/issues/110008
2025-03-07 09:24:32 +05:30
Valentin Clement (バレンタイン クレメン)
478e516140
[flang][cuda] Sync double descriptor after c_f_pointer call (#130194)
After a global device pointer is set through `c_f_pointer`, we need to
sync the double descriptor so the version on the device is also up to
date.
2025-03-06 19:19:51 -08:00
Kelvin Li
996092d5a5
[flang] probably convert Fortran logical to i1 in expanding hlfir.maxloc/hlfir.minloc opcodes (#129791)
If mask is a scalar, it always converts to !fir.box<!fir.array<1xi1>>.
The wrong value may be picked up when passing to the function
on the big endian platform. This patch is to do the conversion 
based on the original type of the mask and convert the value to 
i1 after the load.
2025-03-06 15:47:44 -05:00
Valentin Clement (バレンタイン クレメン)
c8898b09f9
[flang][rt] Use allocator registry to allocate the pointer payload (#129992)
pointer allocation is done through `AllocateValidatedPointerPayload`.
This function was not updated to use the registered allocators in the
descriptor to perform the allocation. This patch makes use of the
allocator.
The footer word is not set and not checked for allocator other than the
default one. The support will likely come in a follow up patch but this
will necessitate more functions to be registered to be able to set and
get the footer value when the allocation in on the device.
2025-03-06 08:47:27 -08:00
Kiran Chandramohan
e2911aa2c2
[Flang][OpenMP] Fix crash when loop index var is pointer or allocatable (#129717)
Use hlfir dereferencing for pointers and allocatables and use hlfir
assign. Also, change the code updating IV in lastprivate.

Note: This is a small change. Modifications in existing tests are
changes from fir.store to hlfir.assign.

Fixes #121290
2025-03-06 12:19:34 +00:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
Matthias Springer
a6151f4e23
[mlir][IR] Move match and rewrite functions into separate class (#129861)
The vast majority of rewrite / conversion patterns uses a combined
`matchAndRewrite` instead of separate `match` and `rewrite` functions.

This PR optimizes the code base for the most common case where users
implement a combined `matchAndRewrite`. There are no longer any `match`
and `rewrite` functions in `RewritePattern`, `ConversionPattern` and
their derived classes. Instead, there is a `SplitMatchAndRewriteImpl`
class that implements `matchAndRewrite` in terms of `match` and
`rewrite`.

Details:
* The `RewritePattern` and `ConversionPattern` classes are simpler
(fewer functions). Especially the `ConversionPattern` class, which now
has 5 fewer functions. (There were various `rewrite` overloads to
account for 1:1 / 1:N patterns.)
* There is a new class `SplitMatchAndRewriteImpl` that derives from
`RewritePattern` / `OpRewritePatern` / ..., along with a type alias
`RewritePattern::SplitMatchAndRewrite` for convenience.
* Fewer `llvm_unreachable` are needed throughout the code base. Instead,
we can use pure virtual functions. (In cases where users previously had
to implement `rewrite` or `matchAndRewrite`, etc.)
* This PR may also improve the number of [`-Woverload-virtual`
warnings](https://discourse.llvm.org/t/matchandrewrite-hiding-virtual-functions/84933)
that are produced by GCC. (To be confirmed...)

Note for LLVM integration: Patterns with separate `match` / `rewrite`
implementations, must derive from `X::SplitMatchAndRewrite` instead of
`X`.

---------

Co-authored-by: River Riddle <riddleriver@gmail.com>
2025-03-06 08:48:51 +01:00
Valentin Clement (バレンタイン クレメン)
2130285564
[flang][cuda] Make sure allocator id is set for pointer allocate (#129950) 2025-03-05 17:29:09 -08:00
Zhen Wang
d1abbb4dc5
[flang][cuda] Change induction variable from i32 to index for doconcurrent inside cuf kernel directive (#129924)
Use `index` instead of `i32` for induction variables for doconcurrent
inside cuf kernel directive. Regular do loop inside cuf kernel directive
also uses `index`:
```
cuf.kernel<<<*, *>>> (%arg0 : index) = ...
```
2025-03-05 14:50:42 -08:00
Krzysztof Parzyszek
44c6a23789
[flang][OpenMP][AMDGPU] Allow REAL(10) to compile on AMDGPU (#129742)
This will allow the following code to compile
```
program p
  real(10) :: x
  !$omp target
    continue
  !$omp end target
end
```
2025-03-05 14:23:59 -06:00
Mats Petersson
9925359fee
[flang][llvm][openmp]Add Initializer clause to OMP.td (#129540)
Then use this in the Flang compiler for parsing the OpenMP declare
reduction.

This has no real functional change to the existing code, it's only
moving the declaration itself around.

A few tests has been updated, to reflect the new type names.
2025-03-05 15:41:24 +00:00
NimishMishra
0ae1f0a310
[flang] Rely on global initialization for simpler derived types (#114002)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
2025-03-05 05:44:51 -08:00
jeanPerier
7302e1b94e
[flang] implement simple pointer assignments inside FORALL (#129522)
The semantic of pointer assignments inside FORALL requires evaluating
the targets (RHS) and pointer variables (LHS) of all iterations before
evaluating the assignments.

In practice, if the compiler can prove that the RHS and LHS evaluations
are not impacted by the assignments, the evaluation of the FORALL
assignment statement can be done in a single loop. However, if the
compiler cannot prove this, it needs to "save" the addresses of the
targets and/or the pointer descriptors of each iterations before doing
the assignments.

This patch implements the most common cases where there is no lower bound
spec, no bounds remapping, the LHS is not polymorphic, and the RHS is
not NULL.

The HLFIR operation used to represent assignments inside FORALL can be
used for pointer assignments to (the only difference being that the LHS
is a descriptor address).

The analysis for intrinsic assignment can be reused, with the
distinction that the RHS data is not read during the assignment.

The logic that is used to save LHS in intrinsic assignments inside
FORALL is extracted to be used for the RHS of pointer assignments when
needed (saving a descriptor value).
Pointer assignment LHS are just descriptor addresses and are saved as
int_ptr values.
2025-03-05 11:24:04 +01:00
Iñaki Amatria Barral
6eefadd8ef
[flang][Semantics] Ensure deterministic mod file output (#129669)
This PR is a follow-up to #128655.

It adds another test to ensure deterministic ordering in `.mod` files
and includes related changes to prevent non-deterministic ordering
caused by iterating over a set ordered by heap pointers. This issue is
particularly noticeable when using Flang as a library and compiling the
same files multiple times.

The reduced test case is as minimal as possible. We were unable to
reproduce the issue with a smaller set of files.
2025-03-05 08:27:17 +01:00
Eugene Epshteyn
ab6cc6b7b3
[flang] Allow nested scopes for implied DO loops with DATA statements (#129410)
Previously, nested scopes for implied DO loops with DATA statements were
disallowed, which meant that the following code couldn't compile due to
re-use of `j` loop variable name:
    
    DATA (a(i),(b(i,j),j=1,3),(c(i,j),j=1,3),i=0,4)/
    
This change allows nested scopes implied DO loops, which allows the code
above to compile.

Tests modified to in accordance with this change:
Semantics/resolve40.f90, Semantics/symbol09.f90
2025-03-04 20:41:01 -05:00
jeanPerier
9a659fac2f
[flang] fix MAXVAL(x%array_comp_with_custom_lower_bounds) (#129684)
The HLFIR inlining of MAXVAL kicks in at O1 and more when the argument
is an array component reference but the implementation did not account
for the rare cases where the array components have non default lower
bounds.

This patch fixes the issue by using `getElementAt` to compute the
element address.
Rename `indices` to `oneBasedIndices` for more clarity.
2025-03-04 17:52:05 +01:00
Abid Qadeer
e27b8b2eda
[flang][debug] Improve handling of cyclic derived types with classes. (#129588)
While checking if a type should be cached or not, we use
`getDerivedType` to peel outer layers and get to the base type. This
function did not peel the `fir.class` which caused the algorithm to
fail.

Fixes #128606.
2025-03-04 10:27:24 +00:00
Slava Zakharin
f57756a640
[flang-rt] Use RT_API_ATTRS for ErfcScaled. (#129598)
As long as it is a host-only function, it cannot be referenced
by the flang-rt's ErfcScaled entry points. With the markup in place,
it is compiling properly by a CUDA compiler.
2025-03-03 17:10:50 -08:00
Peter Klausler
f6e83664e0
[flang] Improve two coarray error messages (#129597)
Two messages that complain about local variables mention that they don't
have the SAVE attribute; in both cases, it would be okay if they were
ALLOCATABLE instead. Clarify the messages.
2025-03-03 14:47:02 -08:00
Peter Klausler
79a25e11fe
[flang] Further work on NULL(MOLD=allocatable) (#129345)
Refine handling of NULL(...) in semantics to properly distinguish
NULL(), NULL(objectPointer), NULL(procPointer), and NULL(allocatable)
from each other in relevant contexts.

Add IsNullAllocatable() and IsNullPointerOrAllocatable() utility
functions. IsNullAllocatable() is true only for NULL(allocatable); it is
false for a bare NULL(), which can be detected independently with
IsBareNullPointer().

IsNullPointer() now returns false for NULL(allocatable).

ALLOCATED(NULL(allocatable)) now works, and folds to .FALSE.

These utilities were modified to accept const pointer arguments rather
than const references; I usually prefer this style when the result
should clearly be false for a null argument (in the C sense), and it
helped me find all of their use sites in the code.
2025-03-03 14:46:35 -08:00
Peter Klausler
b2ba43a9c1
[flang] Refine checking of type-bound generics (#129292)
I merged a patch yesterday
(https://github.com/llvm/llvm-project/pull/128980) that strengthened
error detection of indistinguishable specific procedures in a type-bound
generic procedure, and broke a couple of tests. Refine the check so that
it doesn't flag valid cases of overridden bindings, and add a thorough
test with all of the boundary cases that I can think of.
2025-03-03 14:46:08 -08:00
Krzysztof Parzyszek
8f971ca1d9
[flang] Move DumpEvaluateExpr from Lower to Semantics (#128723)
Since evaluate::Expr can show up in the parse tree in the semantic
analysis step, make it possible to dump its structure in the Semantics
module.

The Lower module depends on Semantics, so the code is still accessible
in it.
2025-03-03 15:38:42 -06:00
Jean-Didier PAILLEUX
a9b2e31fb0
[flang] Define CO_REDUCE intrinsic procedure (#125115)
Define the intrinsic `CO_REDUCE` and add semantic checks.
A test was already present but was at `XFAIL`. It has been modified to
take new messages into the output.
2025-03-03 20:50:02 +01:00
Kelvin Li
83f8721201
[flang] handle passing bind(c) derived type by value for ppc64le and powerpc64-aix (#128780) 2025-03-03 14:43:43 -05:00
Slava Zakharin
a704e6587b
[flang] Added alternative inlining code for hlfir.cshift. (#129176)
Flang generates slower code for `CSHIFT(CSHIFT(PTR(:,:,I),sh1,1),sh2,2)`
pattern in facerec than other compilers. The first CSHIFT can be done
as two memcpy's wrapped in a loop for the second dimension.
This does require creating a temporary array, but it seems to be faster,
than the current hlfir.elemental inlining.

I started with modifying the new index computation in
hlfir.elemental inlining: the new arith.select approach does enable
some vectorization in LLVM, but on x86 it is using gathers/scatters
and does not give much speed-up.

I also experimented with LoopBoundSplitPass
and InductiveRangeCheckElimination for a simple (not chained) CSHIFT
case, but I could not adjust them to split the loop with a condition
on the value of the IV into two loops with disjoint iteration spaces.
I thought if I could do it, I would be able to keep the hlfir.elemental
inlining mostly untouched, and then adjust the hlfir.elemental inlining
heuristics for the facerec case.

Since I was not able to make these pass work for me, I added a special
case inlining for CSHIFT(ARRAY,SH,DIM=1) via hlfir.eval_in_mem.
If ARRAY is not statically known to have the contiguous leading
dimension, there is a dynamic check for contiguity, which allows
exposing it to LLVM and enabling the rewrite of the copy loops
into memcpys. This approach is stepping on the toes of LoopVersioning,
but it is helpful in facerec case.

I measured ~6% speed-up on grace, and ~4% on zen4.
2025-03-03 09:58:20 -08:00
Slava Zakharin
0735cece68
[flang] Fixed fir.coordinate_of access in AddDebugInfo. (#129423)
The issue came up after #127231, when fir.coordinate_of, fed into
a declare, only has the field attribute and no coordinates.
2025-03-03 07:53:17 -08:00
Krzysztof Parzyszek
9573c62114
[flang][OpenMP] Accept modern syntax of FLUSH construct (#128975)
The syntax with the object list following the memory-order clause has
been removed in OpenMP 5.2. Still, accept that syntax with versions >=
5.2, but treat it as deprecated (and emit a warning).
2025-03-03 07:59:19 -06:00
Mats Petersson
50301052e9
[flang][OpenMP]Support for subroutine call for DECLARE REDUCTION init (#127889)
The DECLARE REDUCTION allows the initialization part to be either an
expression or a call to a subroutine.

This modifies the parsing and semantic analysis to allow the use of the
subroutine, in addition to the simple expression that was already
supported.

New tests in parser and semantics sections check that the generated
structure is as expected.

DECLARE REDUCTION lowering is not yet implemented, so will end in a
TODO. A new test with an init subroutine is added, that checks that this
variant also ends with a "Not yet implemented" message.
2025-03-03 13:49:51 +00:00
Jean-Didier PAILLEUX
370d34fe40
[flang][Driver] Add support of -fd-lines-as-comments and -fd-lines-as-code flags (#127605)
`-fd-lines-as-code` and `-fd-lines-as-comments` enables treatment for
lines beginning with `d` or `D` in fixed form sources.
Using these options in free form has no effect.
If the `-fd-lines-as-code` option is given they are treated as if the
first column contained a blank.
If the `-fd-lines-as-comments` option is given, they are treated as
comment lines.
2025-03-03 11:55:36 +00:00
jeanPerier
9805854699
[flang][NFC] clean-up fir.field_index legacy usages in tests (#129219)
After #127231, fir.coordinate_of should directly carry the field.

I updated the lowering and codegen tests in #12731, but not the FIR to
FIR tests, which is what this patch is cleaning up.
2025-03-03 10:01:54 +01:00
Valentin Clement (バレンタイン クレメン)
d1fd3698a9
[flang][cuda] Allow unsupported data transfer to be done on the host (#129160)
Some data transfer marked as unsupported can actually be deferred to an
assignment on the host when the variables involved are unified or
managed.
2025-03-02 16:12:01 -08:00
klensy
62f15a042b
[flang][test] Fix filecheck annotation typos [2/n] (#126099)
Few more fixes, previous: #92387

Co-authored-by: klensy <nightouser@gmail.com>
2025-02-28 10:04:16 +00:00
jeanPerier
a8db1fb9b5
[flang] update fir.coordinate_of to carry the fields (#127231)
This patch updates fir.coordinate_op to carry the field index as
attributes instead of relying on getting it from the fir.field_index
operations defining its operands.

The rational is that FIR currently has a few operations that require
DAGs to be preserved in order to be able to do code generation. This is
the case of fir.coordinate_op, which requires its fir.field operand
producer to be visible.
This makes IR transformation harder/brittle, so I want to update FIR to
get rid if this.

Codegen/printer/parser of fir.coordinate_of and many tests need to be
updated after this change.
2025-02-28 09:50:05 +01:00
Kareem Ergawy
e0c690990d
[flang][OpenMP] Add reduction clause support to loop directive (#128849)
Extends `loop` directive transformation by adding support for the
`reduction` clause.
2025-02-28 05:46:03 +01:00
KAWASHIMA Takahiro
0e56f6dc3e
[flang][docs][NFC] Fix Markdown /*comments*/ (#129018)
`*` in `/*comments*/` were interpreted as emphasis marks and were not
displayed in https://flang.llvm.org/docs/Extensions.html.
2025-02-28 10:18:37 +09:00
Peter Klausler
51dc52631c
[flang] Catch more defined I/O conflicts (#129115)
The code that checks for conflicts between type-bound defined I/O
generic procedures and non-type-bound defined I/O interfaces only works
when then procedures are defined in the same module as subroutines. It
doesn't catch conflicts when either are external procedures, procedure
pointers, dummy procedures, &c. Extend the checking to cover those cases
as well.

Fixes https://github.com/llvm/llvm-project/issues/128752.
2025-02-27 16:16:34 -08:00
Kazu Hirata
44c6616a4a [flang] Fix a warning
This patch fixes:

  flang/lib/Semantics/check-declarations.cpp:2009:15: error: unused
  variable 'kind' [-Werror,-Wunused-variable]
2025-02-27 14:48:13 -08:00
Peter Klausler
cbef629838
[flang] Catch type-bound generic with inherited indistinguishable spe… (#128980)
…cific

When checking generic procedures for indistinguishable specific
procedures, don't neglect to include specific procedures from any
accessible instance of the generic procedure inherited from its parent
type..

Fixes https://github.com/llvm/llvm-project/issues/128760.
2025-02-27 14:33:11 -08:00
Peter Klausler
c6dd9f4278
[flang] Catch usage of : and * lengths in array c'tors (#128974)
The definition of an array constructor doesn't preclude the use of
[character(:)::] or [character(*)::] directly, but there is language
elsewhere in the standard that restricts their use to specific contexts,
neither of which include explicitly typed array constructors.

Fixes https://github.com/llvm/llvm-project/issues/128755.
2025-02-27 14:32:50 -08:00