This patch makes the `map_type` and `map_capture_type` arguments of the
`omp.map.info` operation required, which was already an invariant being
verified by its users via `verifyMapClause()`. This makes it clearer, as
getters no longer return misleading `std::optional` values.
Checks for the `mapper_id` argument are moved to a verifier for the
operation, rather than being checked by users.
Functionally NFC, but not marked as such due to a reordering of
arguments in the assembly format of `omp.map.info`.
After #127231, fir.coordinate_of should directly carry the field.
I updated the lowering and codegen tests in #12731, but not the FIR to
FIR tests, which is what this patch is cleaning up.
Currently the alias analysis doesn't trace the source whenever there are
operations from fir::cg dialect. This PR added support for
fir::cg::XEmboxOp, fir::cg::XReboxOp, fir::cg::XDeclareOp for a specific
application i'm working on.
For example, determine that the address in `obj%p` below cannot alias
the address of `v`:
```
module m
type :: ty
real, pointer :: p
end type ty
end module m
subroutine test()
use m
real, target :: t
real :: v
type(ty) :: obj
obj%p => t
v = obj%p
end subroutine test
```
For example, determine that the address in p below cannot alias the
address of v:
```
subroutine test()
real, pointer :: p
real, target :: t
real :: v
p => t
v = p
end subroutine test
```
Typically, we do not track memory sources after a load because of the
dynamic nature of the load and the fact that the alias analysis is a
simple static analysis.
However, the code is written in a way that makes it seem like we are
continuing to track memory but in reality we are only doing so when we
know that the tracked memory is a leaf and therefore when there will
only be one more iteration through the switch statement. In other words,
we are iterating one more time, to gather data about a box, anticipating
that this will be the last time. This is a hack that helped avoid
cut-and-paste from other case statements but gives the wrong impression
about the intention of the code and makes it confusing.
To make it clear that there is no more tracking, we gather all the
necessary data from the memref of the load, in the case statement for
the load, and exit the loop. I am also limiting this data gathering for
the case when we load a box reference while we were actually following
data, as tests have shows, is the only case when we need it for. Other
cases will be handled conservatively, but this can change in the future,
on a case-by-case basis.
---------
Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
The intention of this work is to give MLIR->LLVMIR conversion freedom to
control how the private variable is allocated so that it can be
allocated on the stack in ordinary cases or as part of a structure used
to give closure context for tasks which might outlive the current stack
frame. See RFC:
https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084
For example, a privatizer for an integer used to look like
```mlir
omp.private {type = private} @x.privatizer : !fir.ref<i32> alloc {
^bb0(%arg0: !fir.ref<i32>):
%0 = ... allocate proper memory for the private clone ...
omp.yield(%0 : !fir.ref<i32>)
}
```
After this change, allocation become implicit in the operation:
```mlir
omp.private {type = private} @x.privatizer : i32
```
For more complex types that require initialization after allocation, an
init region can be used:
``` mlir
omp.private {type = private} @x.privatizer : !some.type init {
^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>):
// initialize %arg1, using %arg0 as a mold for allocations
omp.yield(%arg1 : !some.pointer<!some.type>)
} dealloc {
^bb0(%arg0: !some.pointer<!some.type>):
... deallocate memory allocated by the init region ...
omp.yield
}
```
This patch lays the groundwork for delayed task execution but is not
enough on its own.
After this patch all gfortran tests which previously passed still pass.
There
are the following changes to the Fujitsu test suite:
- 0380_0009 and 0435_0009 are fixed
- 0688_0041 now fails at runtime. This patch is testing firstprivate
variables with tasks. Previously we got lucky with the undefined
behavior and won the race. After these changes we no longer get lucky.
This patch lays the groundwork for a proper fix for this issue.
In flang the lowering re-uses the existing lowering used for reduction
init and dealloc regions.
In flang, before this patch we hit a TODO with the same wording when
generating the copy region for firstprivate polymorphic variables. After
this patch the box-like fir.class is passed by reference into the copy
region, leading to a different path that didn't hit that old TODO but
the generated code still didn't work so I added a new TODO in
DataSharingProcessor.
Runtime function call to a void function are producing a ssa value
because the FunctionType result is set to NoneType with is later
translated to a empty struct. This is not an issue when going to LLVM IR
but it breaks when lowering a gpu module to PTX. This patch update the
RTModel to correctly set the FunctionType result type to nothing.
This is one runtime call before this patch at the LLVM IR dialect step.
```
%45 = llvm.call @_FortranAAssign(%arg0, %1, %44, %4) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
```
After the patch the call would be correctly formed
```
llvm.call @_FortranAAssign(%arg0, %1, %44, %4) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> ()
```
Without the patch it would lead to error like:
```
ptxas /tmp/mlir-cuda_device_mod-nvptx64-nvidia-cuda-sm_60-e804b6.ptx, line 10; error : Output parameter cannot be an incomplete array.
ptxas /tmp/mlir-cuda_device_mod-nvptx64-nvidia-cuda-sm_60-e804b6.ptx, line 125; error : Call has wrong number of parameters
```
The change is pretty much mechanical.
Flang alias analysis crashes for omp private allocatable item. The issue
is described here : https://github.com/llvm/llvm-project/issues/116954 .
We know that private value can't alias with anything else unless it is POINTER
or TARGET. That's why we can simplify alias analysis logic.
fir.call side effects are hard to describe in a useful way using
`MemoryEffectOpInterface` because it is impossible to list which memory
location a user procedure read/write without doing a data flow analysis
of its body (even PURE procedures may read from any module variable,
Fortran SIMPLE procedure from F2023 will allow that, but they are far
from common at that point).
Fortran language specifications allow the compiler to deduce
that a procedure call cannot access a variable in many cases
This patch leverages this to extend `fir::AliasAnalysis::getModRef` to
deal with fir.call.
This will allow implementing "array = array_function()" optimization in
a future patch.
Enable alias analysis for omp private clause for code:
```
program main
integer :: arrayA(10,10)
integer :: tmp(2)
integer :: i,j
!$omp target teams distribute parallel do private(tmp)
do j = 1, 10
do i = 1,10
tmp = [i,j]
arrayA = tmp(1)
end do
end do
end program main
```
This PR is based on: https://github.com/llvm/llvm-project/pull/113566
and it contains fix for Fujitsu test suite. Previous PR introduced
regression in Fujitsu test suite. For some Fujitsu test cases
`omp.yield` operation points to block argument. Alias analysis for such
MLIR code will be added in separate PR.
Enable alias analysis for omp private clause for code:
```
program main
integer :: arrayA(10,10)
integer :: tmp(2)
integer :: i,j
!$omp target teams distribute parallel do private(tmp)
do j = 1, 10
do i = 1,10
tmp = [i,j]
arrayA = tmp(1)
end do
end do
end program main
```
This PR applies the changes discussed in [[RFC] Rationale for Flang
AliasAnalysis pointer component
logic](https://discourse.llvm.org/t/rfc-rationale-for-flang-aliasanalysis-pointer-component-logic/79252).
In summary, this PR replaces the existing pointer component logic in
Flang's AliasAnalysis implementation. That logic focuses on aliasing
between pointers and non-pointer, non-target composites that have
pointer components. However, it is more conservative than necessary, and
some existing tests expect its current results when less conservative
results seem reasonable.
This PR splits the logic into two cases:
1. Source values are the same: Return MayAlias when one value is the
address of a composite, and the other value is statically the address of
a pointer component of that composite.
2. Source values are different: Return MayAlias when one value is the
address of a composite (actual argument), and the other value is the
address of a pointer (dummy arg) that might dynamically be a component
of that composite.
In both cases, the actual implementation is still more conservative than
described above, but it can be improved further later. Details appear in
the comments.
Additionally, this PR revises the logic that reports MayAlias for a
pointer/target vs. another pointer/target. It constrains the existing
logic to handle only isData cases, and it adds less conservative
handling of !isData cases elsewhere. First, it extends case 2 listed
above to cover the case where the actual argument is the address of a
pointer rather than a composite. Second, it adds a third case: where
target attributes enable aliasing with a dummy argument.
This patch extends the alias analysis for temporary arrays in Flang.
With this extension, Flang can now determine that the temporary array
[a, b, c] does not alias with arrayD in Fortran code:
```
integer :: a, b, c
integer :: arrayD(3)
arrayD = [ a, b, c ]
```
At present, alias analysis does not work for operations inside OMP
target regions because the FIR declare operations within OMP target do
not offer sufficient information for alias analysis. Consequently, it is
necessary to examine the FIR code outside the OMP target region.
Ahead of https://github.com/llvm/llvm-project/pull/94242 and as
requested in the technical call, I am adding a couple of tests for
pointer components that I would like to make sure are covered.
This PR is an implementation for changes proposed in
https://discourse.llvm.org/t/rfc-distinguish-between-data-and-non-data-in-fir-alias-analysis/78759
Test updates were made when the query was on the wrong reference. So, it
is my hope that this will clear ambiguity on the nature of the queries
from here on.
There are also some TODOs that were addressed.
It also partly implements what
https://github.com/llvm/llvm-project/pull/87723 is attempting to
accomplish. At least, on a point-to-point query between references, the
distinction is made. To apply it to TBAA, would be another PR.
Note that, the changes were minimal in the TBAA code to retain the
current results.
After PR#68727 the source for both the fir.box_addr and a box became the
same. Thus the detection that only one of the sources was direct and the
special logic around it was being skipped. As a result, the test
included would show a "MayAlias" result instead of a "NoAlias" result.
The changes are needed to get leslie3d same performance with HLFIR
as with FIR lowering. The two module allocatable variables cannot
alias, so the optimized bufferization should be able to elide
the temporary and inline the assignment loop.
Add support for fir.box_addr, fir.array_corr, fir.coordinate, fir.embox,
fir.rebox and fir.load.
1) Through the use of boolean `followBoxAddr` determine whether the
analysis should apply to the address of the box or the address wrapped
by the box.
2) Some asserts have been removed to allow for more SourceKinds though
the flow, in a particular SourceKind::Direct
3) getSource was a public method but the returned type (SourceKind) was
not public making it impossible to be called publicly
4) About 12 tests have been added to check for real Fortran scenarios
5) More tests will be added with HLFIR
6) A few TODOs have been identified and will need to be addressed in
follow-up patches. I felt that more changes would increase the
complexity of the patch.
This patch adds `host_assoc` attribute for operations that implement
FortranVariableInterface (e.g. `hlfir.declare`). The attribute is used
by the alias analysis to make better conclusions about memory overlap.
For example, a dummy argument of an inner subroutine and a host's
variable used inside the inner subroutine cannot refer to the same
object (if the dummy argument does not satisify exceptions in F2018
15.5.2.13).
This closes a performance gap between HLFIR optimization pipeline
and FIR ArrayValueCopy for Polyhedron/nf.
Add a rough alias analysis rule for hlfir.designate which just follows
the memref argument. This could be extended in the future to take into
account the indices or derived type fields accessed to spot for provably
non-overlapping cases. In the meantime, we need a flag to ensure we
never say "MustAlias" when following a value through a hlfir.designate
because the designate analysis is only approximate.
Differential Revision: https://reviews.llvm.org/D157718
These are experimental changes in Flang AA to provide
at least some means to disambiguate memory accesses in some
simple cases. This AA is still not used by any transformation,
so the LIT tests are the only way to trigger it currently.
I will further look into applying this AA within Flang
to address some of the known performance issues in the benchmarks.
Credits to @Renaud-K for the initial implementation.
Differential Revision: https://reviews.llvm.org/D141410