That is another problem uncovered during hlfir.reshape inlining,
where the shape bits could be any integer type.
This patch adds explicit convertions to `index` type where needed.
hlfir.reshape inlining uncovered an existing bug that non-index shapes
result in failures during hlfir.get_extent conversion to FIR.
I could have "fixed" the shape during hlfir.reshape inlining,
but this seems like a better fix.
If -funroll-loops tests are not restricted to specific targets the tests
may behave differently based on the host platform. This patch restricts
the tests to aarch64 and x86_64, and removes the PowerPC XFAIL.
This PR adds debug support for common block in flang. As variable which
are part of a common block don't have a special marker to recognize
them, we use the following check to find them.
%0 = fir.address_of(@a)
%1 = fir.convert %0
%2 = fir.coordinate_of %1, %c0
%3 = fir.convert %2
%4 = fircg.ext_declare %3
If the memref of a fircg.ext_declare points to a fir.coordinate_of and
that in turn points to an fir.address_of (ignoring immediate
fir.convert) then we assume that it is a common block variable. The
fir.address_of gives us the global symbol which is the storage for
common block and fir.coordinate_of provides the offset in this storage.
The debug hierarchy looks like as
subroutine f3
integer :: x, y
common /a/ x, y
end subroutine
@a_ = global { ... } { ... }, !dbg !26, !dbg !28!23 = !DISubprogram(name: "f3"...)
!24 = !DICommonBlock(scope: !23, name: "a", ...)
!25 = !DIGlobalVariable(name: "x", scope: !24 ...)
!26 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression())
!27 = !DIGlobalVariable(name: "y", scope: !24 ...)
!28 = !DIGlobalVariableExpression(var: !27, expr:
!DIExpression(DW_OP_plus_uconst, 4))
This required following changes:
1. Instead of using DIGlobalVariableAttr in the FusedLoc of GlobalOp, we
use DIGlobalVariableExpressionAttr. This allows us the generate the
DIExpression where we have the information.
2. Previously, only one DIGlobalVariableExpressionAttr could be linked
to one global op. I recently removed this restriction in mlir. To make
use of it, we add an ArrayAttr to the FusedLoc of a GlobalOp. This
allows us to pass multiple DIGlobalVariableExpressionAttr.
3. I was depending on the name of global for the name of the common
block. The name gets a '_' appended. I could not find a utility function
in flang to remove it so I have to brute force it.
This is take two of #70976. This iteration of the patch makes sure that
custom
diagnostics without any warning group don't get promoted by `-Werror` or
`-Wfatal-errors`.
This implements parts of the extension proposed in
https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7.
Specifically, this makes it possible to specify a diagnostic group in an
optional third argument.
Follow-up PR to fix the failure caused here:
https://github.com/llvm/llvm-project/pull/121028
Failure:
https://lab.llvm.org/buildbot/#/builders/89/builds/14474
Problems:
- Cray pointee cannot be used in the DSA list (If used results in segmentation fault)
- Cray pointer has to be in the DSA list when Cray pointee is used in the default (none) region
Fix: Added required semantic checks along the tests
Reference from the documentation (OpenMP 5.0: 2.19.1):
- Cray pointees have the same data-sharing attribute as the storage with
which their Cray pointers are associated.
Lower Fortran RESHAPE intrinsic into hlfir.reshape, and then lower
hlfir.reshape into a runtime call.
A later patch will add hlfir.reshape inlining as hlfir.elemental.
When constructing the characteristics of a particular reference to an
intrinsic procedure that was passed a non-coindexed reference to local
coarray data as an actual argument, don't add the corank of the actual
argument to those characteristics.
Also clean up the TypeAndShape characteristics class a little; the
Attr::Coarray is redundant since the corank() accessor can be used to
the same effect.
Since https://github.com/llvm/llvm-project/pull/122770, we have seen
that compile time have become extremely slow for cyclic derived types.
In #122770, we made the criteria to cache a derived type very strict. As
a result, some types which are safe to cache were also being
re-generated every type they were required. This increased the compile
time and also the size of the debug info.
Please see the description of PR# 122770. We decided that when
processing `t1`, the type generated for `t2` and `t3` were not safe to
cached. But our algorithm also denied caching to `t1` which as top level
type was safe.
```
type t1
type(t2), pointer :: p1
end type
type t2
type(t3), pointer :: p2
end type
type t3
type(t1), pointer :: p3
end type
```
I have tinkered the check a bit so that top level type is always cached.
To detect a top level type, we use a depth counter that get incremented
before call to `convertRecordType` and decremented after it returns.
After this change, the following
[file](https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90)
from Fujitsu get compiled around 40s which is same as it was before
#122770.
The smaller testcase present in issue #124049 takes less than half a
second. I also added check to make sure that duplicate entries of the
`DICompositeType` are not present in the IR.
Fixes#124049 and #123960.
A complex literal constant can have one BOZ component, since the type
and value of the literal can be determined by converting the BOZ value
to the type of the other component. But a complex literal constant with
two BOZ components doesn't have a well-defined type. The error message
was confusing in the case; emit a better one.
Fixes https://github.com/llvm/llvm-project/issues/124201.
GetShape() needed to be called with a FoldingContext in order to
properly construct an extent expression for the shape of an array
constructor whose elements (nested in an implied DO loop) were not
scalars.
Fixes https://github.com/llvm/llvm-project/issues/124191.
When ASYNCHRONOUS='NO' appears in a data transfer statement control item
list, don't crash if it isn't appropriate for the kind of I/O under way
(such as child I/O).
Fixes https://github.com/llvm/llvm-project/issues/124135.
When an allocatable or pointer was being associated as a storage
sequence with a dummy argument, the checks were using the actual storage
size of the allocatable or pointer's descriptor, not the size of the
storage that it references.
Fixes https://github.com/llvm/llvm-project/issues/123807.
The compiler was interpreting 'Q' as an exponent letter in a literal
real constant as meaning real(kind=10) on x86-64, which is the legacy
80387 80-bit extended precision floating-point type. It turns out that
'Q' means kind=16 with all other compilers, even for x86-64 targets.
Change to conform.
When a character array constructor does not have an explicit type with a
constant length, the compiler can still fold it if all of its elements
are constants. These array constructors will have been wrapped up in the
internal %SET_LENGTH operation, which will determine the final length of
the folded value, so use the maximum length of the constant elements as
the length of the folded array constructor.
Fixes https://github.com/llvm/llvm-project/issues/123766.
An assertion in module file generation didn't allow for a case that has
arisen in a test; remove it, extend commentary, and add a regression
test.
Fixes https://github.com/llvm/llvm-project/issues/123534.
Catch and report multiple initializations of the same procedure pointer
rather than assuming that control wouldn't reach a given point in name
resolution in that case.
Fixes https://github.com/llvm/llvm-project/issues/123538.
The check for a coarray actual argument being passed to a procedure with
an implicit interface was incorrect, yielding false positives for
coindexed objects. Fix.
It is valid to jump to a CHANGE TEAM statement from anywhere in the
containing executable part, and valid to jump to an END TEAM statement
from within the construct.
When a character component reference is applied to a constant array of
derived type, ensure that the length of the resulting character array is
properly defined.
Fixes https://github.com/llvm/llvm-project/issues/123362.
The event variable in an EVENT POST/WAIT statement can be a coarray
reference, and need not be an entire coarray.
Variables and potential subobject components with EVENT_TYPE/LOCK_TYPE
must be coarrays, unless they are potential subobjects nested within
coarrays or pointers.
This one appears to have been omitted when other ATOMIC_xxx intrinsic
procedures were defined. There's already tests for it, but they
apparently work even when ATOMIC_ADD must be interpreted as an external
procedure with an implicit interface. Extend the tests with INTRINSIC
NONE(EXTERNAL, TYPE) statements to ensure that they require the
intrinsic interpretation.
This removes mentions of `target` from the generic `loop` rewrite pass
since there is not need for it anyway. It is enough to detect `loop`'s
nesting within `teams` or `parallel` directives.
Extends rewriting of `loop` directives by supporting `bind` clause for
standalone directives. This follows both the spec and the current state
of clang as follows:
* No `bind` or `bind(thread)`: the `loop` is rewritten to `simd`.
* `bind(parallel)`: the `loop` is rewritten to `do`.
* `bind(teams)`: the `loop` is rewritten to `distribute`.
This is a follow-up PR for
https://github.com/llvm/llvm-project/pull/122632, only the latest commit
in this PR is relevant to the PR.
If -funroll-loops tests are not restricted to specific targets the tests
may behave differently based on the host platform. This patch restricts
the tests to aarch64 and x86_64, and removes the PowerPC XFAIL.
This allows the Flang parser to accept the !$OMP DISPATCH and related
clauses.
Lowering is currently not implemented. Tests for unparse and parse-tree
dump is provided, and one for checking that the lowering ends in a "not
yet implemented"
---------
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
When a field in a derived type is `c_devptr`, keep check if we can do a
memcpy instead of falling back to the runtime assignment.
Many internal CUDA Fortran derived type have a `c_devptr` field and this
would lead to stack overflow on the device if the assignment is
performed by the runtime function.
The backtrace may at least print the backtrace name in the call stack,
but this does not happen with the release builds of the runtime.
Surprisingly, specifying "no-omit-frame-pointer" did not work
with GCC, so I decided to fall back to -O0 for these functions.
It looks like in most cases we still don't make calls to deallocate
allocatable members of derived types which have been privatized.
This is just intended to add a test for the one case where we do, to
make sure this doesn't regress with my upcoming changes.