4745 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
841327db4e
[flang][cuda] Convert cuf.alloc for box to fir.alloca in device context (#102662)
In device context managed memory is not available so it makes no sense
to allocate the descriptor using it. Fall back to fir.alloca as it is
handled well in device code.
cuf.free is just dropped.
2024-08-09 13:41:51 -07:00
Valentin Clement (バレンタイン クレメン)
8c3b6bd0cb
[flang][cuda] Do not lower device variables in main program as globals (#102512)
Flang considers arrays in main program larger than 32 bytes having the
SAVE attribute and lowers them as globals. In CUDA Fortran, device
variables are not allowed to have the SAVE attribute and should be
allocated dynamically in the main program scope.
This patch updates lowering so CUDA Fortran device variables are not
considered with the SAVE attribute.
2024-08-08 13:12:59 -07:00
Peter Klausler
7ea78643fe
[flang] Improve error message output (#102324)
When a local character variable with non-constant length has an
initializer, it's an error in a couple of ways (SAVE variable with
unknown size, static initializer that isn't constant due to conversion
to an unknown length). The error that f18 reports is the latter, but the
message contains a formatted representation of the initialization
expression that exposes a non-Fortran %SET_LENGTH() operation. Print the
original expression in the message instead.
2024-08-08 11:08:48 -07:00
Peter Klausler
b949a6f5e3
[flang] Warn on useless IOMSG= (#102250)
An I/O statement with IOMSG= but neither ERR= nor IOSTAT= deserves a
warning to the effect that it's not useful.
2024-08-08 11:08:26 -07:00
Peter Klausler
245eb0a716
[flang] Catch structure constructor in its own type definition (#102241)
The check for a structure constructor to a forward-referenced derived
type wasn't tripping for constructors in the type definition itself. Set
the forward reference flag unconditionally at the beginning of name
resolution for the type definition.
2024-08-08 11:08:00 -07:00
Peter Klausler
25822dc392
[flang] Fix searches for polymorphic components (#102212)
FindPolymorphicAllocatableUltimateComponent needs to be
FindPolymorphicAllocatablePotentialComponent. The current search is
missing cases where a derived type has an allocatable component whose
type has a polymorphic allocatable component.
2024-08-08 11:07:40 -07:00
Peter Klausler
7c512cef61
[flang] Disallow references to some IEEE procedures in DO CONCURRENT (#102082)
There's a numbered constraint that prohibits calls to some IEEE
arithmetic and exception procedures within the body of a DO CONCURRENT
construct. Clean up the implementation to catch missing cases.
2024-08-08 11:07:19 -07:00
Peter Klausler
9390eb9221
[flang] Catch impure calls in nested concurrent-headers (#102075)
The start, end, and stride expressions of a concurrent-header in a DO
CONCURRENT or FORALL statement can contain calls to impure functions...
unless they appear in a statement that's nested in an enclosing DO
CONCURRENT or FORALL construct. Ensure that we catch this nested case.
2024-08-08 11:06:52 -07:00
Peter Klausler
e83c5b25f3
[flang] Warn about automatic data in main program, disallow in BLOCK … (#102045)
…DATA

We allow automatic data objects in the specification part of the main
program; add an optional portability warning and documentation. Don't
allow them in BLOCK DATA. They're already disallowed as module
variables.
2024-08-08 11:06:32 -07:00
Peter Klausler
d46c639ebf
[flang] Fix derived type compatibility checking in ALLOCATE (#102035)
The derived type compatibility checking for ALLOCATE statements with
SOURCE= or MOLD= was only checking for the same derived type name. That
is a necessary but not sufficient check, and it can produce bogus errors
as well as miss valid errors.

Fixes https://github.com/llvm/llvm-project/issues/101909.
2024-08-08 11:06:05 -07:00
Peter Klausler
d9af9cf436
[flang] Don't set Subroutine flag on PROCEDURE() pointers (#102011)
External procedures about which no characteristics are known -- from
EXTERNAL and PROCEDURE() statements of entities that are never called --
are marked as subroutines. This shouldn't be done for procedure
pointers, however.

Fixes https://github.com/llvm/llvm-project/issues/101908.
2024-08-08 11:05:39 -07:00
Valentin Clement (バレンタイン クレメン)
a262ac0c68
[flang][cuda] Make operations dynamically legal in cuf op conversion (#102220) 2024-08-08 09:18:51 -07:00
Razvan Lupusoru
7634a96589
[flang][acc] Improve lowering of Fortran optional in data clause (#102224)
Fortran optional arguments are effectively null references. To deal with
this possibility, flang lowering of OpenACC data clauses creates three
if-else regions when preparing the data pointer for the data clause:
1) Load box value from box reference
2) Load box addr from box value
3) Load box dims from box value

However, this pattern makes it more complicated to find the original box
reference. Effectively, the first if-else region to get the box value is
not needed - since the value can be loaded before the corresponding
`fir.box_addr` and `fir.box_dims` operations. Thus, reduce the number of
if-else regions by deferring the box load to the use sites.

For non-optional cases, the old functionality is left alone - which
preloads the box value.
2024-08-07 08:04:06 -07:00
Kelvin Li
ce2a3d9042
[flang] Match the type of the element size in the box in getValueFromBox (#100512)
Currently, `%17 = fir.box_elesize %16 :
(!fir.class<!fir.ptr<!fir.type<_QFTt{a:i32,b:i32}>>>) -> i32`
is translated to
```
  %4 = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %1, i32 0, i32 1
  %5 = load i32, ptr %4, align 4
```
The type of the element size is `i64`. The load essentially truncates
the value and yields incorrect result in the big endian environment. The
problem occurs in the `storage_size` intrinsic on a polymorphic
variable.
2024-08-06 18:23:05 -04:00
Valentin Clement (バレンタイン クレメン)
a3ccaed3b9
[flang][cuda] Allocate local descriptor in managed memory (#102060)
This patch adds entry point in the runtime to be able to allocate
descriptors in managed memory. These entry points currently only call
`CUFAllocManaged` and `CUFFreeManaged` but could be more complicated in
the future.

`cuf.alloc` and `cuf.free` related to local descriptors are converted
into runtime calls.
2024-08-06 11:17:11 -07:00
Valentin Clement (バレンタイン クレメン)
fca5038597
[flang][cuda] Add conversion pass for cuf.allocate and cuf.deallocate (#101563)
Allocator can be specified in the descriptor. For simple local
allocatable, we can simply convert `cuf.allocate`/`cuf.deallocate` to
their corresponding runtime calls in the standard flang runtime. More
specific cases will require dedicated entry points. Global descriptor
will require sync between host and device copy.

This patch adds a pass to perform this conversion.
2024-08-02 16:19:10 -07:00
Valentin Clement (バレンタイン クレメン)
bbdb1e400f
[flang][cuda] Set the allocator on fir.embox operation (#101722)
This patch set the `allocator_idx` attribute for allocatable descriptor
that have specific CUDA attribute.
2024-08-02 14:00:26 -07:00
Peter Klausler
90617e99bb
[flang] Fix folding edge cases with IEEE_NEXT_{UP/DOWN/AFTER} & NEAREST (#101424)
The generation of 80-bit x87 floating-point infinities was incorrect in
Normalize(), the comparison for IEEE_NEXT_AFTER needs to use the most
precise type of its arguments, and we don't need to warn about overflows
from +/-HUGE() to infinity. Warnings about NaN arguments remain in
place, and enabled by default, as their usage may or may not be
portable, and their appearance in a real code seems most likely to
signify an earlier error.
2024-08-02 12:06:15 -07:00
Peter Klausler
ca305337ff
[flang] Fix -fdefault-integer-8 result kind of relations (#101234)
The result of a relational operator is a default logical, which is
LOGICAL(8) under the -fdefault-integer-8 option.

Fixes https://github.com/llvm/llvm-project/issues/101161.
2024-08-02 12:02:45 -07:00
Sergio Afonso
84b1e59580
[MLIR][OpenMP][OMPIRBuilder] Add lowering support for omp.target_triples (#100156)
This patch modifies MLIR to LLVM IR lowering of the OpenMP dialect to take into
consideration the contents of the `omp.target_triples` module attribute while
generating code for `omp.target` operations.

It adds the `OpenMPIRBuilderConfig::TargetTriples` field and initializes it
using the `amendOperation` flow of the `OpenMPToLLVMIRTranslation` pass. Some
changes are introduced into the `OpenMPIRBuilder` to allow passing the
information about whether a target region is intended to be offloaded from
outside.

The result of this change is that offloading calls are only generated when the
`--offload-arch` or `-fopenmp-targets` options are given to the compiler.
Otherwise, only the host fallback code is generated. This fixes linker errors
currently triggered by `flang-new` if a source file containing a `target`
construct is compiled without any of the aforementioned options.

Several unit tests impacted by these changes, which are intended to check host
code generated for `omp.target` operations, are updated to contain the new
attribute. Without it, no calls to `__tgt_target_kernel` and associated control
flow operations are generated.

Fixes #100209.
2024-08-02 11:58:40 +01:00
Sergio Afonso
9dadb1f62b
[Flang][OpenMP] Add frontend support for -fopenmp-targets (#100155)
This patch adds support for the `-fopenmp-targets` option to the `bbc`
and `flang -fc1` tools. It adds an `OMPTargetTriples` property to the
`LangOptions` structure, which is filled with the triples represented by
the compiler option.

This is used to initialize the `omp.target_triples` module attribute for
later use by lowering stages.
2024-08-02 10:54:15 +01:00
Kareem Ergawy
10df320743
[flang][OpenMP] Enable delayed privatization for omp parallel by default (#90945)
Flips the delayed privatization switch to be on by default. After the
recent fixes related to delayed privatization, the gfortran test suite
runs successfully with delayed privatization turned on by defuault for
`omp parallel`.
2024-08-02 09:46:34 +02:00
Valentin Clement (バレンタイン クレメン)
0def9a923d
[flang] Add allocator_idx attribute on fir.embox and fircg.ext_embox (#101212)
#100690 introduces allocator registry with the ability to store
allocator index in the descriptor. This patch adds an attribute to
fir.embox and fircg.ext_embox to be able to set the allocator index
while populating the descriptor fields.
2024-08-01 12:49:17 -07:00
Valentin Clement (バレンタイン クレメン)
6df4e7c25f
[flang] Add ability to have special allocator for descriptor data (#100690)
This patch enhances the descriptor with the ability to have specialized
allocator. The allocators are registered in a dedicated registry and the
index of the desired allocator is stored in the descriptor. The default
allocator, std::malloc, is registered at index 0.

In order to have this allocator index in the descriptor, the f18Addendum
field is repurposed to be able to hold the presence flag for the
addendum (lsb) and the allocator index.

Since this is a change in the semantic and name of the 7th field of the
descriptor, the CFI_VERSION is bumped to the date of the initial change.

This patch only adds the ability to have this features as part of the
descriptor but does not add specific allocator yet. CUDA fortran will be
the first user of this feature to allocate descriptor data in the
different type of device memory base on the CUDA attribute.

---------

Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
2024-08-01 09:39:53 -07:00
Sergio Afonso
e1451236a0
[Flang][Driver] Introduce -fopenmp-targets offloading option (#100152)
This patch modifies the flang driver to introduce the `-fopenmp-targets`
option to the frontend compiler invocations corresponding to the OpenMP
host device on offloading-enabled compilations.

This option holds the list of offloading triples associated to the
compilation and is used by clang to determine whether offloading calls
should be generated for the host.
2024-08-01 14:27:29 +01:00
Kareem Ergawy
bbadbf751e
[flang][OpenMP] Delayed privatization for variables with equivalence association (#100531)
Handles variables that are storage associated via `equivalence`. The
problem is that these variables are declared as `fir.ptr`s while their
privatized storage is declared as `fir.ref` which was triggering a
validation error in the OpenMP dialect.
2024-08-01 11:52:00 +02:00
Leandro Lupori
366eade911
[flang][OpenMP] Reland Fix copyprivate semantic checks (#95799) (#101009)
There are some cases in which variables used in OpenMP constructs
are predetermined as private. The semantic checks for copyprivate
were not handling those cases.

Besides that, shared symbols were not being properly represented
in some cases. When there was no previously declared private
(implicit) symbol, no new association symbols, representing
shared ones, were being created.

These symbols must always be inserted in constructs that may
privatize the original symbol: parallel, teams and task
generating constructs.

Fixes #87214 and #86907
2024-07-31 14:39:06 -03:00
Sergio Afonso
a3800a60ed
[MLIR][OpenMP] NFC: Sort clauses alphabetically (2/2) (#101194)
This patch sorts the clause lists for the following OpenMP operations:
- omp.taskloop
- omp.taskgroup
- omp.target_data
- omp.target_enter_data
- omp.target_exit_data
- omp.target_update
- omp.target

This change results in the reordering of operation arguments, so
impacted unit tests are updated accordingly.
2024-07-31 10:41:10 +01:00
Sergio Afonso
b3b46963b7
[MLIR][OpenMP] NFC: Sort clauses alphabetically (1/2) (#101193)
This patch sorts the clause lists for the following OpenMP operations:
- omp.parallel
- omp.teams
- omp.sections
- omp.wsloop
- omp.distribute
- omp.task

This change results in the reordering of operation arguments, so
impacted unit tests are updated accordingly.
2024-07-31 10:40:11 +01:00
Peter Klausler
99a0a12ad6
[flang][parser] Better error recovery for SUBROUTINE/FUNCTION statements (#100664)
When there's an error in a SUBROUTINE or FUNCTION statement, errors
cascade quickly because the body of the subprogram or interface isn't in
the right context. So, if a SUBROUTINE or FUNCTION statement is
expected, and contains a SUBROUTINE or FUNCTION keyword, it counts as
one -- retain and emit any errors pertaining to the arguments or suffix,
recover to the end of the line if needed, and proceed.
2024-07-30 11:19:23 -07:00
Peter Klausler
ff567a4e04
[flang] Fix folding of RANK(assumed-type assumed-rank) (#101027)
The code that deals with the special case of RANK(assumed-rank) in
intrinsic function folding wasn't handling the even more special case of
assumed-type assumed-rank dummy arguments.
2024-07-30 09:46:26 -07:00
Peter Klausler
6f7e715eae
[flang] Don't inject possibly invalid conversions while folding (#100842)
A couple of intrinsic functions have optional arguments. Don't insert
type conversions on those arguments when the actual arguments may not be
present at execution time, due to being OPTIONAL, allocatables, or
pointers.
2024-07-30 09:45:34 -07:00
Peter Klausler
ed5a78a13f
[flang] Catch ASSOCIATE(x=>assumed_rank) (#100626)
An assumed-rank dummy argument cannot be the variable or expression in
the selector of an ASSOCIATE construct. (SELECT TYPE/RANK are fine.)
2024-07-30 09:44:09 -07:00
Peter Klausler
fffbabfd47
[flang][parser] Better error recovery for misplaced declaration (#100482)
When a declaration construct appears in the execution part of a block or
subprogram body, report it as such rather than as a misleading syntax
error on the executable statement that it somehow matched the most.
2024-07-30 09:43:43 -07:00
Peter Klausler
1ada235267
[flang][preprocessor] Fix handling of #line before free-form continua… (#100178)
…tion

See new test. A #line (or #) directive after a line ending with & and
before its continuation shouldn't elicit an error about mismatched
parentheses.

Fixes https://github.com/llvm/llvm-project/issues/100073.
2024-07-30 09:42:59 -07:00
Peter Klausler
539a6b500c
[flang] Detect use-before-decl errors on type parameters (#99947)
Ensure that type parameters are declared as such before being referenced
within the derived type definition. (Previously, such references would
resolve to symbols in the enclosing scope.)

This change causes the symbols for the type parameters to be created
when the TYPE statement is processed in name resolution. They are
TypeParamDetails symbols with no KIND/LEN attribute set, and they shadow
any symbols of the same name in the enclosing scope.

When the type parameter declarations are processed, the KIND/LEN
attributes are set. Any earlier reference to a type parameter with no
KIND/LEN attribute elicits an error.

Some members of TypeParamDetails have been retyped &/or renamed.
2024-07-30 09:42:15 -07:00
Peter Klausler
33c27f28d1
[flang] Warn about undefined function results (#99533)
When the result of a function never appears in a variable definition
context, emit a warning.

If the function has multiple result variables due to alternate ENTRY
statements, any definition will suffice.

The implementation of this check is tied to the general variable
definability checking utility in semantics. Every variable definition
context uses it to ensure that no undefinable variable is being defined.
A set of defined variables is maintained in the SemanticsContext and,
when the warning is enabled and no fatal error has been reported, the
scope tree is traversed and all the function subprograms' results are
tested for membership in that set.
2024-07-30 09:41:46 -07:00
vdonaldson
4cdc19b84c
[flang] IEEE_NEXT_AFTER, IEEE_NEXT_DOWN, IEEE_NEXT_UP, NEAREST (#100782)
IEEE_ARITHMETIC intrinsic module procedures IEEE_NEXT_AFTER,
IEEE_NEXT_DOWN, and IEEE_NEXT_UP, and intrinsic NEAREST return larger or
smaller values adjacent to their primary REAL argument. The four
procedures vary in how the direction is chosen, in how special cases are
treated, and in what exceptions are generated. Implement the three
IEEE_ARITHMETIC procedures. Update the NEAREST implementation to support
all six REAL kinds 2,3,4,8,10,16, and fix several bugs.

IEEE_NEXT_AFTER(X,Y) returns a NaN when Y is a NaN as that seems to be
the universal choice of other compilers.

Change the front end compile time implementation of these procedures to
return normal (HUGE) values for infinities when applicable, rather than
always returning the input infinity.
2024-07-29 09:22:36 -04:00
Dominik Adamski
d86311f293
[Flang-new][OpenMP] Add bitcode files for AMD GPU OpenMP (#96742)
Flang-new needs to add `mlink-builtin-bitcode` objects to properly
support offload code generation for AMD GPUs (for example, math
functions).

Both Flang-new and Clang rely on `mlink-builtin-bitcode` flags. These
flags are added by the `AMDGPUOpenMPToolchain::addClangTargetOptions`
function. Now, both compilers reuse the same function.

Flang-new tests for AMDGPU were updated by adding the `-nogpulib` flag.
This flag allows running AMDGPU tests on machines without the ROCm stack.
2024-07-29 11:21:40 +02:00
khaki3
26d92826a5
[mlir][flang] Add an interface of OpenACC compute regions for further getAllocaBlock support (#100675)
This PR implements `ComputeRegionOpInterface` to define `getAllocaBlock`
of OpenACC loop and compute constructs (parallel/kernels/serial). The
primary objective here is to accommodate local variables in OpenACC
compute regions. The change in `fir::FirOpBuilder::getAllocaBlock`
allows local variable allocation inside loops and kernels.
2024-07-26 13:52:27 -07:00
Valentin Clement (バレンタイン クレメン)
16975ad27c
[flang][cuda] Emit error when host array is used in CUF kernel (#100693)
Restriction from the standard 2.11.2.

Arrays used or assigned in the loop must have the device, managed or
unifed attribute.
2024-07-26 09:49:43 -07:00
Tom Eccles
98e733eaf2
[flang][OpenMP] Initialize privatised derived type variables (#100417)
Fixes #91928
2024-07-25 16:53:27 +01:00
Abid Qadeer
bf76290de4
[flang][debug] Set scope of internal functions correctly. (#99531)
The functions internal to subroutine should have the scope set to the
parent function. This allows a user to evaluate local variables of
parent function when control is stopped in the child.

Fixes #96314
2024-07-25 13:52:50 +01:00
Leandro Lupori
58fb51492d
Revert "[flang][OpenMP] Fix copyprivate semantic checks" (#100478)
Reverts llvm/llvm-project#95799

This caused errors in some internal test suites.
2024-07-24 19:14:59 -03:00
Kiran Chandramohan
8a77961280
[Flang][Driver] Enable config file options (#100343)
Config files provide a facility to invoke the compiler with a predefined
set of options. The patch only enables these options in the flang
driver. Functionality was always there.
2024-07-24 16:28:24 +01:00
Kareem Ergawy
68a0d0c762
[flang][OpenMP] Handle common blocks in delayed privatization (#100317)
Adds proper mapping of common block elements to block arguments in
parallel regions when delayed privatization is enabled.
2024-07-24 13:48:47 +02:00
jeanPerier
1ead51a86c
[flang] fix C_PTR function result lowering (#100082)
Functions returning C_PTR were lowered to function returning intptr (i64
on 64bit arch). This caused conflicts when these functions were defined
as returning !fir.ref<none>/llvm.ptr in other compiler generated
contexts (e.g., malloc).

Lower them to return !fir.ref<none>.

This should deal with https://github.com/llvm/llvm-project/issues/97325
and https://github.com/llvm/llvm-project/issues/98644.
2024-07-24 10:24:04 +02:00
Joseph Huber
ae1de3ea3c [Flang] Remove tests checking now removed 'libc-gpu.a`
Summary:
These tests were removed in a previous patch.

The linker wrapper now just extracts the device inputs and forwards them
directly to the device's link job. This is the job that occurs when you
do `clang --target=amdgcn-amd-amdhsa foo.o` or similar. Because this can
handle LTO we no longer do LTO in the linker wrapper. This has some
fallout, because we now require `ld.lld` to be built with a compatible
version, but I think we always expected that.

I made the decision to remove this `libc-gpu.a` library because it was
unnecessary and complicated things. Now I simply have the link job
implicitly link `-lc` if it exists. Users can also now pass
`-Xoffload-linker=amdgcn-amd-amdhsa -lc` or similar to pass it. Because
of this, these tests need to be removed. I forgot that Fortran also had
these.
2024-07-23 19:41:25 -05:00
Valentin Clement (バレンタイン クレメン)
0ee0eeb4bb
[flang] Enhance location information (#95862)
Add inclusion location information by using FusedLocation with
attribute.

More context here:
https://discourse.llvm.org/t/rfc-enhancing-location-information/79650
2024-07-23 09:49:17 -07:00
Anchu Rajendran S
a51d263282
Adding warning for Master as it is deprecated in 5.2 (#98955)
Since `master` is deprecated from OpenMP spec 5.2, warning is added.
Using `masked` is the recommended alternative as per spec
2024-07-23 09:19:54 -07:00