463 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
16c2a1016e
Revert "[flang] Allow to pass an async id to allocate the descriptor (#118713)" (#119109)
This reverts commit 7d1c661381d36018fd105f4ad4c2d6dc45e7288b.

This commit breaks some device runtime builds. Need time to investigate.
2024-12-07 19:55:12 -08:00
Michael Kruse
c91ba04328
[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188)
Split some headers into headers for public and private declarations in
preparation for #110217. Moving the runtime-private headers in
runtime-private include directory will occur in #110298.

* Do not use `sizeof(Descriptor)` in the compiler. The size of the
descriptor is target-dependent while `sizeof(Descriptor)` is the size of
the Descriptor for the host platform which might be too small when
cross-compiling to a different platform. Another problem is that the
emitted assembly ((cross-)compiling to the same target) is not identical
between Flang's running on different systems. Moving the declaration of
`class Descriptor` out of the included header will also reduce the
amount of #included sources.

* Do not use `sizeof(ArrayConstructorVector)` and
`alignof(ArrayConstructorVector)` in the compiler. Same reason as with
`Descriptor`.

* Compute the descriptor's extra flags without instantiating a
Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime
source, but not the compiler source.

* Move `InquiryKeywordHashDecode` into runtime-private header. The
function is defined in the runtime sources and trying to call it in the
compiler would lead to a link-error.

* Move allocator-kind magic numbers into common header. They are the
only declarations out of `allocator-registry.h` in the compiler as well.
 
This does not make Flang cross-compile ready yet, the main goal is to
avoid transitive header dependencies from Flang to clang-rt. There are
more assumptions that host platform is the same as the target platform.
2024-12-06 15:29:00 +01:00
Valentin Clement (バレンタイン クレメン)
83ccaad473
[flang][cuda] Use async id for device stream allocation (#118733)
When stream is specified use cudaMallocAsync with the specified stream
2024-12-05 08:57:10 -08:00
Michael Kruse
0cda970ecc
[Flang][NFC] Split common headers to reduce dependencies. (#110244)
Fortran.h and target.h are defining symbols where some are used by both, the Fortran runtime (Flang-RT) and Fortran compiler (Flang), and others are used by Flang only. With the upcoming refactoring of the Fortran runtime into its own subproject (#110217), move the declarations that are used by both into new headers to minimize the amount of code that will need to be shared by Flang-RT and Flang.

Details:

 * `Fortran.h`: Flang-RT  only uses some enum definitions out of this file, but not `AsFortran` which is defined in `Fortran.cpp`. Moving the enums into `Fortran-consts.h` allows keeping `Fortran.cpp` within Flang.

 * `target.h`: Contains some floating-point definitions that is used by the non-GTest unittests in `fp-testing.h`. Flang-RT also uses some non-GTest as well. Moving those definitions avoids the dependence on the entire FortranEvaluate library.
2024-12-05 11:29:32 +01:00
Valentin Clement (バレンタイン クレメン)
7d1c661381
[flang] Allow to pass an async id to allocate the descriptor (#118713)
This is a patch in preparation for the support stream ordered memory
allocator in CUDA Fortran.

This patch adds an asynchronous id to the AllocatableAllocate runtime
function and to Descriptor::Allocate so it can be passed down to the
registered allocator. It is up to the allocator to use this value or
not.

A follow up patch will implement that asynchronous allocator for CUDA
Fortran.
2024-12-04 18:24:40 -08:00
Kelvin Li
af35e21cfe
[flang] Update CommandTest for AIX (NFC) (#118403)
With the change in commit e335563, the behavior for `ECLGeneralErrorCommandErrorSync` 
on AIX is the same as on Linux.
2024-12-03 10:27:14 -05:00
David Truby
e335563806
[NFC][flang] Fix execute_command_line test for odd environments (#117714)
One of the execute_command_line tests currently runs `cat` on an invalid
file and checks its return value, but since we don't control `cat` or
the user's path, the return value might not be reliably stable on a
per-platform basis. For example, if `git` is installed on Windows in
certain configurations it adds a directory to the path containing a
`cat` with a different set of error codes to the default Windows one.

This patch changes the test to use the `not` binary built by LLVM for
testing purposes, which should always return 1 on any platform
regardless of the user's environment.
2024-11-27 00:43:56 +00:00
Valentin Clement
308c00749d [flang][cuda][NFC] Fix format 2024-11-01 12:42:06 -07:00
Valentin Clement (バレンタイン クレメン)
32473864cb
[flang][cuda] Data transfer with descriptor (#114598)
Reopen PR #114302 as it was automatically closed. 

Review in #114302
2024-11-01 12:35:48 -07:00
jeanPerier
c4204c0b29
[flang] replace fir.complex usages with mlir complex (#110850)
Core patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292.
After that, the last step is to remove fir.complex from FIR types.
2024-10-03 17:10:57 +02:00
Yusuke MINATO
b91a25ef58
[flang] add nsw to operations in subscripts (#110060)
This patch adds nsw to operations when lowering subscripts.

See also the discussion in the following discourse post.
https://discourse.llvm.org/t/rfc-add-nsw-flags-to-arithmetic-integer-operations-using-the-option-fno-wrapv/77584/9
2024-10-03 10:56:01 +09:00
David Truby
856c38d542
[flang] Implement GETUID and GETGID intrinsics (#110679)
GETUID and GETGID are non-standard intrinsics supported by a number of
other Fortran compilers. On supported platforms these intrinsics simply
call the POSIX getuid() and getgid() functions and return the result.
The only platform we support that does not have these is Windows.

Windows does not have the same concept of UIDs and GIDs, so on Windows
we issue a warning indicating this and return 1 from both functions.

Co-authored-by: Yi Wu <yi.wu2@arm.com>
2024-10-02 13:26:40 +01:00
jeanPerier
3b7989cd9b
[flang] remove support for std::complex value lowering. (#110643)
To avoid ABIs issues, std::complex should be passed/returned by reference to the runtime.

Part of the [RFC to use mlir complex
type](https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292).
2024-10-02 10:11:58 +02:00
David Truby
7a0a7947ee
Revert "[flang] Implement GETUID and GETGID intrinsics" (#110531)
Reverts llvm/llvm-project#108017
2024-09-30 17:35:27 +01:00
David Truby
054eadcb11
[flang] Implement GETUID and GETGID intrinsics (#108017)
GETUID and GETGID are non-standard intrinsics supported by a number of
other Fortran compilers. On supported platforms these intrinsics simply
call the POSIX getuid() and getgid() functions and return the result.
The only platform we support that does not have these is Windows.

Windows does not have the same concept of UIDs and GIDs, so on Windows
we issue a warning indicating this and return 1 from both functions.

Co-authored-by: Yi Wu <yi.wu2@arm.com>

---------

Co-authored-by: Yi Wu <yi.wu2@arm.com>
2024-09-30 14:36:39 +01:00
Valentin Clement (バレンタイン クレメン)
fa627d98e8
[flang][cuda] Add entry point for alloc/free and simple copy (#109867)
These will be used to translate simple cuf.alloc/cuf.free and
cuf.data_transfer on scalar and constant size arrays.
2024-09-24 20:00:11 -07:00
Slava Zakharin
fc51c7f0cc
[flang][runtime] Disable LDBL_MANT_DIG == 113 for the offload builds. (#109339)
When compiling on aarch64 some `LDBL_MANT_DIG == 113` entries
end up trying to use `complex<long double>` for which there are
no certain specializations in `libcudacxx`. This change-set
includes a clean-up for `LDBL_MANT_DIG == 113` usage, which is replaced
with `HAS_LDBL128` that is set in `float128.h`.
2024-09-19 15:45:45 -07:00
Valentin Clement (バレンタイン クレメン)
cdf447baa5
[flang][cuda] Add function to allocate and deallocate device module variable (#109213)
This patch adds new runtime entry points that perform the simple
allocation/deallocation of module allocatable variable with cuda
attributes.
When the allocation is initiated on the host, the descriptor on the
device is synchronized. Both descriptors point to the same data on the
device.

This is the first PR of a stack.
2024-09-18 20:22:06 -07:00
Slava Zakharin
104f3c1806
Reland "[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build. (#109078)" (#109207)
`std::complex` operators do not work for the CUDA device compilation
of F18 runtime. This change makes use of `cuda::std::complex` from
`libcudacxx`.
`cuda::std::complex` does not have specializations for `long double`,
so the change is accompanied with a clean-up for `long double` usage.

Additional change on top of #109078 is to use `cuda::std::complex`
only for the device compilation, otherwise the host compilation
fails because `libcudacxx` may not support `long double` specialization
at all (depending on the compiler).
2024-09-18 17:41:33 -07:00
Slava Zakharin
36192fdfb9
Revert "[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build." (#109173)
Reverts llvm/llvm-project#109078
2024-09-18 11:22:31 -07:00
Slava Zakharin
be187a6812
[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build. (#109078)
`std::complex` operators do not work for the CUDA device compilation
of F18 runtime. This change makes use of `cuda::std::complex` from
`libcudacxx`.
`cuda::std::complex` does not have specializations for `long double`,
so the change is accompanied with a clean-up for `long double` usage.
2024-09-18 10:59:05 -07:00
Youngsuk Kim
d5dd7d230e [flang] Tidy uses of raw_string_ostream (NFC)
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly

( 65b13610a5226b84889b923bae884ba395ad084d for further reference )

Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
2024-09-17 12:20:21 -05:00
Peter Klausler
7aad87312a
[flang][runtime] Accept some real input for integer NAMELIST (#108268)
A few other Fortran compilers silently accept real values for integer
variables in NAMELIST input. Handling an exponent would be difficult,
but it's easy to skip and ignore a fractional part when one is present.
2024-09-12 09:14:20 -07:00
Peter Klausler
500f6cc25c
[flang][runtime] Support SPACING for REAL(2 & 3) (#106575)
Add runtime APIs for the intrinsic function SPACING for REAL kinds 2 & 3
in two ways: Spacing2 (& 3) for build environments with std::float16_t,
and Spacing2By4 (& 3By4) variants (for any build environment) which
compute SPACING for those types but accept and return their values as
32-bit floats.

SPACING for REAL(2) is needed by HDF5.
2024-09-04 10:53:22 -07:00
Peter Klausler
9e53e77265
[flang] Fix warnings from more recent GCCs (#106567)
While experimenting with some more recent C++ features, I ran into
trouble with warnings from GCC 12.3.0 and 14.2.0. These warnings looked
legitimate, so I've tweaked the code to avoid them.
2024-09-04 10:52:51 -07:00
Kelvin Li
ba5e8fcece
[flang] Adjust execute_command_line intrinsic return values for AIX (NFC) (#106472) 2024-08-29 15:21:06 -04:00
Kelvin Li
66ab4b80a4
[flang] Re-enable date_and_time intrinsic test (NFC) (#104967) 2024-08-20 15:47:15 -04:00
Valentin Clement (バレンタイン クレメン)
5c3a3dc9eb
[flang][cuda] Add version in libCufRuntime name (#104506) 2024-08-15 20:45:33 -07:00
Valentin Clement
743e99dcf5
Reland "[flang][cuda] Use cuda runtime API #103488"
CUDA Fortran is meant to be an equivalent to the runtime API. Therefore, it
makes more sense to use the cuda rt API in the allocators for CUF.
2024-08-14 14:56:00 -07:00
Valentin Clement (バレンタイン クレメン)
f6e3dbc27d
Revert "[flang][cuda] Use cuda runtime API" (#104232)
Reverts llvm/llvm-project#103488
2024-08-14 13:44:49 -07:00
Valentin Clement (バレンタイン クレメン)
00ab8a6a4c
[flang][cuda] Use cuda runtime API (#103488)
CUDA Fortran is meant to be an equivalent to the runtime API. Therefore,
it makes more sense to use the cuda rt API in the allocators for CUF.

@bdudleback
2024-08-14 12:34:45 -07:00
Valentin Clement (バレンタイン クレメン)
4c1dbbe7aa
[flang][cuda] Make CUFRegisterAllocator callable from C/Fortran (#102543) 2024-08-08 17:09:53 -07:00
Valentin Clement (バレンタイン クレメン)
10d7805c4f
[flang][cuda][NFC] Disambiguate namespace with cuf dialect (#102194)
Rename namespace `Fortran::runtime::cuf` to `Fortran::runtime::cuda` to
avoid embiguity with the namespace `::cuf` that is defined in the CUF
dialect.
2024-08-06 14:04:45 -07:00
Valentin Clement (バレンタイン クレメン)
a3ccaed3b9
[flang][cuda] Allocate local descriptor in managed memory (#102060)
This patch adds entry point in the runtime to be able to allocate
descriptors in managed memory. These entry points currently only call
`CUFAllocManaged` and `CUFFreeManaged` but could be more complicated in
the future.

`cuf.alloc` and `cuf.free` related to local descriptors are converted
into runtime calls.
2024-08-06 11:17:11 -07:00
Kazu Hirata
6f8ef5ad2f
[flang] Construct SmallVector with ArrayRef (NFC) (#101901) 2024-08-05 04:03:10 -07:00
Valentin Clement (バレンタイン クレメン)
bbdb1e400f
[flang][cuda] Set the allocator on fir.embox operation (#101722)
This patch set the `allocator_idx` attribute for allocatable descriptor
that have specific CUDA attribute.
2024-08-02 14:00:26 -07:00
Peter Klausler
b1a1d4e08e
[flang][runtime] Don't emit excess digits for ES0.0E0 or EN0.0E0 (#101238)
Don't emit any digits after the decimal point.
2024-08-02 12:04:12 -07:00
Valentin Clement (バレンタイン クレメン)
1417633943
[flang][cuda] Add CUF allocator (#101216)
Add allocators for CUDA fortran allocation on the device. 3 allocators
are added for pinned, device and managed/unified memory allocation.
`CUFRegisterAllocator()` is called to register the allocators in the
allocator registry added in #100690.


Since this require CUDA, a cmake option `FLANG_CUF_RUNTIME` is added to
conditionally build these.
2024-08-02 10:02:34 -07:00
vdonaldson
4cdc19b84c
[flang] IEEE_NEXT_AFTER, IEEE_NEXT_DOWN, IEEE_NEXT_UP, NEAREST (#100782)
IEEE_ARITHMETIC intrinsic module procedures IEEE_NEXT_AFTER,
IEEE_NEXT_DOWN, and IEEE_NEXT_UP, and intrinsic NEAREST return larger or
smaller values adjacent to their primary REAL argument. The four
procedures vary in how the direction is chosen, in how special cases are
treated, and in what exceptions are generated. Implement the three
IEEE_ARITHMETIC procedures. Update the NEAREST implementation to support
all six REAL kinds 2,3,4,8,10,16, and fix several bugs.

IEEE_NEXT_AFTER(X,Y) returns a NaN when Y is a NaN as that seems to be
the universal choice of other compilers.

Change the front end compile time implementation of these procedures to
return normal (HUGE) values for infinities when applicable, rather than
always returning the input infinity.
2024-07-29 09:22:36 -04:00
Alexis Perry-Holby
f1d3fe7aae
Add basic -mtune support (#98517)
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043
2024-07-16 16:48:24 +01:00
Slava Zakharin
8ce1aed55f
[flang] Lower MATMUL to type specific runtime calls. (#97547)
Lower MATMUL to the new runtime entries added in #97406.
2024-07-03 21:18:56 -07:00
Slava Zakharin
dd22085308
[flang][runtime] Split MATMUL[_TRANSPOSE] into separate entries. (#97406)
Device compilation is much faster for separate MATMUL[_TRANPOSE]
entries than for a single one that covers all data types.
The lowering changes and the removal of the generic entries will follow.
2024-07-02 21:30:37 -07:00
Yi Wu
dfd2711f8f
Revert "Revert "[flang] Fix execute_command_line cmdstat is not set when error occurs" (#96365)" (#96774)
The fix broke llvm-test-suite, so it was reverted previously. With test
fixes added in https://github.com/llvm/llvm-test-suite/pull/137, it
should now pass the tests

This reverts commit 435635652fd226fa292abcff6a10d3df9dbd74e3.
2024-06-27 10:53:56 +01:00
Tarun Prabhu
8dd9494056
Revert "[flang] Add basic -mtune support" (#96678)
Reverts llvm/llvm-project#95043
2024-06-25 13:25:39 -06:00
Alexis Perry-Holby
a790279bf2
[flang] Add basic -mtune support (#95043)
This PR adds -mtune as a valid flang flag and passes the information
through to LLVM IR as an attribute on all functions. No specific
architecture optimizations are added at this time.
2024-06-25 18:39:35 +01:00
David Truby
954b692bd7
[flang] Allow derf as alternate spelling for erf (#95784)
This patch adds derf as an alternate spelling for the erf intrinsic.
This spelling is supported by multiple other compilers and used by WRF.
2024-06-25 01:24:49 +01:00
Peter Klausler
5d15f606da
[flang][preprocessing] Mix preprocessing directives with free form li… (#96244)
…ne continuation

Allow preprocessing directives to appear between a source line and its
continuation, including conditional compilation directives (#if, #ifdef,
&c.).

Fixes https://github.com/llvm/llvm-project/issues/95476.
2024-06-24 10:18:50 -07:00
Kiran Chandramohan
435635652f
Revert "[flang] Fix execute_command_line cmdstat is not set when error occurs" (#96365)
Reverts llvm/llvm-project#93023

Reverting due to buildbot failure.
https://lab.llvm.org/buildbot/#/builders/41/builds/227
test-suite ::
Fortran/gfortran/regression/gfortran-regression-execute-regression__execute_command_line_3_f90
2024-06-21 23:47:13 +01:00
Yi Wu
4232dd586b
[flang] Fix execute_command_line cmdstat is not set when error occurs (#93023)
Fixes: https://github.com/llvm/llvm-project/issues/92929
Also added cmdstat for common linux return code 1, 126, 127
2024-06-21 14:42:07 +01:00
Peter Klausler
f8fc883da9
[flang][runtime] Distinguish VALUE from non-VALUE operations in REDUCE (#95297)
Accommodate operations with VALUE dummy arguments in the runtime support
for the REDUCE intrinsic function by splitting most entry points into
Reduce...Ref and Reduce...Value variants.

Further work will be needed in lowering to call the ...Value entry
points.
2024-06-13 11:10:32 -07:00