`syncthreads_and`, `syncthreads_count`, `syncthreads_or`, `synwrap` must
take their argument by value. This patch updates the interfaces and
makes sure these functions can be called inside a cuff kernel as well.
With these changes, CUF atomic operations are handled as cudadevice
intrinsics and are converted straight to the LLVM dialect with the
`llvm.atomicrw` operation.
I am only submitting changes for `atomicadd` to gather feedback. If we
are to proceed with these changes I will add support for all other
applicable atomic operations following this pattern.
Intrinsic module procedures ieee_get_modes, ieee_set_modes,
ieee_get_status, and ieee_set_status store and retrieve opaque data
values whose size varies by machine and OS environment. These data
values are usually, but not always small. Their sizes are not directly
known in a cross compilation environment. Address this issue by
implementing two mechanisms for processing these data values.
Environments that use typical small data sizes can access storage
defined at compile time. When this is not valid, data storage of any
size can be allocated at runtime.
The standard requires EVENT_TYPE, LOCK_TYPE, NOTIFY_TYPE, and TEAM_TYPE
to have full default initialization for their nonallocatable private
components.
Add `c_devloc` as intrinsic and inline it during lowering. `c_devloc` is
used in CUDA Fortran to get the address of device variables.
For the moment, we borrow almost all semantic checks from `c_loc` except
for the pointer or target restriction. The specifications of `c_devloc`
are are pretty vague and we will relax/enforce the restrictions based on
library and apps usage comparing them to the reference compiler.
Implement the UNSIGNED extension type and operations under control of a
language feature flag (-funsigned).
This is nearly identical to the UNSIGNED feature that has been available
in Sun Fortran for years, and now implemented in GNU Fortran for
gfortran 15, and proposed for ISO standardization in J3/24-116.txt.
See the new documentation for details; but in short, this is C's
unsigned type, with guaranteed modular arithmetic for +, -, and *, and
the related transformational intrinsic functions SUM & al.
For proper error detection of bad KIND= arguments, make IEEE_INT and
IEEE_REAL actual intrinsic functions. Lowering will have to check for
the new __builtin_ names.
…INCLUDE lines
The compiler current treats an INCLUDE line as essentially a synonym for
a preprocessing #include directive. The causes macros that have been
defined at the point where the INCLUDE line is processed to be replaced
within the text of the included file.
This behavior is surprising to users who expect an INCLUDE line to be
expanded into its contents *after* preprocessing has been applied to the
original source file, with no further macro expansion.
Change INCLUDE line processing to use a fresh instance of Preprocessor
containing no macro definitions except _CUDA in CUDA Fortran
compilations and, if the original file was being preprocessed, the
standard definitions of __FILE__, __LINE__, and so forth.
A defined assignment generic interface for a given LHS/RHS type & rank
combination may have a specific procedure with LHS dummy argument that
is neither allocatable nor pointer, or specific procedure(s) whose LHS
dummy arguments are allocatable or pointer. It is possible to have two
specific procedures if one's LHS dummy argument is allocatable and the
other's is pointer.
However, the runtime doesn't work with LHS dummy arguments that are
allocatable, and will crash with a mysterious "invalid descriptor" error
message.
Extend the list of special bindings to include
ScalarAllocatableAssignment and ScalarPointerAssignment, use them when
appropriate in the runtime type information tables, and handle them in
Assign() in the runtime support library.
Add a builtin type for c_devptr since it will need some special handling
for some function like c_f_pointer.
`c_ptr` is defined as a builtin type and was raising a semantic error if
you try to use it in a I/O statement. This patch add a check for c_ptr
and c_devptr to bypass the semantic check and allow the variables of
these types to be used in I/O.
This version of the patch keeps the semantic error when -pedantic is
enabled to align with gfortran.
This generates `warning: REAL(KIND=16) is not an enabled type for this
target` if that type is used in a build not correctly configured to
support this type. Uses of `selected_real_kind(30)` return -1.
Relanding #102147 because the test errors turned out to be specific to a
downstream configuration.
Add a builtin type for c_devptr since it will need some special handling
for some function like c_f_pointer.
`c_ptr` is defined as a builtin type and was raising a semantic error if
you try to use it in a I/O statement. This patch add a check for c_ptr
and c_devptr to bypass the semantic check and allow the variables of
these types to be used in I/O.
Reverts llvm/llvm-project#102147
It seems some systems which should support F128 are wrongly detected as
not supporting.
This might be due to checking `LDBL_MANT_DIG` instead of
`__LDBL_MANT_DIG__`. I will investigate.
This generates `warning: REAL(KIND=16) is not an enabled type for this
target` if that type is used in a build not correctly configured to
support this type. Uses of `selected_real_kind(30)` return -1.
There's a numbered constraint that prohibits calls to some IEEE
arithmetic and exception procedures within the body of a DO CONCURRENT
construct. Clean up the implementation to catch missing cases.
Moves definitions of the kind arrays into a Fortran MODULE to not only
emit the MOD file, but also compile that MODULE file into an object
file. This file is then linked into libFortranRuntime.so.
Eventually this workaround PR shoud be redone and a proper runtime build
should be setup that will then also compile Fortran MODULE files.
Fixes#89403
---------
Co-authored-by: Valentin Clement (バレンタイン クレメン) <clementval@gmail.com>
Co-authored-by: Michael Kruse <github@meinersbur.de>
Transformational functions from the intrinsic module ISO_C_BINDING are
allowed in specification expressions, so tweak some general checks that
would otherwise trigger error messages about inadmissible targets, dummy
procedures in specification expressions, and pure procedures with impure
dummy procedures.
All of the IEEE_SUPPORT_xxx() intrinsic functions must fold to constant
logical values when they have constant arguments; and since they fold to
.TRUE. for currently support architectures, always fold them. But also
put in the infrastructure whereby a driver can initialize Evaluate's
target information to set some of them to .FALSE. if that becomes
necessary.
This is a re-worked version of #91668. It adds the `cudadevice` module
and set the `device` attributes on its functions/subroutines so there is
no need for special case in semantic check.
`cudadevice` module is implicitly USE'd in `global`/`device` subprogram.
Some functions and subroutines are available in device context
(device/global). These functions have interfaces declared in the
`cudadevice` module.
This patch adds interfaces as `__cuda_device_builtins_<fctname>` in a
builtin module and they are USE'd rename in the `cudadevice` module. The
module is implicitly used in device/global subprograms.
The builtin module only contains procedures from section 3.6.4 for now.
On Linux, PowerPC defines `int_fast16_t` and `int_fast32_t` as `long`.
Need to update the corresponding type, `c_int_fast16_t` and
`c_int_fast32_t` in `iso_c_binding` module so they are interoparable.
Address TODOs in the intrinsic module ISO_FORTRAN_ENV, and extend the
implementation of NUMERIC_STORAGE_SIZE so that the calculation of its
value is deferred until it is needed so that the effects of
-fdefault-integer-8 or -fdefault-real-8 are reflected. Emit a warning
when NUMERIC_STORAGE_SIZE is used from the module file and the default
integer and real sizes do not match.
Fixes https://github.com/llvm/llvm-project/issues/87476.
This PR changes the build system to use use the sources for the module
`omp_lib` and the `omp_lib.h` include file from the `openmp` runtime
project and not from a separate copy of these files. This will greatly
reduce potential for inconsistencies when adding features to the OpenMP
runtime implementation.
When the OpenMP subproject is not configured, this PR also disables the
corresponding LIT tests with a "REQUIRES" directive at the beginning of
the OpenMP test files.
---------
Co-authored-by: Valentin Clement (バレンタイン クレメン) <clementval@gmail.com>