llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 14:56:39 +00:00

Author	SHA1	Message	Date
Peter Klausler	72144d119a	[flang][runtime] Fix recently broken big-endian formatted integer input (#135417 ) My recent change to speed up formatted integer input has a bug on big-endian targets that has shown up on ppc64 AIX build bots. Fix.	2025-04-11 12:52:23 -07:00
Valentin Clement (バレンタインクレメン)	1d8966e246	[flang][cuda] Use the provided stream in kernel launch (#135267 )	2025-04-10 17:15:23 -07:00
Valentin Clement (バレンタインクレメン)	49f8ccd1eb	[flang][cuda] Pass stream information to kernel launch functions (#135246 )	2025-04-10 13:50:50 -07:00
Slava Zakharin	755016a3a8	[flang-rt] Fixed warnings and miscompilations in CUDA build. (#134470 ) * DescribeIEEESignaledExceptions() is unused on the device - warning. * StopStatementText() could return while marked noreturn - warning. * Including cuda/std/complex only in the device compilation may cause nvcc to try to register variables in `cuda` namespace, while they are not defined in the host compilation - error. I decided to include cuda/std/complex always under RT_USE_LIBCUDACXX.	2025-04-10 11:27:03 -07:00
Peter Klausler	18fe0124e7	[flang][runtime] Formatted input optimizations (#134715 ) Make some minor tweaks (inlining, caching) to the formatted input path to improve integer input in a SPEC code. (None of the I/O library has been tuned yet for performance, and there are some easy optimizations for common cases.) Input integer values are now calculated with native C/C++ 128-bit integers. A benchmark that only reads about 5M lines of three integer values each speeds up from over 8 seconds to under 3 in my environment with these changes. If this works out, the code here can be used to optimize the formatted input paths for real and character data, too. Fixes https://github.com/llvm/llvm-project/issues/134026.	2025-04-10 09:56:46 -07:00
Valentin Clement (バレンタインクレメン)	56b792322a	[flang][cuda] Use the asyncId in device allocation (#135099 ) Use `cudaMallocAsync` in the `CUFAllocDevice` allocator when asyncId is provided. More work is needed to be able to call `cudaFreeAsync` since the allocated address and stream need to be tracked.	2025-04-09 17:34:48 -07:00
Peter Klausler	e0950ebb9c	[flang][runtime] Tweak width-free I/G formatted I&O (#135047 ) For Fujitsu test case 0561/0561_0168.f90, adjust both input and output sides of the extension I (and G) edit descriptors with no width (as distinct from I0/G0). On input, be sure to halt on a separator character rather than complaining about an invalid character; on output, be sure to emit a leading space.	2025-04-09 12:31:36 -07:00
Valentin Clement (バレンタインクレメン)	f4d87c42a6	[flang][cuda] Add asyncId to allocate entry point (#134947 )	2025-04-09 10:52:02 -07:00
Valentin Clement (バレンタインクレメン)	5ebe22a35d	[flang][cuda] Add async id to allocators (#134724 ) Add async id to allocators in preparation for stream allocation.	2025-04-08 10:16:59 -07:00
Eugene Epshteyn	61af05fe82	[flang] Add runtime and lowering implementation for extended intrinsic PUTENV (#134412 ) Implement extended intrinsic PUTENV, both function and subroutine forms. Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and semantic unit tests.	2025-04-04 16:26:08 -04:00
Peter Klausler	ade9d1f810	[flang][runtime] Remove bad runtime assertion (#134176 ) The RUNTIME_CHECK in question doesn't allow for the possibility that an allocatable or pointer component could be processed by defined I/O. Remove it in favor of a dynamic allocation check.	2025-04-04 08:43:02 -07:00
Peter Klausler	262b3f7615	[flang] Remove runtime dependence on C++ support for types (#134164 ) Fortran::runtime::Descriptor::BytesFor() only works for Fortran intrinsic types for which a C++ type counterpart exists, so it crashes on some types that are legitimate Fortran types like REAL(2). Move some logic from Evaluate into a new header in flang/Common, then use it to avoid this needless dependence on C++.	2025-04-04 08:42:38 -07:00
Peter Klausler	c8bde44cfc	[flang] Implement FSEEK and FTELL (#133003 ) Add function and subroutine forms of FSEEK and FTELL as intrinsic procedures. Accept common aliases from legacy compilers as well. A separate patch to llvm-test-suite will enable tests for these procedures once this patch has merged. Depends on https://github.com/llvm/llvm-project/pull/132423; CI builds will likely fail until that patch is merged and this PR is rebased.	2025-04-04 08:40:51 -07:00
Andre Kuhlenschmidt	b11eece1bb	[flang][intrinsics] Implement the time intrinsic (#133823 ) This PR implements the nonstandard intrinsic time. In addition to running the unit tests, I also double checked that the example code works by manually compiling and running it.	2025-04-03 15:33:40 -07:00
Andre Kuhlenschmidt	85fdab33b0	[flang][intrinsic] add nonstandard intrinsic unlink (#134162 ) This PR adds the intrinsic `unlink` to flang. ## Test plan - Added two codegen unit tests and ensured flang-check continues to pass. - Manually compiled and ran the example from the documentation.	2025-04-03 14:33:53 -07:00
Valentin Clement (バレンタインクレメン)	bb179c483a	[flang][rt] Allow ReportFatalUserError to be built on device (#133979 )	2025-04-01 13:50:42 -07:00
Valentin Clement (バレンタインクレメン)	afa32d3e0e	[flang][cuda] Fix char argument This would fail with `error: argument of type "char" is incompatible with parameter of type "const char *"`	2025-04-01 11:00:50 -07:00
Valentin Clement (バレンタインクレメン)	01889de8e9	[flang][device] Enable Stop functions on device build (#133803 ) Update `StopStatement` and `StopStatementText` to be built for the device.	2025-04-01 10:06:45 -07:00
Slava Zakharin	1ab3a4f234	[flang-rt][NFC] Work around CTK12.8 compilation failure. (#133833 ) It happened in https://lab.llvm.org/buildbot/#/builders/152/builds/1131 when the buildbot was switched from CTK12.3 to CTK12.8. The logs are gone by now, so the above link is useless. The error was: error: ‘auto’ not permitted in template argument This workaround helps, but I also reported the issue to NVCC devs.	2025-04-01 08:04:45 -07:00
Jean-Didier PAILLEUX	513a91a5f1	[flang/flang-rt] Implement PERROR intrinsic from GNU Extension (#132406 ) Add the implementation of the `PERROR(STRING)` intrinsic from the GNU Extension, which prints to stderr a newline-terminated error message corresponding to the last system error, prefixed by `STRING`. (https://gcc.gnu.org/onlinedocs/gfortran/PERROR.html)	2025-04-01 15:47:54 +02:00
Valentin Clement (バレンタインクレメン)	0b31f08537	[flang][cuda] Add support for NV_CUDAFOR_DEVICE_IS_MANAGED (#133778 ) Add support for the environment variable `NV_CUDAFOR_DEVICE_IS_MANAGED` as described in the documentation: https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#controlling-device-data-is-managed. This mainly switches device allocation to managed allocation.	2025-03-31 13:17:21 -07:00
Peter Klausler	4ea5aa09de	[flang][NFC] Restore I/O runtime API header name (#132423 ) flang/include/flang/Runtime/io-api.h was changed into io-api-consts.h, then wrapped into a new io-api.h that includes io-api-consts.h, does some redundant includes and declarations, and then declares the prototype of one function, InquiryKeywordHashDecode. Make that function static in io-stmt.cpp prior to its sole call site, then undo the renaming, to reduce confusion and redundancy.	2025-03-26 12:09:16 -07:00
Slava Zakharin	613a077b05	[flang] Generate quadmath_wrapper.h for Flang Evaluate. (#132817 ) When building Flang with Clang, we need to do the same quadmath.h wrapping as we do for flang-rt. I extracted the CMake code into FlangCommon.cmake, and cleaned up the arguments passing to execute_process (note that `-###` was treated as `-` in the original code, because `#` starts a comment). I believe the Clang command does not require the input source file, so I removed it as well.	2025-03-25 12:08:38 -07:00
Eugene Epshteyn	2c8e26081f	[flang] Add HOSTNM runtime and lowering intrinsics implementation (#131910 ) Implement GNU extension intrinsic HOSTNM, both function and subroutine forms. Add HOSTNM documentation to `flang/docs/Intrinsics.md`. Add lowering and semantic unit tests. (This change is modeled after GETCWD implementation.)	2025-03-25 13:17:17 -04:00
vdonaldson	92e0560347	[flang] ieee_denorm (#132307 ) Add support for the nonstandard ieee_denorm exception for real kinds 3, 4, 8 on x86 processors.	2025-03-25 13:02:43 -04:00
Joseph Huber	85974a0537	[flang-rt] Add experimental support for GPU build (#131826 ) Summary: This patch adds initial support for compiling `flang-rt` directly for the GPU. The method used here matches what's already done for `libc` and `libc++` for the GPU and builds off of those projects. Mainly this requires setting up some flags and setting the sources that currently work. This will deposit the resulting library in the appropriate directory. These files are then intended to be linked via `-Xoffload-linker` support in the offloading driver. ``` lib/clang/21/lib/nvptx64-nvidia-cuda/libflang_rt.runtime.a lib/clang/21/lib/amdgcn-amd-amdhsa/libflang_rt.runtime.a ``` This is obviously missing a lot of functions, mainly the `io` support. Most of what we cannot support is due to using POSIX things that just don't make sense on the GPU. Stuff like `pthreads` or `sema`. Getting unit tests to run on this will also be a challenge. We could run tests the same way we do with `libc`, but the problem there is that the `libc` test suite is freestanding while `gtest` currently doesn't compile on the GPU because it uses a lot of weird stuff. If the unit tests were simply `int main` then it would work. I don't understand the actual runtime code very well, I'd appreciate some guidance on how to actually support Fortran IO from this interface. As I understand it, Fortran IO requires a stack-like operation, which conflicts with the SIMT model GPUs use. Worst case scenario we could burn some LDS to keep a stack, or serialize it somehow since we can always just iterate over all the active lanes. Building this right now looks like this, which depends on the arguments added in https://github.com/llvm/llvm-project/pull/131695. ``` -DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES=compiler-rt;libc;libcxx;libcxxabi;flang-rt \ -DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=compiler-rt;libc;libcxx;libcxxabi;flang-rt \ -DRUNTIMES_nvptx64-nvidia-cuda_FLANG_RT_LIBC_PROVIDER=llvm \ -DRUNTIMES_nvptx64-nvidia-cuda_FLANG_RT_LIBCXX_PROVIDER=llvm \ -DRUNTIMES_amdgcn-amd-amdhsa_FLANG_RT_LIBC_PROVIDER=llvm \ -DRUNTIMES_amdgcn-amd-amdhsa_FLANG_RT_LIBCXX_PROVIDER=llvm ```	2025-03-24 08:31:42 -05:00
Valentin Clement (バレンタインクレメン)	ecaef010f3	[flang][cuda] Support corner case of data transfer (#132451 ) The flang runtime will complain when the number of elements in the two descriptors involved in the data transfer are not matching. In some cases, we can still perform the data transfer to match the behavior of the reference compiler. When the RHS elements count is bigger than the LHS elements count and both descriptors are contiguous, we can perform the data transfer with the bare pointers and the number of bytes from the LHS. We don't really have unit tests set up for data transfer, this is why I didn't include one here.	2025-03-21 15:39:05 -07:00
jeanPerier	cd0a2a3f1b	[flang] add QSORT extension intrinsic to the runtime (#132033 ) Add support for legacy Fortran intrinsic QSORT from lib 3f. This is a thin Fortran wrapper over libc qsort.	2025-03-19 16:14:37 +01:00
Slava Zakharin	7a9473b1b0	[flang-rt] Fixed build issue in flang-runtime-cuda-clang.	2025-03-18 16:18:30 -07:00
Slava Zakharin	7d7b58bc5d	[flang-rt] Added ShallowCopy API. (#131702 ) This API will be used for copying non-contiguous arrays into contiguous temporaries to support `-frepack-arrays`. The builder factory API will be used in the following commits.	2025-03-18 12:58:25 -07:00
Slava Zakharin	f326036767	[flang-rt] Added IsContiguousUpTo runtime function. (#131048 ) I want to be able to check if the storage is contiguous in the innermost dimension, so I decided to add an entry point that takes `dim` as the number of leading dimensions to check. It seems that a runtime call might result in less code size even when `dim` is 1, so here it is. For opt-for-speed I am going to inline it in FIR. Depends on #131047.	2025-03-14 17:13:21 -07:00
Slava Zakharin	c8b8415b1a	[flang-rt] Install flang_rt.cuda with the toolchain. (#131373 )	2025-03-14 16:12:32 -07:00
Michael Kruse	7341753a2e	[Flang-RT] Environment introspection for quadmath.h (#130411 ) When compiling Flang-RT with Clang, query Clang for the GCC installation it uses. If found, create `quadmath_wrapper.h` that points to the `quadmath.h` of that GCC installation. `quadmath.h` is only available when compiling with gcc, and Clang has no equivalent even though gcc's version compiles fine with Clang (at least up to and including gcc 13). It is still available in gcc's installation resource dir (in contrast to a system-wide directory such as `/usr/include` or `/usr/local/include`) and therefore not available to any compiler other than the gcc of that installation. quadmath may also be a different OS package than gcc itself, so it is not necessarily present. Clang actually already appropriates a GCC installation for its libraries such that `libquadmath.a` is already found, but it does not do so for the include paths. Because adding that directory to the header search path may have wide-reaching consequences, we create only a wrapper header that points to the real `quadmath.h` in the same GCC installation that Clang uses.	2025-03-11 14:18:06 +01:00
مهدي شينون (Mehdi Chinoune)	cf5aa559a8	[flang] Don't redefine pid_t on MinGW-w64. (#130288 )	2025-03-10 17:27:47 +00:00
Kelvin Li	a7a65a824e	[flang] explicitly cast the pointer to void* in std::memcpy calls (NFC) (#129946 ) This patch is to add the explicit cast to the first argument of std::memcpy.	2025-03-06 16:09:16 -05:00
Valentin Clement (バレンタインクレメン)	c8898b09f9	[flang][rt] Use allocator registry to allocate the pointer payload (#129992 ) Pointer allocation is done through `AllocateValidatedPointerPayload`. This function was not updated to use the registered allocators in the descriptor to perform the allocation. This patch makes use of the allocator. The footer word is not set and not checked for allocators other than the default one. The support will likely come in a follow-up patch but this will necessitate more functions to be registered to be able to set and get the footer value when the allocation is on the device.	2025-03-06 08:47:27 -08:00
Slava Zakharin	9b1604065e	[flang-rt] Move unit-map.cpp to host-only sources list. (#129763 ) This file is not enabled for the offload builds. This patch aligns the list with flang/runtime/CMakeLists.txt (that is about to be removed).	2025-03-04 14:39:16 -08:00
Valentin Clement (バレンタインクレメン)	ae84717d11	[flang][cuda] Fix descriptor sync in data transfer (#129333 ) The destination descriptor on the device needs to be synced with the destination descriptor on the host, not the src one.	2025-02-28 14:52:33 -08:00
Peter Klausler	abe1ecff54	[flang][runtime] Detect byte order reversal problems (#129093 ) When reading an unformatted sequential file with variable-length records, detect byte order reversal problems with the first record's header and footer words, and emit a more detailed error message.	2025-02-27 16:16:15 -08:00
Tom Stellard	2b340c10a6	flang: Fix build with latest libc++ (#127362 ) I think this first stopped working with 954836634abb446f18719b14120c386a929a42d1. This patch fixes the following error: /home/runner/work/llvm-project/llvm-project/flang/runtime/io-api-minimal.cpp:153:11: error: '__libcpp_verbose_abort' is missing exception specification 'noexcept' 153 \| void std::__libcpp_verbose_abort(char const format, ...) { \| ^ \| noexcept /mnt/build/bin/../include/c++/v1/__verbose_abort:30:28: note: previous declaration is here 30 \| __printf__, 1, 2) void __libcpp_verbose_abort(const char __format, ...) _LIBCPP_VERBOSE_ABORT_NOEXCEPT; \| ^ 1 error generated.	2025-02-19 06:53:30 -08:00
Michael Kruse	4c4fc4650f	[Flang-RT] Build libflang_rt.so (#121782 ) Under non-Windows platforms, also create a dynamic library version of the runtime. The static and the dynamic version of the library can be switched on using FLANG_RT_ENABLE_STATIC=ON and FLANG_RT_ENABLE_SHARED=ON, respectively. Default is to build only the static library, consistent with previous behaviour. This is because the way the flang driver invokes the linker, most linkers choose the dynamic library by default, if available. Building the dynamic library therefore causes flang-built executables to depend on `libflang_rt.so`, unless explicitly told otherwise.	2025-02-17 12:53:12 +01:00
Michael Kruse	b55f7512a7	[Flang] LLVM_ENABLE_RUNTIMES=flang-rt (#110217 ) Extract Flang's runtime library to use the LLVM_ENABLE_RUNTIME mechanism. It will only become active when `LLVM_ENABLE_RUNTIMES=flang-rt` is used, which also changes the `FLANG_INCLUDE_RUNTIME` to `OFF` so the old runtime build rules do not conflict. This also means that unless `LLVM_ENABLE_RUNTIMES=flang-rt` is passed, nothing changes with the current build process. Motivation: * Consistency with LLVM's other runtime libraries (compiler-rt, libc, libcxx, openmp offload, ...) * Allows compiling the runtime for multiple targets at once using the LLVM_RUNTIME_TARGETS configuration options * Installs the runtime into the compiler's per-target resource directory so it can be automatically found even when cross-compiling Also see RFC discussion at https://discourse.llvm.org/t/rfc-use-llvm-enable-runtimes-for-flangs-runtime/80826	2025-02-16 15:39:52 +01:00
Michael Kruse	81c85ea30f	[flang-rt] Fix aarch64-libcxx build failure There seem to be multiple declarations of __libcpp_verbose_abort, some with noexcept and some without. Reverting to the previous forward-declaration (without noexcept) which seems to have worked before.	2025-02-16 13:46:30 +01:00
Michael Kruse	54f37133b7	[Flang][NFC] Move runtime library files to flang-rt (#110298 ) Mostly mechanical changes in preparation of extracting the Flang-RT "subproject" in #110217. This PR intends to only move pre-existing files to the new folder structure, with no behavioral change. Common files (headers, testing, cmake) shared by Flang-RT and Flang remain in `flang/`. Some cosmetic changes and file path updates were necessary: * Relative paths to the new path for the source files and `add_subdirectory`. * Add the new location's include directory to `include_directories` * The unittest/Evaluate directory has unit tests for flang-rt and Flang. A new `CMakeLists.txt` was introduced for the flang-rt tests. * Change the `#include` paths relative to the include directive * clang-format on the `#include` directives * Since the paths are part of the copyright header and include guards, a script was used to canonicalize those * `test/Runtime` and runtime tests in `test/Driver` are moved, but the lit.cfg.py mechanism to execute them will only be added in #110217.	2025-02-16 13:25:31 +01:00

44 Commits