llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-25 01:16:05 +00:00

Author	SHA1	Message	Date
Valentin Clement (バレンタインクレメン)	16c2a1016e	Revert "[flang] Allow to pass an async id to allocate the descriptor (#118713 )" (#119109 ) This reverts commit 7d1c661381d36018fd105f4ad4c2d6dc45e7288b. This commit breaks some device runtime builds. Need time to investigate.	2024-12-07 19:55:12 -08:00
Michael Kruse	c91ba04328	[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188 ) Split some headers into headers for public and private declarations in preparation for #110217. Moving the runtime-private headers in runtime-private include directory will occur in #110298. * Do not use `sizeof(Descriptor)` in the compiler. The size of the descriptor is target-dependent while `sizeof(Descriptor)` is the size of the Descriptor for the host platform which might be too small when cross-compiling to a different platform. Another problem is that the emitted assembly ((cross-)compiling to the same target) is not identical between Flang instances running on different systems. Moving the declaration of `class Descriptor` out of the included header will also reduce the amount of #included sources. * Do not use `sizeof(ArrayConstructorVector)` and `alignof(ArrayConstructorVector)` in the compiler. Same reason as with `Descriptor`. * Compute the descriptor's extra flags without instantiating a Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime source, but not the compiler source. * Move `InquiryKeywordHashDecode` into runtime-private header. The function is defined in the runtime sources and trying to call it in the compiler would lead to a link-error. * Move allocator-kind magic numbers into common header. They are the only declarations out of `allocator-registry.h` in the compiler as well. This does not make Flang cross-compile ready yet; the main goal is to avoid transitive header dependencies from Flang to clang-rt. There are more assumptions that the host platform is the same as the target platform.	2024-12-06 15:29:00 +01:00
Valentin Clement (バレンタインクレメン)	83ccaad473	[flang][cuda] Use async id for device stream allocation (#118733 ) When stream is specified use cudaMallocAsync with the specified stream	2024-12-05 08:57:10 -08:00
Michael Kruse	0cda970ecc	[Flang][NFC] Split common headers to reduce dependencies. (#110244 ) Fortran.h and target.h define symbols, some of which are used by both the Fortran runtime (Flang-RT) and the Fortran compiler (Flang), and others by Flang only. With the upcoming refactoring of the Fortran runtime into its own subproject (#110217), move the declarations that are used by both into new headers to minimize the amount of code that will need to be shared by Flang-RT and Flang. Details: * `Fortran.h`: Flang-RT only uses some enum definitions out of this file, but not `AsFortran` which is defined in `Fortran.cpp`. Moving the enums into `Fortran-consts.h` allows keeping `Fortran.cpp` within Flang. * `target.h`: Contains some floating-point definitions that are used by the non-GTest unittests in `fp-testing.h`. Flang-RT also uses some non-GTest unittests. Moving those definitions avoids the dependence on the entire FortranEvaluate library.	2024-12-05 11:29:32 +01:00
Valentin Clement (バレンタインクレメン)	7d1c661381	[flang] Allow to pass an async id to allocate the descriptor (#118713 ) This patch prepares for supporting the stream-ordered memory allocator in CUDA Fortran. It adds an asynchronous id to the AllocatableAllocate runtime function and to Descriptor::Allocate so it can be passed down to the registered allocator. It is up to the allocator to use this value or not. A follow-up patch will implement that asynchronous allocator for CUDA Fortran.	2024-12-04 18:24:40 -08:00
Kelvin Li	af35e21cfe	[flang] Update CommandTest for AIX (NFC) (#118403 ) With the change in commit e335563, the behavior for `ECLGeneralErrorCommandErrorSync` on AIX is the same as on Linux.	2024-12-03 10:27:14 -05:00
David Truby	e335563806	[NFC][flang] Fix execute_command_line test for odd environments (#117714 ) One of the execute_command_line tests currently runs `cat` on an invalid file and checks its return value, but since we don't control `cat` or the user's path, the return value might not be reliably stable on a per-platform basis. For example, if `git` is installed on Windows in certain configurations it adds a directory to the path containing a `cat` with a different set of error codes to the default Windows one. This patch changes the test to use the `not` binary built by LLVM for testing purposes, which should always return 1 on any platform regardless of the user's environment.	2024-11-27 00:43:56 +00:00
Valentin Clement	308c00749d	[flang][cuda][NFC] Fix format	2024-11-01 12:42:06 -07:00
Valentin Clement (バレンタインクレメン)	32473864cb	[flang][cuda] Data transfer with descriptor (#114598 ) Reopen PR #114302 as it was automatically closed. Review in #114302	2024-11-01 12:35:48 -07:00
jeanPerier	c4204c0b29	[flang] replace fir.complex usages with mlir complex (#110850 ) Core patch of https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292. After that, the last step is to remove fir.complex from FIR types.	2024-10-03 17:10:57 +02:00
Yusuke MINATO	b91a25ef58	[flang] add nsw to operations in subscripts (#110060 ) This patch adds nsw to operations when lowering subscripts. See also the discussion in the following discourse post. https://discourse.llvm.org/t/rfc-add-nsw-flags-to-arithmetic-integer-operations-using-the-option-fno-wrapv/77584/9	2024-10-03 10:56:01 +09:00
David Truby	856c38d542	[flang] Implement GETUID and GETGID intrinsics (#110679 ) GETUID and GETGID are non-standard intrinsics supported by a number of other Fortran compilers. On supported platforms these intrinsics simply call the POSIX getuid() and getgid() functions and return the result. The only platform we support that does not have these is Windows. Windows does not have the same concept of UIDs and GIDs, so on Windows we issue a warning indicating this and return 1 from both functions. Co-authored-by: Yi Wu <yi.wu2@arm.com>	2024-10-02 13:26:40 +01:00
jeanPerier	3b7989cd9b	[flang] remove support for std::complex value lowering. (#110643 ) To avoid ABI issues, std::complex should be passed/returned by reference to the runtime. Part of the [RFC to use mlir complex type](https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292).	2024-10-02 10:11:58 +02:00
David Truby	7a0a7947ee	Revert "[flang] Implement GETUID and GETGID intrinsics" (#110531 ) Reverts llvm/llvm-project#108017	2024-09-30 17:35:27 +01:00
David Truby	054eadcb11	[flang] Implement GETUID and GETGID intrinsics (#108017 ) GETUID and GETGID are non-standard intrinsics supported by a number of other Fortran compilers. On supported platforms these intrinsics simply call the POSIX getuid() and getgid() functions and return the result. The only platform we support that does not have these is Windows. Windows does not have the same concept of UIDs and GIDs, so on Windows we issue a warning indicating this and return 1 from both functions. Co-authored-by: Yi Wu <yi.wu2@arm.com>	2024-09-30 14:36:39 +01:00
Valentin Clement (バレンタインクレメン)	fa627d98e8	[flang][cuda] Add entry point for alloc/free and simple copy (#109867 ) These will be used to translate simple cuf.alloc/cuf.free and cuf.data_transfer on scalar and constant size arrays.	2024-09-24 20:00:11 -07:00
Slava Zakharin	fc51c7f0cc	[flang][runtime] Disable LDBL_MANT_DIG == 113 for the offload builds. (#109339 ) When compiling on aarch64 some `LDBL_MANT_DIG == 113` entries end up trying to use `complex<long double>` for which there are no certain specializations in `libcudacxx`. This change-set includes a clean-up for `LDBL_MANT_DIG == 113` usage, which is replaced with `HAS_LDBL128` that is set in `float128.h`.	2024-09-19 15:45:45 -07:00
Valentin Clement (バレンタインクレメン)	cdf447baa5	[flang][cuda] Add function to allocate and deallocate device module variable (#109213 ) This patch adds new runtime entry points that perform the simple allocation/deallocation of module allocatable variables with CUDA attributes. When the allocation is initiated on the host, the descriptor on the device is synchronized. Both descriptors point to the same data on the device. This is the first PR of a stack.	2024-09-18 20:22:06 -07:00
Slava Zakharin	104f3c1806	Reland "[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build. (#109078 )" (#109207 ) `std::complex` operators do not work for the CUDA device compilation of F18 runtime. This change makes use of `cuda::std::complex` from `libcudacxx`. `cuda::std::complex` does not have specializations for `long double`, so the change is accompanied with a clean-up for `long double` usage. Additional change on top of #109078 is to use `cuda::std::complex` only for the device compilation, otherwise the host compilation fails because `libcudacxx` may not support `long double` specialization at all (depending on the compiler).	2024-09-18 17:41:33 -07:00
Slava Zakharin	36192fdfb9	Revert "[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build." (#109173 ) Reverts llvm/llvm-project#109078	2024-09-18 11:22:31 -07:00
Slava Zakharin	be187a6812	[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build. (#109078 ) `std::complex` operators do not work for the CUDA device compilation of F18 runtime. This change makes use of `cuda::std::complex` from `libcudacxx`. `cuda::std::complex` does not have specializations for `long double`, so the change is accompanied with a clean-up for `long double` usage.	2024-09-18 10:59:05 -07:00
Youngsuk Kim	d5dd7d230e	[flang] Tidy uses of raw_string_ostream (NFC) As specified in the docs, 1) raw_string_ostream is always unbuffered and 2) the underlying buffer may be used directly ( 65b13610a5226b84889b923bae884ba395ad084d for further reference ) Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.	2024-09-17 12:20:21 -05:00
Peter Klausler	7aad87312a	[flang][runtime] Accept some real input for integer NAMELIST (#108268 ) A few other Fortran compilers silently accept real values for integer variables in NAMELIST input. Handling an exponent would be difficult, but it's easy to skip and ignore a fractional part when one is present.	2024-09-12 09:14:20 -07:00
Peter Klausler	500f6cc25c	[flang][runtime] Support SPACING for REAL(2 & 3) (#106575 ) Add runtime APIs for the intrinsic function SPACING for REAL kinds 2 & 3 in two ways: Spacing2 (& 3) for build environments with std::float16_t, and Spacing2By4 (& 3By4) variants (for any build environment) which compute SPACING for those types but accept and return their values as 32-bit floats. SPACING for REAL(2) is needed by HDF5.	2024-09-04 10:53:22 -07:00
Peter Klausler	9e53e77265	[flang] Fix warnings from more recent GCCs (#106567 ) While experimenting with some more recent C++ features, I ran into trouble with warnings from GCC 12.3.0 and 14.2.0. These warnings looked legitimate, so I've tweaked the code to avoid them.	2024-09-04 10:52:51 -07:00
Kelvin Li	ba5e8fcece	[flang] Adjust execute_command_line intrinsic return values for AIX (NFC) (#106472 )	2024-08-29 15:21:06 -04:00
Kelvin Li	66ab4b80a4	[flang] Re-enable date_and_time intrinsic test (NFC) (#104967 )	2024-08-20 15:47:15 -04:00
Valentin Clement (バレンタインクレメン)	5c3a3dc9eb	[flang][cuda] Add version in libCufRuntime name (#104506 )	2024-08-15 20:45:33 -07:00
Valentin Clement	743e99dcf5	Reland "[flang][cuda] Use cuda runtime API #103488 " CUDA Fortran is meant to be an equivalent to the runtime API. Therefore, it makes more sense to use the cuda rt API in the allocators for CUF.	2024-08-14 14:56:00 -07:00
Valentin Clement (バレンタインクレメン)	f6e3dbc27d	Revert "[flang][cuda] Use cuda runtime API" (#104232 ) Reverts llvm/llvm-project#103488	2024-08-14 13:44:49 -07:00
Valentin Clement (バレンタインクレメン)	00ab8a6a4c	[flang][cuda] Use cuda runtime API (#103488 ) CUDA Fortran is meant to be an equivalent to the runtime API. Therefore, it makes more sense to use the cuda rt API in the allocators for CUF. @bdudleback	2024-08-14 12:34:45 -07:00
Valentin Clement (バレンタインクレメン)	4c1dbbe7aa	[flang][cuda] Make CUFRegisterAllocator callable from C/Fortran (#102543 )	2024-08-08 17:09:53 -07:00
Valentin Clement (バレンタインクレメン)	10d7805c4f	[flang][cuda][NFC] Disambiguate namespace with cuf dialect (#102194 ) Rename namespace `Fortran::runtime::cuf` to `Fortran::runtime::cuda` to avoid ambiguity with the namespace `::cuf` that is defined in the CUF dialect.	2024-08-06 14:04:45 -07:00
Valentin Clement (バレンタインクレメン)	a3ccaed3b9	[flang][cuda] Allocate local descriptor in managed memory (#102060 ) This patch adds entry points in the runtime to allocate descriptors in managed memory. These entry points currently only call `CUFAllocManaged` and `CUFFreeManaged` but could be more complicated in the future. `cuf.alloc` and `cuf.free` related to local descriptors are converted into runtime calls.	2024-08-06 11:17:11 -07:00
Kazu Hirata	6f8ef5ad2f	[flang] Construct SmallVector with ArrayRef (NFC) (#101901 )	2024-08-05 04:03:10 -07:00
Valentin Clement (バレンタインクレメン)	bbdb1e400f	[flang][cuda] Set the allocator on fir.embox operation (#101722 ) This patch sets the `allocator_idx` attribute for allocatable descriptors that have a specific CUDA attribute.	2024-08-02 14:00:26 -07:00
Peter Klausler	b1a1d4e08e	[flang][runtime] Don't emit excess digits for ES0.0E0 or EN0.0E0 (#101238 ) Don't emit any digits after the decimal point.	2024-08-02 12:04:12 -07:00
Valentin Clement (バレンタインクレメン)	1417633943	[flang][cuda] Add CUF allocator (#101216 ) Add allocators for CUDA Fortran allocation on the device. Three allocators are added, for pinned, device, and managed/unified memory allocation. `CUFRegisterAllocator()` is called to register the allocators in the allocator registry added in #100690. Since this requires CUDA, a CMake option `FLANG_CUF_RUNTIME` is added to conditionally build these.	2024-08-02 10:02:34 -07:00
vdonaldson	4cdc19b84c	[flang] IEEE_NEXT_AFTER, IEEE_NEXT_DOWN, IEEE_NEXT_UP, NEAREST (#100782 ) IEEE_ARITHMETIC intrinsic module procedures IEEE_NEXT_AFTER, IEEE_NEXT_DOWN, and IEEE_NEXT_UP, and intrinsic NEAREST return larger or smaller values adjacent to their primary REAL argument. The four procedures vary in how the direction is chosen, in how special cases are treated, and in what exceptions are generated. Implement the three IEEE_ARITHMETIC procedures. Update the NEAREST implementation to support all six REAL kinds 2,3,4,8,10,16, and fix several bugs. IEEE_NEXT_AFTER(X,Y) returns a NaN when Y is a NaN as that seems to be the universal choice of other compilers. Change the front end compile time implementation of these procedures to return normal (HUGE) values for infinities when applicable, rather than always returning the input infinity.	2024-07-29 09:22:36 -04:00
Alexis Perry-Holby	f1d3fe7aae	Add basic -mtune support (#98517 ) Initial implementation for the -mtune flag in Flang. This PR is a clean version of PR #96688, which is a re-land of PR #95043	2024-07-16 16:48:24 +01:00
Slava Zakharin	8ce1aed55f	[flang] Lower MATMUL to type specific runtime calls. (#97547 ) Lower MATMUL to the new runtime entries added in #97406.	2024-07-03 21:18:56 -07:00
Slava Zakharin	dd22085308	[flang][runtime] Split MATMUL[_TRANSPOSE] into separate entries. (#97406 ) Device compilation is much faster for separate MATMUL[_TRANSPOSE] entries than for a single one that covers all data types. The lowering changes and the removal of the generic entries will follow.	2024-07-02 21:30:37 -07:00
Yi Wu	dfd2711f8f	Revert "Revert "[flang] Fix execute_command_line cmdstat is not set when error occurs" (#96365 )" (#96774 ) The fix broke llvm-test-suite, so it was reverted previously. With test fixes added in https://github.com/llvm/llvm-test-suite/pull/137, it should now pass the tests. This reverts commit 435635652fd226fa292abcff6a10d3df9dbd74e3.	2024-06-27 10:53:56 +01:00
Tarun Prabhu	8dd9494056	Revert "[flang] Add basic -mtune support" (#96678 ) Reverts llvm/llvm-project#95043	2024-06-25 13:25:39 -06:00
Alexis Perry-Holby	a790279bf2	[flang] Add basic -mtune support (#95043 ) This PR adds -mtune as a valid flang flag and passes the information through to LLVM IR as an attribute on all functions. No specific architecture optimizations are added at this time.	2024-06-25 18:39:35 +01:00
David Truby	954b692bd7	[flang] Allow derf as alternate spelling for erf (#95784 ) This patch adds derf as an alternate spelling for the erf intrinsic. This spelling is supported by multiple other compilers and used by WRF.	2024-06-25 01:24:49 +01:00
Peter Klausler	5d15f606da	[flang][preprocessing] Mix preprocessing directives with free form line continuation (#96244 ) Allow preprocessing directives to appear between a source line and its continuation, including conditional compilation directives (#if, #ifdef, &c.). Fixes https://github.com/llvm/llvm-project/issues/95476.	2024-06-24 10:18:50 -07:00
Kiran Chandramohan	435635652f	Revert "[flang] Fix execute_command_line cmdstat is not set when error occurs" (#96365 ) Reverts llvm/llvm-project#93023 Reverting due to buildbot failure. https://lab.llvm.org/buildbot/#/builders/41/builds/227 test-suite :: Fortran/gfortran/regression/gfortran-regression-execute-regression__execute_command_line_3_f90	2024-06-21 23:47:13 +01:00
Yi Wu	4232dd586b	[flang] Fix execute_command_line cmdstat is not set when error occurs (#93023 ) Fixes: https://github.com/llvm/llvm-project/issues/92929 Also added cmdstat for common Linux return codes 1, 126, 127	2024-06-21 14:42:07 +01:00
Peter Klausler	f8fc883da9	[flang][runtime] Distinguish VALUE from non-VALUE operations in REDUCE (#95297 ) Accommodate operations with VALUE dummy arguments in the runtime support for the REDUCE intrinsic function by splitting most entry points into Reduce...Ref and Reduce...Value variants. Further work will be needed in lowering to call the ...Value entry points.	2024-06-13 11:10:32 -07:00

463 Commits