llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-25 02:26:05 +00:00

Author	SHA1	Message	Date
Kareem Ergawy	f9734b9df1	[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in `omp.target` ops (#116576 ) This PR adds support to translate the `private` clause from MLIR to LLVMIR when used on allocatables in the context of an `omp.target` op. This replaces https://github.com/llvm/llvm-project/pull/113208. Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the latest commit is relevant to the PR.	2024-12-12 14:39:58 +01:00
Tom Eccles	32403f79f4	[flang][unittests] fix test broken when run as root (#119604 ) It is convenient to run tests as root inside of a docker container. The test (and the library function it is testing) are already unsupported on Windows so it is safe to use UNIX-isms here.	2024-12-12 09:41:44 +00:00
Valentin Clement (バレンタインクレメン)	956d0dd624	[flang][cuda] Support builtin global in device global pass (#119626 )	2024-12-11 17:09:56 -08:00
Valentin Clement (バレンタインクレメン)	151901c762	[flang][rt][device] Use enum-set.h as Fortran.h (#119611 )	2024-12-11 15:38:38 -08:00
Slava Zakharin	5eef9ba784	[flang] Inline hlfir.cshift as hlfir.elemental. (#119480 )	2024-12-11 15:00:07 -08:00
Leandro Lupori	db9856b516	[flang][OpenMP][NFC] Turn symTable into a reference (#119435 ) Convert `DataSharingProcessor::symTable` from pointer to reference. This avoids accidental null pointer dereferences and makes it possible to use `symTable` when delayed privatization is disabled.	2024-12-11 16:26:19 -03:00
Mats Petersson	00e1cc4c9d	[flang][OpenMP]Add support for fail clause (#118683 ) Support the fail(memory-order) clause of atomic compare. Additional tests are introduced to check that parsing and semantic checks for the new clause are handled. Lowering for atomic compare is still unsupported and will end in a TODO (aka "Not yet implemented"). A test for this case with the fail clause is also present.	2024-12-11 16:29:02 +00:00
Paul Osmialowski	03019c687f	[clang][driver] When -fveclib=ArmPL flag is in use, always link against libamath (#116432 ) Using `-fveclib=ArmPL` without `-lamath` likely results in link-time errors.	2024-12-11 14:01:29 +00:00
khaki3	609899f443	[flang][cuda] Avoid stack corruption when setting kernel launch parameters (#119469 ) In order to get the pointer to a structure member, `getelementptr` typically requires two indices: one to indicate the structure itself, and another to specify the member's position. We are missing the former in `GPULaunchKernelConversion`, so generated code may cause stack corruption. This PR corrects the indices of a structure used as a kernel launch temp.	2024-12-10 16:08:22 -08:00
Valentin Clement (バレンタインクレメン)	850c932f05	[flang][cuda] Walk through cuf kernel for implicit globals (#119455 ) Globals used in cuf kernel need to be flagged as well.	2024-12-10 14:01:53 -08:00
Valentin Clement (バレンタインクレメン)	8c19c24a78	[flang][cuda][NFC] Add missing template declaration (#119443 )	2024-12-10 13:10:23 -08:00
Valentin Clement (バレンタインクレメン)	dc5236e6b1	[flang][cuda] Update target rewrite to work on gpu.func (#119283 ) Update the pass so it can perform the signature rewrite on gpu.func.	2024-12-10 12:36:49 -08:00
khaki3	e9866d5d14	[flang][cuda] Fix GPULaunchKernelConversion to generate correct kernel launch parameters (#119431 ) For the call to _FortranACUFLaunchKernel, we store the pointer to a member of a temporary structure in a parameter array. However, when we obtain an element pointer from the parameter array, its address is calculated based on the type of the structure. This PR properly treats the parameter array as an array of pointers. Example: ```mlir %30 = llvm.load %29 : !llvm.ptr -> i32 %31 = llvm.mlir.constant(1 : i32) : i32 %32 = llvm.alloca %31 x !llvm.struct<(i64, i64, i32, ptr)> : (i32) -> !llvm.ptr %33 = llvm.mlir.constant(4 : i32) : i32 %34 = llvm.alloca %33 x !llvm.ptr : (i32) -> !llvm.ptr %35 = llvm.mlir.constant(0 : i32) : i32 %36 = llvm.getelementptr %32[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.struct<(i64, i64, i32, ptr)> llvm.store %8, %36 : i64, !llvm.ptr %37 = llvm.getelementptr %34[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.struct<(i64, i64, i32, ptr)> llvm.store %36, %37 : !llvm.ptr, !llvm.ptr ... llvm.call @_FortranACUFLaunchKernel(%47, %8, %8, %8, %2, %8, %8, %7, %34, %48) : (!llvm.ptr, i64, i64, i64, i64, i64, i64, i32, !llvm.ptr, !llvm.ptr) -> () ``` In this example, `%37 = llvm.getelementptr %34[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.struct<(i64, i64, i32, ptr)>` will be `%37 = llvm.getelementptr %34[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.ptr`.	2024-12-10 11:32:32 -08:00
Valentin Clement (バレンタインクレメン)	0469bb91aa	[flang][cuda] Fix lowering when step is a variable (#119421 ) Add missing conversion.	2024-12-10 09:48:15 -08:00
Slava Zakharin	c7634c1b61	[flang] Disabled hlfir.sum inlining by default. (#119287 ) To temporarily address exchange2 perf regression reported in #118556 I disabled the inlining by default, and put it under engineering option `-flang-simplify-hlfir-sum`.	2024-12-10 09:18:50 -08:00
jeanPerier	28a0ad09c1	[flang][hlfir] fix issue 118922 (#119219 ) hlfir.elemental codegen optimizes out the final as_expr copy for temps local to its body, but sometimes clean-up may have been emitted for this temp, and the code did not handle that. This caused #118922 and #113843. Only elide the copy if the as_expr is the last op.	2024-12-10 15:00:32 +01:00
Paul Osmialowski	f8a1f42dd5	[test][flang][driver] Fix test that assumes libomp default (#119368 ) This patch supplements the fix introduced by PR #119319.	2024-12-10 13:52:55 +00:00
NimishMishra	edc50f3954	[flang][OpenMP] Add lowering support for task detach (#119128 ) This PR adds lowering task detach to MLIR.	2024-12-10 03:25:06 -08:00
执着	e8baa792e7	Backtrace support for flang (#118179 ) Fixed build failures in old PRs due to missing files	2024-12-10 10:31:48 +00:00
Yusuke MINATO	a88677edc0	Reland "[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv" (#118933 ) This relands #110063. The performance issue on 503.bwaves_r is found not to be related to the patch, and is resolved by fbd89bcc when LTO is enabled.	2024-12-10 16:26:53 +09:00
Valentin Clement	7bcd459dce	[flang][cuda][NFC] Fix typo in test filename	2024-12-09 19:22:30 -08:00
Valentin Clement (バレンタインクレメン)	a1d71c3693	[flang][cuda] Additional update to ExternalNameConversion (#119276 )	2024-12-09 17:39:51 -08:00
Valentin Clement (バレンタインクレメン)	650e736904	[flang][cuda][NFC] Add some diagnostic when module or fct are not found (#119277 )	2024-12-09 17:39:36 -08:00
Valentin Clement (バレンタインクレメン)	75623bfe1b	[flang][cuda] Handle gpu.return in AbstractResult pass (#119035 )	2024-12-09 17:39:16 -08:00
Razvan Lupusoru	a0eb794da8	[MLIR][acc] Introduce varType to acc data clause operations (#119007 ) The acc data clause operations hold an operand named `varPtr`. This was intended to hold a pointer to a variable - where the element type of that pointer specifies the type of the variable. However, for both memref and llvm dialects, this assumption is not true. This is because memref element type for cases like memref<10xf32> is simply f32 and for LLVM, after opaque pointers, the variable type is no longer recoverable. Thus, introduce varType to ensure that appropriate semantics are kept. Both the parser and printer for this new type attribute allow it to not be specified in cases where a dialect's getElementType() applied to `varPtr`'s type has a recoverable type. And more specifically, for FIR, no changes are needed in the MLIR unit tests.	2024-12-09 15:14:48 -08:00
Slava Zakharin	44cd8f0d06	[flang] Lower CSHIFT to hlfir.cshift operation. (#118917 )	2024-12-09 14:02:58 -08:00
Valentin Clement (バレンタインクレメン)	1d4b5c161f	[flang][cuda] Change how abstract result pass is scheduled on func.func and gpu.func (#119034 ) Use `pm.nest` to schedule the pass on nested `func.func` and `gpu.func` in the `gpu.module`. AbstractResult pass is not meant to run on the whole gpu.module at once.	2024-12-09 13:31:27 -08:00
Slava Zakharin	110b891f93	[flang] Added lowering for hlfir.cshift operation. (#118918 )	2024-12-09 11:02:11 -08:00
Kiran Chandramohan	4e59721cc6	[Flang][OpenMP] Make boxed procedure pass aware of OpenMP private ops (#118261 ) Fixes #109727	2024-12-09 17:27:18 +00:00
Kiran Chandramohan	2344cc4983	[Flang] Update Maintainers (#117124 ) Move to a markdown file and update maintainers. This brings the project closer to updated guidance (https://llvm.org/docs/DeveloperPolicy.html#maintainers). A list of active and inactive maintainers is provided. Maintainers are also grouped into lead or component maintainers.	2024-12-09 17:18:06 +00:00
Slava Zakharin	084451cdd2	[flang] Do not inline SUM with invalid DIM argument. (#118911 ) Such SUMs might appear in dead code after constant propagation. They do not have to be inlined.	2024-12-09 07:55:52 -08:00
Slava Zakharin	1ca392764a	[flang] Added definition of hlfir.cshift operation. (#118732 ) CSHIFT intrinsic will be lowered to this operation, which then can be optimized as inline sequence or lowered into a runtime call.	2024-12-09 07:55:22 -08:00
Zhaoxin Yang	669f704d0d	[Flang][LoongArch] Enable clang command-line options in flang. (#118244 ) Mainly including the following LoongArch specific options: -m[no-]lsx, -m[no-]lasx, -msimd=, -m[no-]frecipe, -m[no-]lam-bh, -m[no-]lamcas, -m[no-]ld-seq-sa, -m[no-]div32, -m[no-]annotate-tablejump	2024-12-09 19:59:39 +08:00
Valentin Clement (バレンタインクレメン)	16c2a1016e	Revert "[flang] Allow to pass an async id to allocate the descriptor (#118713 )" (#119109 ) This reverts commit 7d1c661381d36018fd105f4ad4c2d6dc45e7288b. This commit breaks some device runtime builds. Need time to investigate.	2024-12-07 19:55:12 -08:00
Paul Osmialowski	755519f7f6	[clang][driver] Use $ prefix with config file options to have them added after all of the command line options (#117573 ) Currently, if a -l (or -Wl,) flag is added into a config file (e.g. clang.cfg), it is situated before any object file in the effective command line. If the library requested by given -l flag is static, its symbols will not be made visible to any of the object files provided by the user. Also, the presence of any of the linker flags in a config file confuses the driver whenever the user invokes clang without any parameters (see issue #67209). This patch attempts to solve both of the problems, by allowing a split of the arguments list into two parts. The head part of the list will be used as before, but the tail part will be appended after the command line flags provided by the user and only when it is known that the linking should occur. The $-prefixed arguments will be added to the tail part.	2024-12-07 11:18:44 +00:00
Thirumalai Shaktivel	e73ec1a74a	[Flang][OpenMP] Add some semantic checks for Linear clause (#111354 ) This PR adds all the missing semantics for the Linear clause based on the OpenMP 5.2 restrictions. The restriction details are mentioned below. OpenMP 5.2: 5.4.6 linear Clause restrictions - A linear-modifier may be specified as ref or uval only on a declare simd directive. - If linear-modifier is not ref, all list items must be of type integer. - If linear-modifier is ref or uval, all list items must be dummy arguments without the VALUE attribute. - List items must not be Cray pointers or variables that have the POINTER attribute. Cray pointer support has been deprecated. - If linear-modifier is ref, list items must be polymorphic variables, assumed-shape arrays, or variables with the ALLOCATABLE attribute. - A common block name must not appear in a linear clause. - The list-item cannot appear more than once 4.4.4 ordered Clause restriction - If n is explicitly specified, a linear clause must not be specified on the same directive. 5.11 aligned Clause restriction - Each list item must have C_PTR or Cray pointer type or have the POINTER or ALLOCATABLE attribute. Cray pointer support has been deprecated.	2024-12-06 12:11:46 -06:00
Krzysztof Parzyszek	02db35a1d6	[flang][OpenMP] Implement `CheckReductionObjects` for all reduction c… (#118689 ) …lauses Currently we only do semantic checks for REDUCTION. There are two other clauses, IN_REDUCTION, and TASK_REDUCTION which will also need those checks. Implement a function that checks the common list-item requirements for all those clauses.	2024-12-06 12:00:48 -06:00
jeanPerier	d6ec7c82f3	[flang][CUF] fix missing header after #112188 (#118993 ) Otherwise, builds with `-DFLANG_CUF_RUNTIME` hit: ``` runtime/CUDA/descriptor.cpp:44:24: error: invalid use of incomplete type 'const class Fortran::runtime::Descriptor' 44 | std::size_t count{src->SizeInBytes()}; ```	2024-12-06 17:22:47 +01:00
Michael Kruse	c91ba04328	[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188 ) Split some headers into headers for public and private declarations in preparation for #110217. Moving the runtime-private headers in runtime-private include directory will occur in #110298. * Do not use `sizeof(Descriptor)` in the compiler. The size of the descriptor is target-dependent while `sizeof(Descriptor)` is the size of the Descriptor for the host platform which might be too small when cross-compiling to a different platform. Another problem is that the emitted assembly ((cross-)compiling to the same target) is not identical between Flang's running on different systems. Moving the declaration of `class Descriptor` out of the included header will also reduce the amount of #included sources. * Do not use `sizeof(ArrayConstructorVector)` and `alignof(ArrayConstructorVector)` in the compiler. Same reason as with `Descriptor`. * Compute the descriptor's extra flags without instantiating a Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime source, but not the compiler source. * Move `InquiryKeywordHashDecode` into runtime-private header. The function is defined in the runtime sources and trying to call it in the compiler would lead to a link-error. * Move allocator-kind magic numbers into common header. They are the only declarations out of `allocator-registry.h` in the compiler as well. This does not make Flang cross-compile ready yet, the main goal is to avoid transitive header dependencies from Flang to clang-rt. There are more assumptions that host platform is the same as the target platform.	2024-12-06 15:29:00 +01:00
Renaud Kauffmann	27e458c8cb	[flang][cuda] Distinguish constant fir.global from globals with a #cuf.cuda<constant> attribute (#118912 ) 1. In `CufOpConversion` `isDeviceGlobal` was renamed `isRegisteredGlobal` and moved to the common file. `isRegisteredGlobal` excludes constant `fir.global` operations from registration. This is to avoid calls to `_FortranACUFGetDeviceAddress` on globals which do not have any symbols in the runtime. This was done for `_FortranACUFRegisterVariable` in #118582, but also needs to be done here after #118591. 2. `CufDeviceGlobal` no longer adds the `#cuf.cuda<constant>` attribute to the constant global. As discussed in #118582, a module variable with the #cuf.cuda<constant> attribute is not a compile time constant. Yet, the compile time constant also needs to be copied into the GPU module. The candidates for copy to the GPU modules are - the globals needing registration regardless of their uses in device code (they can be referred to in host code as well) - the compile time constants when used in device code 3. The registration of "constant" module device variables (#cuf.cuda<constant>) can be restored in `CufAddConstructor`	2024-12-05 18:36:48 -08:00
Slava Zakharin	cc46d0bee9	[flang] Expand SUM(DIM=CONSTANT) into an hlfir.elemental. (#118556 ) An array SUM with the specified constant DIM argument may be expanded into hlfir.elemental with a reduction loop inside it processing all elements of the specified dimension. The expansion allows further optimization of the cases like `A=SUM(B+1,DIM=1)` in the optimized bufferization pass (given that it can prove there are no read/write conflicts).	2024-12-05 09:36:12 -08:00
Slava Zakharin	3f0cc068ce	[flang] Assume matching shapes in elemental assignment with non-realloc lhs. (#118552 ) The optimized bufferization pass cannot optimize very simple cases of elemental assignments, because of the suboptimal checks order. This patch relies on the fact that in a legal program the lhs and rhs of an assignment have matching shapes, when lhs is not an allocatable and rhs is a result of an elemental array operation.	2024-12-05 09:34:32 -08:00
Valentin Clement (バレンタインクレメン)	83ccaad473	[flang][cuda] Use async id for device stream allocation (#118733 ) When stream is specified use cudaMallocAsync with the specified stream	2024-12-05 08:57:10 -08:00
Krzysztof Parzyszek	8a90b5b317	[flang][test] Change re.I to flags=re.I in re.sub Follow-up to da6099c9ad. As a positional argument, the `re.I` was in place of `count`, not `flags`.	2024-12-05 09:41:40 -06:00
jeanPerier	ff78cd5f3d	[flang] fix private pointers and default initialized variables (#118494 ) Both OpenMP privatization and DO CONCURRENT LOCAL lowering were incorrect for pointers and derived types with default initialization. For pointers, the descriptor was not established with the rank/type code/element size, leading to undefined behavior if any inquiry was made to it prior to a pointer assignment (and if/when using the runtime for pointer assignments, the descriptor must have been established). For derived types with default initialization, the copies were not default initialized.	2024-12-05 14:09:48 +01:00
Krzysztof Parzyszek	da6099c9ad	[flang][test] Recognize !$acc and !$omp spelled with capital letters (#118666 ) If there are any continuation lines in the source, they will be printed by the unparser with capital letters (at least in case of OpenMP). To avoid having them stripped out, recognize their spellings using capital letters as well. --------- Co-authored-by: Michael Kruse <github@meinersbur.de>	2024-12-05 06:44:38 -06:00
Michael Kruse	0cda970ecc	[Flang][NFC] Split common headers to reduce dependencies. (#110244 ) Fortran.h and target.h define symbols, some of which are used by both the Fortran runtime (Flang-RT) and the Fortran compiler (Flang), and others by Flang only. With the upcoming refactoring of the Fortran runtime into its own subproject (#110217), move the declarations that are used by both into new headers to minimize the amount of code that will need to be shared by Flang-RT and Flang. Details: * `Fortran.h`: Flang-RT only uses some enum definitions out of this file, but not `AsFortran` which is defined in `Fortran.cpp`. Moving the enums into `Fortran-consts.h` allows keeping `Fortran.cpp` within Flang. * `target.h`: Contains some floating-point definitions that are used by the non-GTest unittests in `fp-testing.h`. Flang-RT uses some non-GTest unittests as well. Moving those definitions avoids the dependence on the entire FortranEvaluate library.	2024-12-05 11:29:32 +01:00
Valentin Clement (バレンタインクレメン)	7d1c661381	[flang] Allow to pass an async id to allocate the descriptor (#118713 ) This is a patch in preparation for the support of the stream-ordered memory allocator in CUDA Fortran. This patch adds an asynchronous id to the AllocatableAllocate runtime function and to Descriptor::Allocate so it can be passed down to the registered allocator. It is up to the allocator to use this value or not. A follow-up patch will implement that asynchronous allocator for CUDA Fortran.	2024-12-04 18:24:40 -08:00
vdonaldson	df43af40ec	Vkd1 (#118721 )	2024-12-04 19:16:49 -05:00
vdonaldson	17f99accf2	[flang] build test fix/suppression (#118716 )	2024-12-04 18:47:45 -05:00
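
Commit 8a90b5b317 above notes that `re.I`, passed positionally to `re.sub`, lands in `count` rather than `flags`. A minimal Python sketch of that pitfall follows. The sample text and pattern are made up for illustration and are not taken from the Flang test script.

```python
import re
import warnings

# Hypothetical input: directive sentinels in mixed case.
text = "!$OMP parallel\n!$omp end parallel\n!$OMP do\n"
pattern = r"!\$omp"

# Signature: re.sub(pattern, repl, string, count=0, flags=0).
# The fourth positional argument is `count`, so re.I (integer value 2)
# is read as "replace at most 2 matches" and matching stays case-sensitive.
with warnings.catch_warnings():
    # Newer Python versions warn about passing `count` positionally.
    warnings.simplefilter("ignore", DeprecationWarning)
    wrong = re.sub(pattern, "", text, re.I)

# Passing the flag by keyword gives the intended case-insensitive match.
right = re.sub(pattern, "", text, flags=re.I)

print(repr(wrong))
print(repr(right))
```

In this sketch only the lowercase `!$omp` is stripped in the positional call, while the keyword form strips all three sentinels.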

9399 Commits