llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 06:56:05 +00:00

Author	SHA1	Message	Date
David Truby	44aa476aa1	[flang] AArch64 ABI for BIND(C) VALUE parameters (#118305 ) This patch adds handling for derived type VALUE parameters in BIND(C) functions for AArch64.	2024-12-18 07:43:22 +00:00
David Truby	4c6e13f644	[flang] Add cmake error if building with clang-cl and MSVC 17.12 (#120114 )	2024-12-18 06:15:29 +00:00
Kareem Ergawy	dc936f3c19	Revert "[flang][OpenMP] Implicitly map allocatable record fields (#117867 )" (#120360 )	2024-12-18 06:52:24 +01:00
Kareem Ergawy	db09014a07	[flang][OpenMP] Implicitly map allocatable record fields (#117867 ) This is a starting PR to implicitly map allocatable record fields. This PR contains the following changes: 1. Re-purposes some of the utils used in `Lower/OpenMP.cpp` so that these utils work on the `mlir::Value` level rather than the `semantics::Symbol` level. This takes one step towards enabling MLIR passes to more easily do some lowering themselves (e.g. creating `omp.map.bounds` ops for implicitly captured data like this PR does). 2. Adds support for implicitly capturing and mapping allocatable fields in record types. There is still quite some distance to cover to have full support for this. I added a number of todos to guide further development. Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>	2024-12-18 05:37:58 +01:00
Valentin Clement (バレンタインクレメン)	81333cfc52	[flang][cuda] Relax host array check for cuda constant (#120333 ) Array with CONSTANT attribute declared in module spec part are device arrays and should not trigger the host array check.	2024-12-17 17:04:32 -08:00
Valentin Clement (バレンタインクレメン)	15c61a208f	[flang][cuda] Do not consider SHARED array as host array (#120306 ) Update the current `FindHostArray` to not return shared array as host array.	2024-12-17 13:42:14 -08:00
Valentin Clement (バレンタインクレメン)	97b7bace67	[flang][cuda] Allow host array with PARAMETER attribute in device context (#120298 ) Host arrays are normally not allowed in device context unless they have a `PARAMETER` attribute. This patch updates the check so no error is emitted.	2024-12-17 13:41:24 -08:00
Valentin Clement (バレンタインクレメン)	bbeafe4b94	[flang][cuda] Apply implicit data attribute to local arrays (#120293 ) Add the implicit data attribute to local arrays that don't have one. This simplifies the host array detection in semantics.	2024-12-17 12:56:39 -08:00
Peter Klausler	a957cedea9	[flang] Handle substring in data statement constant (#120130 ) The case of a constant substring wasn't handled in the parser for data statement constants. Fixes https://github.com/llvm/llvm-project/issues/119005.	2024-12-17 12:10:50 -08:00
Peter Klausler	b2c363e261	[flang] Fix generic resolution with actual/dummy procedure incompatibility (#120105 ) We generally allow any legal procedure pointer target as an actual argument for association with a dummy procedure, since many actual procedures are underspecified EXTERNALs. But for proper generic resolution, it is necessary to disallow incompatible functions with explicit result types. Fixes https://github.com/llvm/llvm-project/issues/119151.	2024-12-17 12:10:29 -08:00
Slava Zakharin	9d33874936	[flang] Support -f[no-]realloc-lhs. (#120165 ) -frealloc-lhs is the default. If -fno-realloc-lhs is specified, then an allocatable on the left side of an intrinsic assignment is not implicitly (re)allocated to conform with the right hand side. Fortran runtime will issue an error if there is a mismatch in shape/type/allocation-status.	2024-12-17 09:06:05 -08:00
paperchalice	b07e7b76c5	[cmake] Drop `AddFileDependencies` and `CMakeParseArguments` (#120002 ) These modules are deprecated and have trivial implementations in modern cmake.	2024-12-17 19:24:32 +08:00
Valentin Clement (バレンタインクレメン)	5e1f87e849	[flang][cuda] Correctly allocate memory for descriptor load (#120164 ) CodeGen will allocate memory for a new descriptor on descriptor loads. CUDA Fortran local descriptors are allocated in managed memory by the runtime. The newly allocated storage for a CUDA descriptor must also be allocated through the runtime.	2024-12-16 19:12:05 -08:00
Valentin Clement (バレンタインクレメン)	67ae944bfa	[flang][cuda] Check for use of host array in device context (#119756 ) Now that variables have implicit attribute, we can check for illegal use of module host variable in device context.	2024-12-16 13:34:22 -08:00
Slava Zakharin	0b442bc516	[flang][NFC] Added debug output to opt-bufferization pass. (#119936 )	2024-12-16 12:58:22 -08:00
Valentin Clement (バレンタインクレメン)	f8656204d7	[flang][cuda] Do not lower device target in program as global (#120126 ) As it was done in #102512, do not create a global for arrays declared in a program unit with a CUDA data attribute.	2024-12-16 12:34:01 -08:00
Slava Zakharin	2402bccc80	[flang] Turn SimplifyHLFIRIntrinsics into a greedy rewriter. (#119946 ) This is almost an NFC, except that folding changed ordering of some operations.	2024-12-16 08:00:29 -08:00
Slava Zakharin	f239922cdc	[flang] Enable hlfir.sum inlining by default. (#119937 ) There is already a LIT test for hlfir.sum inlining that uses the engineering option. I would like to keep the option for a short period of time to be able to revert in case of performance regressions that I was not able to see.	2024-12-16 07:59:15 -08:00
Valentin Clement (バレンタインクレメン)	65e00315c9	[flang][cuda] Adapt TargetRewrite to support gpu.launch_func (#119933 )	2024-12-16 06:53:46 -08:00
macurtis-amd	3c3094b60d	[flang] Ensure directive sentinels are in cols 1-5 in pp output (#119406 ) Preprocessor output is intended to be valid fixed form.	2024-12-16 08:25:37 -06:00
Valentin Clement (バレンタインクレメン)	1345ee4232	[flang][cuda] Do not apply implicit data attribute on dummy arg with VALUE (#119927 ) Dummy arguments with the VALUE attribute do not need the implicit data attribute.	2024-12-13 14:41:49 -08:00
Valentin Clement (バレンタインクレメン)	3273d0bb14	[flang][cuda] Apply implicit data attribute only in device context (#119919 ) Fix the condition so the implicit device data attribute is not applied when the routine has `attribute(host)`	2024-12-13 13:43:33 -08:00
Slava Zakharin	a00946fc94	[flang] Simplify hlfir.sum total reductions. (#119482 ) I am trying to switch to keeping the reduction value in a temporary scalar location so that I can use hlfir::genLoopNest easily. This also allows using omp.loop_nest with worksharing for OpenMP.	2024-12-13 13:08:28 -08:00
Slava Zakharin	af5d3afff5	[flang] Improve disjoint/identical slices recognition in opt-bufferization. (#119780 ) The changes are needed to be able to optimize 'x(9,:)=SUM(x(1:8,:),DIM=1)' without a temporary array. This pattern exists in exchange2. The patch also fixes an existing problem in Flang with this test: ``` program main integer :: a(10) = (/1,2,3,4,5,6,7,8,9,10/) integer :: expected(10) = (/1,10,9,8,7,6,5,4,3,2/) print *, 'INPUT: ', a print *, 'EXPECTED: ', expected call test(a, 10, 2, 10, 9) print *, 'RESULT: ', a contains subroutine test(a, size, x, y, z) integer :: x, y, z, size integer :: a(:) a(x:y:1) = a(z:x-1:-1) + 1 end subroutine test end program main ```	2024-12-13 13:08:02 -08:00
Krzysztof Parzyszek	c57a8f5b3f	[flang][OpenMP] Remove redundant `Fortran::` from namespaces, NFC Apply clang-format after the changes.	2024-12-13 11:00:05 -06:00
Mats Petersson	75e6d0eb4d	[flang][OpenMP]Add support for OpenMP ERROR directive (#119582 ) Lowering leads to a TODO, with a test to confirm. Also testing unparse. Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>	2024-12-13 14:05:48 +00:00
Ivan R. Ivanov	7c9404c279	[flang][OpenMP] Add frontend support for ompx_bare clause (#111106 )	2024-12-13 21:44:43 +09:00
Valentin Clement (バレンタインクレメン)	37978c466b	[flang][cuda] Remove unused variable	2024-12-12 15:04:16 -08:00
Valentin Clement (バレンタインクレメン)	ea04148c27	[flang][cuda] Extend implicit global handling to any type descriptor (#119769 ) Relax the check to also handle other type descriptor globals.	2024-12-12 14:52:49 -08:00
Valentin Clement (バレンタインクレメン)	7141837957	[flang][cuda] Implicitly add DEVICE attribute in device/global functions (#119743 ) Variables in global and device function/subroutine that have no CUDA Fortran data attribute are implicitly DEVICE.	2024-12-12 12:47:34 -08:00
Slava Zakharin	139e69b7bc	[flang] Simple folding for hlfir.shape_of. (#119649 ) This folding makes sure there are no hlfir.shape_of users of hlfir.elemental - this may enable more InlineElementals matches, because it is looking for exactly two uses of an hlfir.elemental.	2024-12-12 10:38:34 -08:00
Krzysztof Parzyszek	03cbe42627	[flang][OpenMP] Rework LINEAR clause (#119278 ) The OmpLinearClause class was a variant of two classes, one for when the linear modifier was present, and one for when it was absent. These two classes did not follow the conventions for parse tree nodes, (i.e. tuple/wrapper/union formats), which necessitated specialization of the parse tree visitor. The new form of OmpLinearClause is the standard tuple with a list of modifiers and an object list. The specialization of parse tree visitor for it has been removed. Parsing and unparsing of the new form bears additional complexity due to syntactical differences between OpenMP 5.2 and prior versions: in OpenMP 5.2 the argument list is post-modified, while in the prior versions, the step modifier was a post-modifier while the linear modifier had an unusual syntax of `modifier(list)`. With this change the LINEAR clause is no different from any other clauses in terms of its structure and use of modifiers. Modifier validation and all other checks work the same as with other clauses.	2024-12-12 12:19:35 -06:00
Krzysztof Parzyszek	58f9c4fc00	[flang][OpenMP] Semantic checks for IN_REDUCTION and TASK_REDUCTION (#118841 ) Update parsing of these two clauses and add semantic checks for them. Simplify some code in IsReductionAllowedForType and CheckReductionOperator.	2024-12-12 12:19:12 -06:00
Kareem Ergawy	f9734b9df1	[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in `omp.target` ops (#116576 ) This PR adds support to translate the `private` clause from MLIR to LLVMIR when used on allocatables in the context of an `omp.target` op. This replaces https://github.com/llvm/llvm-project/pull/113208. Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the latest commit is relevant to the PR.	2024-12-12 14:39:58 +01:00
Tom Eccles	32403f79f4	[flang][unittests] fix test broken when run as root (#119604 ) It is convenient to run tests as root inside of a docker container. The test (and the library function it is testing) are already unsupported on Windows so it is safe to use UNIX-isms here.	2024-12-12 09:41:44 +00:00
Valentin Clement (バレンタインクレメン)	956d0dd624	[flang][cuda] Support builtin global in device global pass (#119626 )	2024-12-11 17:09:56 -08:00
Valentin Clement (バレンタインクレメン)	151901c762	[flang][rt][device] Use enum-set.h as Fortran.h (#119611 )	2024-12-11 15:38:38 -08:00
Slava Zakharin	5eef9ba784	[flang] Inline hlfir.cshift as hlfir.elemental. (#119480 )	2024-12-11 15:00:07 -08:00
Leandro Lupori	db9856b516	[flang][OpenMP][NFC] Turn symTable into a reference (#119435 ) Convert `DataSharingProcessor::symTable` from pointer to reference. This avoids accidental null pointer dereferences and makes it possible to use `symTable` when delayed privatization is disabled.	2024-12-11 16:26:19 -03:00
Mats Petersson	00e1cc4c9d	[flang][OpenMP]Add support for fail clause (#118683 ) Support the atomic compare option of a fail(memory-order) clause. Additional tests introduced to check that parsing and semantics checks for the new clause are handled. Lowering for atomic compare is still unsupported and will end in a TODO (aka "Not yet implemented"). A test for this case with the fail clause is also present.	2024-12-11 16:29:02 +00:00
Paul Osmialowski	03019c687f	[clang][driver] When -fveclib=ArmPL flag is in use, always link against libamath (#116432 ) Using `-fveclib=ArmPL` without `-lamath` is likely to result in link-time errors.	2024-12-11 14:01:29 +00:00
khaki3	609899f443	[flang][cuda] Avoid stack corruption when setting kernel launch parameters (#119469 ) In order to get the pointer to a structure member, `getelementptr` typically requires two indices: one to indicate the structure itself, and another to specify the member's position. We are missing the former in `GPULaunchKernelConversion`, so generated code may cause stack corruption. This PR corrects the indices of a structure used as a kernel launch temp.	2024-12-10 16:08:22 -08:00
Valentin Clement (バレンタインクレメン)	850c932f05	[flang][cuda] Walk through cuf kernel for implicit globals (#119455 ) Globals used in cuf kernel need to be flagged as well.	2024-12-10 14:01:53 -08:00
Valentin Clement (バレンタインクレメン)	8c19c24a78	[flang][cuda][NFC] Add missing template declaration (#119443 )	2024-12-10 13:10:23 -08:00
Valentin Clement (バレンタインクレメン)	dc5236e6b1	[flang][cuda] Update target rewrite to work on gpu.func (#119283 ) Update the pass so it can perform the signature rewrite on gpu.func.	2024-12-10 12:36:49 -08:00
khaki3	e9866d5d14	[flang][cuda] Fix GPULaunchKernelConversion to generate correct kernel launch parameters (#119431 ) For the call to _FortranACUFLaunchKernel, we store the pointer to a member of a temporary structure in a parameter array. However, when we obtain an element pointer from the parameter array, its address is calculated based on the type of the structure. This PR properly treats the parameter array as an array of pointers. Example: ```mlir %30 = llvm.load %29 : !llvm.ptr -> i32 %31 = llvm.mlir.constant(1 : i32) : i32 %32 = llvm.alloca %31 x !llvm.struct<(i64, i64, i32, ptr)> : (i32) -> !llvm.ptr %33 = llvm.mlir.constant(4 : i32) : i32 %34 = llvm.alloca %33 x !llvm.ptr : (i32) -> !llvm.ptr %35 = llvm.mlir.constant(0 : i32) : i32 %36 = llvm.getelementptr %32[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.struct<(i64, i64, i32, ptr)> llvm.store %8, %36 : i64, !llvm.ptr %37 = llvm.getelementptr %34[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.struct<(i64, i64, i32, ptr)> llvm.store %36, %37 : !llvm.ptr, !llvm.ptr ... llvm.call @_FortranACUFLaunchKernel(%47, %8, %8, %8, %2, %8, %8, %7, %34, %48) : (!llvm.ptr, i64, i64, i64, i64, i64, i64, i32, !llvm.ptr, !llvm.ptr) -> () ``` In this example, `%37 = llvm.getelementptr %34[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.struct<(i64, i64, i32, ptr)>` will be `%37 = llvm.getelementptr %34[%35] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.ptr`.	2024-12-10 11:32:32 -08:00
Valentin Clement (バレンタインクレメン)	0469bb91aa	[flang][cuda] Fix lowering when step is a variable (#119421 ) Add missing conversion.	2024-12-10 09:48:15 -08:00
Slava Zakharin	c7634c1b61	[flang] Disabled hlfir.sum inlining by default. (#119287 ) To temporarily address exchange2 perf regression reported in #118556 I disabled the inlining by default, and put it under engineering option `-flang-simplify-hlfir-sum`.	2024-12-10 09:18:50 -08:00
jeanPerier	28a0ad09c1	[flang][hlfir] fix issue 118922 (#119219 ) hlfir.elemental codegen optimizes out the final as_expr copy for temps local to its body, but sometimes clean-up may have been emitted for this temp, and the code did not handle that. This caused #118922 and #113843. Only elide the copy if the as_expr is the last op.	2024-12-10 15:00:32 +01:00
Paul Osmialowski	f8a1f42dd5	[test][flang][driver] Fix test that assumes libomp default (#119368 ) This patch supplements the fix introduced by PR #119319.	2024-12-10 13:52:55 +00:00

9432 Commits