llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-23 20:16:06 +00:00

Author	SHA1	Message	Date
Krzysztof Parzyszek	cd26dd5595	[flang][OpenMP] Use OmpDirectiveSpecification in simple directives (#131162 ) The `OmpDirectiveSpecification` contains directive name, the list of arguments, and the list of clauses. It was introduced to store the directive specification in METADIRECTIVE, and could be reused everywhere a directive representation is needed. In the long term this would unify the handling of common directive properties, as well as creating actual constructs from METADIRECTIVE by linking the contained directive specification with any associated user code.	2025-03-19 11:34:40 -05:00
jeanPerier	cd0a2a3f1b	[flang] add QSORT extension intrinsic to the runtime (#132033 ) Add support for legacy Fortran intrinsic QSORT from lib 3f. This is a thin Fortran wrapper over libc qsort.	2025-03-19 16:14:37 +01:00
Valentin Clement (バレンタインクレメン)	20feca47c1	[flang][cuda] Allow ieee_arithmetic on the device (#131930 ) - Allow ieee_arithmetic on the device - Add ignore_tkr(d) to ieee_is_finite	2025-03-19 07:20:06 -07:00
Kiran Chandramohan	96b112fb61	Revert "[flang][openmp] Adds Parser and Semantic Support for Interop Construct, and Init and Use Clauses." (#132005 ) Reverts llvm/llvm-project#120584 Reverting due to CI failure https://lab.llvm.org/buildbot/#/builders/157/builds/22946	2025-03-19 11:13:52 +00:00
swatheesh-mcw	ee8a759bfb	[flang][openmp] Adds Parser and Semantic Support for Interop Construct, and Init and Use Clauses. (#120584 ) Adds Parser and Semantic Support for the below construct and clauses: - Interop Construct - Init Clause - Use Clause Note: The other clauses supported by Interop Construct such as Destroy, Use, Depend and Device are added already.	2025-03-19 10:49:17 +00:00
Tom Eccles	e7c6e3557b	[flang][OpenMP] Fix threadprivate pointer variable in common block (#131888 ) Fixes #112538 The problem was that the host associated symbol for the threadprivate variable doesn't have all of the symbol attributes (e.g. POINTER). This caused the lowering code to generate the wrong type, eventually hitting an assertion.	2025-03-19 10:12:52 +00:00
jeanPerier	b8271ec8b3	[flang] accept character type in fir::changeTypeShape (#131892 ) There is no reason for character element type to be forbidden in this helper. The assert was firing in character pointer assignment in FORALL after #130772 added a usage of this helper.	2025-03-19 10:01:24 +01:00
Slava Zakharin	fd0e20a64b	[flang] Generate fir.pack/unpack_array in Lowering. (#131704 ) Basic generation of array repacking operations in Lowering.	2025-03-18 21:26:33 -07:00
Slava Zakharin	9ed772cecc	[flang] Fixed computation of position of function's arg in AddDebugInfo. (#131672 ) I am working on the `-frepack-arrays` feature (#127147), which produces non-trivial manipulations with arguments of `fir.declare`. In this case, we end up with CFG computation of the `fir.declare` argument, and the AddDebugInfo pass incorrectly mapped two dummy arguments to the same arg index in the debug attributes. This patch makes sure that we assign the arg index only if we can prove that we've traced the block argument to the function's entry block. I believe this problem is not specific to `-frepack-arrays`, e.g. it may appear due to MLIR inlining as well.	2025-03-18 16:46:59 -07:00
Valentin Clement (バレンタインクレメン)	b7ed5c8e06	[flang][cuda] Check for ignore_tkr(d) when resolving generic call (#131923 )	2025-03-18 15:39:04 -07:00
Slava Zakharin	7d7b58bc5d	[flang-rt] Added ShallowCopy API. (#131702 ) This API will be used for copying non-contiguous arrays into contiguous temporaries to support `-frepack-arrays`. The builder factory API will be used in the following commits.	2025-03-18 12:58:25 -07:00
Kelvin Li	6c7c660afe	[flang] Use C-style casts to silence message (NFC) (#131796 )	2025-03-18 13:02:18 -04:00
Slava Zakharin	e0bcf3aa0b	[flang] Allow no type parameters for fir.pack_array. (#131662 ) Arrays with assumed-length types are represented with a box without explicit length parameters. This patch fixes the verification to allow it for `fir.pack_array`.	2025-03-18 07:59:04 -07:00
Akash Banerjee	cbc5c11fec	[MLIR][OpenMP] Add Lowering support for implicitly linking to default declare mappers (#131006 )	2025-03-18 13:17:10 +00:00
Kareem Ergawy	83658ddb1b	[flang][OpenMP] Enable delayed privatization by default for `omp.distribute` (#131574 ) Switches delayed privatization for `omp.distribute` to be on by default: it is now controlled by `-openmp-enable-delayed-privatization` instead of `-openmp-enable-delayed-privatization-staging`. ### GFortran & Fujitsu test suite results: #### gfortran test-suite (this PR): ``` Testing Time: 34.51s Passed: 6569 ``` #### Fujitsu without changes (commit: 0813c5cf5f52): ``` Testing Time: 155.39s Passed : 88325 Failed : 156 Executable Missing: 408 ``` #### Fujitsu with changes (this PR): ``` Testing Time: 158.54s Passed : 88325 Failed : 156 Executable Missing: 408 ```	2025-03-18 14:07:41 +01:00
Kareem Ergawy	1094ffcafb	[flang][fir] Add MLIR op for `do concurrent` (#130893 ) Adds new MLIR ops to model `do concurrent`. In order to make `do concurrent` representation self-contained, a loop is modeled using 2 ops, one wrapper and one that contains the actual body of the loop. For example, a 2D `do concurrent` loop is modeled as follows: ```mlir fir.do_concurrent { %i = fir.alloca i32 %j = fir.alloca i32 fir.do_concurrent.loop (%i_iv, %j_iv) = (%i_lb, %j_lb) to (%i_ub, %j_ub) step (%i_st, %j_st) { %0 = fir.convert %i_iv : (index) -> i32 fir.store %0 to %i : !fir.ref<i32> %1 = fir.convert %j_iv : (index) -> i32 fir.store %1 to %j : !fir.ref<i32> } } ``` The `fir.do_concurrent` wrapper op encapsulates both the actual loop and the allocations required for the iteration variables. The `fir.do_concurrent.loop` op is a multi-dimensional op that contains the loop control and body. See the ops' docs for more info.	2025-03-18 10:53:44 +01:00
Valentin Clement (バレンタインクレメン)	e5ec7bb21b	[flang][cuda] Set correct offsets for multiple variables in dynamic shared memory (#131674 )	2025-03-17 17:13:06 -07:00
Valentin Clement (バレンタインクレメン)	74d4fc0a3e	[flang][cuda][NFC] Use ssa value for offset in shared memory op (#131661 ) Switch from attribute to a value as we need to support dynamic offset when multiple variables are used with dynamic shared memory.	2025-03-17 14:23:34 -07:00
Kiran Chandramohan	93e0df07c2	[Flang][OpenMP] Allow zero trait score (#131473 )	2025-03-17 09:49:08 +00:00
sharang.12492	7eb8b73178	[Flang][OpenMP][taskloop] Adding missing semantic checks in Taskloop (#128431 ) The semantic checks below, required for the taskloop construct by the OpenMP [5.2] specification, were missing. This patch adds the checks, the corresponding error messages, and test cases: OpenMP standard [5.2]: [12.6] Taskloop Construct [Restrictions] Restrictions to the taskloop construct are as follows: • The reduction-modifier must be default. • The conditional lastprivate-modifier must not be specified. Authored-by: shkaushi <sharang.kaushik@amd.com>	2025-03-17 12:35:37 +05:30
Valentin Clement (バレンタインクレメン)	4fde8c341f	[flang][cuda] Lower CUDA shared variable with cuf.shared_memory op (#131399 ) Use `cuf.shared_memory` operation instead of `cuf.alloc` for CUDA shared variable. These variables do not need free operations.	2025-03-16 17:44:56 -07:00
Valentin Clement (バレンタインクレメン)	e86081b6c2	[flang][cuda] Convert cuf.shared_memory operation to LLVM ops (#131396 ) Convert the operation to `llvm.addressof` operation with `llvm.getelementptr` with the appropriate offset.	2025-03-14 19:34:55 -07:00
Valentin Clement (バレンタインクレメン)	4fb20b85fd	[flang][cuda] Compute offset on cuf.shared_memory ops (#131395 ) Add a pass to compute the size of the shared memory (static shared memory) and the offset of each variable to be placed in shared memory. The global representing the shared memory is also created during this pass. In case of dynamic shared memory, the global has a type of `!fir.array<0xi8>` and the size of the memory is set at kernel launch.	2025-03-14 19:34:35 -07:00
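The offset computation described in #131395 can be pictured with a small layout routine. This is only a sketch under stated assumptions: the function name `layout_shared`, the `(name, size, alignment)` tuples, and the round-up-to-alignment rule are hypothetical illustrations, not the pass's actual code or data structures.

```python
def layout_shared(variables):
    """Assign each variable an offset in one contiguous shared-memory block.

    `variables` is a list of (name, size_bytes, align_bytes) tuples.
    Returns (offsets_by_name, total_size_bytes).
    Assumption: every offset is rounded up to the variable's alignment.
    """
    offsets = {}
    offset = 0
    for name, size, align in variables:
        # Round the running offset up to the next multiple of `align`.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
    return offsets, offset


offsets, total = layout_shared([("a", 4, 4), ("b", 8, 8), ("c", 2, 2)])
print(offsets, total)
```

With static shared memory the total is known at compile time. For dynamic shared memory the commit message says the size is instead set at kernel launch, so only the offsets would be meaningful in that case.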
Valentin Clement (バレンタインクレメン)	4818623924	[flang][cuda] Add cuf.shared_memory operation (#131392 ) Introduce `cuf.shared_memory` operation. The operation is used to get the pointer in shared memory for a specific variable. The shared memory is materialized as a global in address space 3 and the different variables are pointing to it at different offset. Follow up patches will add lowering and conversion of this operation.	2025-03-14 15:43:25 -07:00
Valentin Clement (バレンタインクレメン)	a862b6deae	[flang][cuda] Lower shared global to the correct NVVM address space (#131368 ) Global with the CUDA shared data attribute needs to be lowered to llvm globals with the correct address space (3). Address space is set from the `mlir::NVVM::NVVMMemorySpace::kSharedMemorySpace` enum from `mlir/Dialect/LLVMIR/NVVMDialect.h`	2025-03-14 15:28:32 -07:00
Shilei Tian	dccc0a836c	[NFC][AMDGPU] Replace more direct arch comparison with isAMDGCN() (#131379 ) This is an extension of #131357. Hopefully this would be the last one.	2025-03-14 17:02:15 -04:00
Slava Zakharin	00f9c855fb	[flang] Added fir.is_contiguous_box and fir.box_total_elements ops. (#131047 ) These are helper operations to aid with expanding of fir.pack_array.	2025-03-14 08:25:05 -07:00
jeanPerier	3ff3b29dd6	[flang] lower remaining cases of pointer assignments inside forall (#130772 ) Implement handling of `NULL()` RHS, polymorphic pointers, as well as lower bounds or bounds remapping in pointer assignment inside FORALL. These cases eventually do not require updating hlfir.region_assign, lowering can simply prepare the new descriptor for the LHS inside the RHS region. Looking more closely at the polymorphic cases, there is no need to call the runtime, fir.rebox and fir.embox do handle the dynamic type setting correctly. After this patch, the last remaining TODO is the allocatable assignment inside FORALL, which like some cases here, is more likely an accidental feature given FORALL was deprecated in F2003 at the same time that allocatable components were added.	2025-03-14 10:51:46 +01:00
Michael Kruse	bddf24ddbd	[Flang] Add omp_lib dependency to check-flang (#130975 ) With `LLVM_ENABLE_RUNTIMES=openmp`, flang enables the OpenMP regression tests, but `check-flang` was not ensuring that the OpenMP requirements are built first. Fix by adding a `libomp-mod` to `flang-test-depends`. Adding `libomp-mod` to extra_targets is necessary because there is no target from openmp/ that is reachable from the parent bootstrapping-build. `ninja openmp` fails because openmp/ has no `openmp` target. `check-openmp` would also run the OpenMP tests and does not even build `omp_lib.mod`. `runtimes` would build all the runtimes, not just OpenMP. Also fix the misleading CMake configure status messages that suggest the only way to build omp_lib.mod/.h is `LLVM_ENABLE_PROJECTS=openmp`.	2025-03-14 09:24:28 +01:00
Valentin Clement (バレンタインクレメン)	57d87ed7f0	[flang][NFC] Add parenthesis to avoid warning (#131219 ) Remove warning introduced in 369da8421c2f7	2025-03-13 14:28:35 -07:00
Kelvin Li	c2b66ce655	[flang][OpenMP] Silence unused-but-set-variable message (NFC) (#130979 )	2025-03-13 14:09:47 -04:00
Valentin Clement (バレンタインクレメン)	369da8421c	[flang][cuda] Allow assumed-size declaration for SHARED variable (#130833 ) Avoid triggering an assertion for shared variable using the assumed-size syntax. ``` attributes(global) subroutine sharedstar() real, shared :: s(*) ! ok. dynamic shared memory. end subroutine ```	2025-03-13 11:06:17 -07:00
Michael Kruse	d06937aea3	[Flang][NFC] Fix typo (#130960 ) This was mainly a test of the pre-merge CI, but merging it since it fixes an actual typo.	2025-03-13 17:55:07 +01:00
Tom Eccles	01aca42363	[flang] Add support for -f[no-]verbose-asm (#130788 ) This flag provides extra commentary in the assembly output.	2025-03-13 15:22:13 +00:00
Kareem Ergawy	b003face11	[flang][OpenMP] Add `OutlineableOpenMPOpInterface` to `omp.teams` (#131109 ) Given the following input: ```fortran program rep_loopbind implicit none integer :: i real :: priv_val !$omp teams private(priv_val) !$omp distribute do i=1,1000 end do !$omp end teams end program ``` the `AllocaOpConversion` pattern in `FIRToLLVMLowering` would move the private allocations that belong to the `teams` directive (i.e. the allocations needed for the private copies of `priv_val` and the loop's iteration variable) from the `omp.teams` op to the outside scope. This is not correct since these allocations should be eventually emitted inside the outlined region for the `teams` directive. Without this fix, these allocations would be emitted in the parent function (or the parent scope whatever it is).	2025-03-13 16:03:19 +01:00
Michael Klemm	28ffa7f6a4	[flang][OpenMP] Fix missing inode issue (#130798 ) When outlining an offload region, Flang creates a unique name by querying an inode ID. However, when the name of the actual source file does not match the logical file in a `#line` preprocessor directive, code-gen was failing as it could not determine the inode ID. This PR checks for this condition and if the logical file name does not exist, the inode is replaced with a hash value created from the source code itself.	2025-03-13 15:50:37 +01:00
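The fallback described in #130798 (use the inode when the logical file exists, otherwise hash the source text) can be sketched as follows. The function name `unique_file_id`, the choice of SHA-256, and the 64-bit truncation are assumptions made for illustration; the commit message does not say which hash Flang uses.

```python
import hashlib
import os


def unique_file_id(logical_path: str, source_text: str) -> int:
    """Return an identifier for building a unique outlined-region name.

    Prefer the inode of `logical_path`. If that file cannot be stat'ed
    (for example, a `#line` directive names a file that does not exist),
    fall back to a hash of the source text.
    Assumption: SHA-256 truncated to 64 bits stands in for the real hash.
    """
    try:
        return os.stat(logical_path).st_ino
    except OSError:
        digest = hashlib.sha256(source_text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")


# The logical file does not exist, so the ID comes from the source text.
print(unique_file_id("/nonexistent/dir/kernel.f90", "program p\nend program\n"))
```

The hash fallback is deterministic for identical source text, so repeated compilations of the same input would still produce the same name.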
Kajetan Puchalski	0c5eb4d68b	[flang] Use precompiled parsing headers (#130600 ) Most of the high memory usage and compilation time in the frontend units is due to including large parsing headers. This commit moves out several of the largest parsing headers into a new precompiled header linked to flangFrontend. The new compilation metrics for FrontendActions.cpp are as follows: User time (seconds): 38.40 System time (seconds): 2.00 Maximum resident set size (kbytes): 2710964 (2.58 GB) (vs 3.78 GB) ParserActions.cpp: User time (seconds): 69.37 System time (seconds): 1.81 Maximum resident set size (kbytes): 2599456 (2.47 GB) (vs 4 GB) Alongside the new precompiled header compilation unit User time (seconds): 41.61 System time (seconds): 2.72 Maximum resident set size (kbytes): 3107644 (2.96 GB) --------- Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-03-13 10:28:31 +00:00
Mats Petersson	d0188ebcc2	[flang][OpenMP] Add symbols omp_in, omp_out and omp_priv in DECLARE REDUCTION (#129908 ) This patch allows better parsing of the reduction and initializer components, including supporting derived types in both those places. There is more work needed here, but this is a definite improvement in what can be handled through parser and semantics. Note that declare reduction is still not supported in lowering, so any attempt to compile DECLARE REDUCTION code will end with a TODO aka "Not yet implemented" abort in the compiler. Note that this version of the code does not cover declaring multiple reductions using the same name with different types. This will be fixed in a future patch. [This was also the case before this change]. One existing test modified to actually compile (as it didn't in the original form).	2025-03-13 09:39:45 +00:00
Krzysztof Parzyszek	f4fc2d731c	[flang][OpenMP] Map ByRef if size/alignment exceed that of a pointer (#130832 ) Improve the check for whether a type can be passed by copy. Currently, passing by copy is done via the OMP_MAP_LITERAL mapping, which can only transfer as much data as can be contained in a pointer representation.	2025-03-12 19:41:11 -05:00
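The rule in the title of #130832 amounts to a two-part check. The sketch below is only an illustration: the function name `can_pass_by_copy` and the default 8-byte pointer size and alignment are assumptions, and the real check may consider further properties of the type.

```python
def can_pass_by_copy(size_bytes: int, align_bytes: int,
                     ptr_size: int = 8, ptr_align: int = 8) -> bool:
    """Return True when a value fits in a pointer-sized literal slot.

    Per the commit title, a value is mapped by reference once either its
    size or its alignment exceeds that of a pointer.
    Assumption: 8-byte pointers with 8-byte alignment (a typical 64-bit target).
    """
    return size_bytes <= ptr_size and align_bytes <= ptr_align


print(can_pass_by_copy(4, 4))    # 4-byte value, e.g. a default integer
print(can_pass_by_copy(16, 16))  # 16-byte value, wider than a pointer
```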
Slava Zakharin	c542991703	[flang-rt] Fixed HAVE_LDBL_MANT_DIG_113 detection. (#131010 ) I thought I guessed a fix in #130836, but I was wrong. We actually had the same code in `flang/cmake/modules/FlangCommon.cmake`. The check does not pass in flang-rt bootstrap build, because `-nostdinc++` is added for all `runtimes` checks. I decided to make the check with the C header, though, I am still unsure whether it is reliable with a clang that has not been installed (it is taken from the build structure during flang-rt configure step). I verified that this PR enables REAL(16) math entries on aarch64.	2025-03-12 16:50:01 -07:00
Nikita Popov	1a626e63b5	[flang] Fix deprecation warning Adjust for #130940.	2025-03-12 18:00:06 +01:00
Nikita Popov	f137c3d592	[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940 ) This avoids doing a Triple -> std::string -> Triple round trip in lots of places, now that the Module stores a Triple.	2025-03-12 17:35:09 +01:00
Iñaki Amatria Barral	bdbe8fa1f3	[flang] Align `-x` language modes with `gfortran` (#130268 ) This PR addresses some of the issues described in https://github.com/llvm/llvm-project/issues/127617. Key changes: - Stop assuming fixed-form for `-x f95` unless the input is a `.i` file. This change ensures compatibility with `-save-temps` workflows while preventing unintended fixed-form assumptions. - Ensure `-x f95-cpp-input` enables `-cpp` by default, aligning Flang's behavior with `gfortran`.	2025-03-12 16:45:33 +01:00
Asher Mancinelli	982527eef0	[flang] Use saturated intrinsics for floating point to integer conversions (#130686 ) The saturated floating point conversion intrinsics match the semantics in the standard more closely than the fptosi/fptoui instructions. Case 2 of 16.9.100 is > INT (A [, KIND]) > If A is of type real, there are two cases: if |A| < 1, INT (A) has the value 0; if |A| ≥ 1, INT (A) is the integer whose magnitude is the largest integer that does not exceed the magnitude of A and whose sign is the same as the sign of A. Currently, converting a floating point value into an integer type too small to hold the constant will be converted to poison in opt, leaving us with garbage: ``` > cat t.f90 program main real(kind=16) :: f integer(kind=4) :: i f=huge(f) i=f print *, i end program main # current upstream > for i in `seq 10`; do; ./a.out; done -862156992 -1497393344 -739096768 -1649494208 1761228608 -1959270592 -746244288 -1629194432 -231217344 382322496 ``` With the saturated fptoui/fptosi intrinsics, we get the appropriate values ``` # mine > flang -O2 ./t.f90 && ./a.out 2147483647 > perl -e 'printf "%d\n", (2 ** 31) - 1' 2147483647 ``` One notable difference: NaNs being converted to ints will become zero, unlike current flang (and some other compilers). Newer versions of GCC have this behavior.	2025-03-12 08:14:46 -07:00
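The saturating behavior described in #130686 (truncate toward zero, clamp out-of-range values, map NaN to zero) can be modeled in a few lines. This is a sketch of the semantics only, not of the LLVM intrinsics themselves: the function name `saturating_int` is hypothetical, and a Python `float` is a 64-bit double rather than the `real(kind=16)` used in the commit's test program.

```python
import math


def saturating_int(value: float, bits: int = 32) -> int:
    """Convert a float to a signed `bits`-bit integer with saturation.

    In-range values are truncated toward zero, out-of-range values
    (including infinities) clamp to the nearest bound, and NaN becomes 0.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(value):
        return 0
    if value >= hi:
        return hi
    if value <= lo:
        return lo
    return math.trunc(value)


print(saturating_int(1.7976931348623157e308))  # largest finite double
print(saturating_int(float("nan")))
print(saturating_int(-2.9))
```

The first call corresponds to the `i=f` assignment with `f=huge(f)` in the commit message: the value is far outside the 32-bit range, so it clamps to the upper bound instead of yielding an arbitrary result.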
Michael Kruse	cbeae3e117	[Flang] Fix libquadmath in non-LLVM_ENABLE_RUNTIMES build. The LLVM_ENABLE_RUNTIMES build introduced a new configure-time header quadmath_wrapper.h. Also create the header in non-LLVM_ENABLE_RUNTIMES builds.	2025-03-12 14:55:09 +01:00
Sergio Afonso	cf68c9378b	[Flang][OpenMP] Move declare mapper sym creation outside loop, NFC (#130794 ) This patch simplifies the definition of `ClauseProcessor::processMapObjects` by hoisting the creation of the MLIR symbol associated to an existing `omp.declare_mapper` operation outside of the loop processing all mapped objects. That change removes some inter-iteration dependencies that made the implementation more difficult to follow.	2025-03-12 11:54:29 +00:00
Tom Eccles	c851ee38ad	[flang][OpenMP] catch namelist access through equivalence (#130804 ) The standard prohibits privatising namelist variables. We also decided in #110671 to prohibit reductions of namelist variables. This commit prevents this rule from being circumvented through the use of equivalence statements. Fixes #122824	2025-03-12 11:45:15 +00:00
jeanPerier	15e335f04f	[flang] also set llvm ABI argument attributes on direct calls (#130736 ) So far, flang was not setting argument attributes on direct calls assuming that putting them on the function operation was enough. It was clarified in `38565da525` that they must be set on both call and functions, even for direct calls. Crashes have been observed because of the lack of the attribute when compiling `abs(x)` at `O2` and above on X86-64 for complex(16).	2025-03-12 09:55:05 +01:00
Slava Zakharin	74eba972ca	[flang] Definitions of fir.pack/unpack_array operations. (#130698 ) As defined in #127147.	2025-03-11 14:15:29 -07:00
jeanPerier	356bf3fa2d	Reland " [flang] Rely on global initialization for simpler derived types" (#130290 ) Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization. Note: this relands #114002 with the fix for the LLVM timeout regressions that have been seen. The fix is to use the added fir.copy to avoid aggregate load/store. Co-authored-by: NimishMishra <42909663+NimishMishra@users.noreply.github.com>	2025-03-11 15:19:43 +01:00

9983 Commits