llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-17 01:46:36 +00:00

Author	SHA1	Message	Date
Valentin Clement (バレンタインクレメン)	18ff8df958	[flang][cuda] Register managed variables with double descriptor (#134444 ) Allocatable or pointer module variables with the CUDA managed attribute are defined with a double descriptor. One on the host and one on the device. Only the data pointed to by the descriptor will be allocated in managed memory. Allow the registration of any allocatable or pointer module variables like device or constant.	2025-04-04 14:38:01 -07:00
Valentin Clement (バレンタインクレメン)	24dfcc0c02	[flang][cuda] Use the nvvm.vote.sync op for all and any (#134433 ) NVVM operations are now available for all and any as well. Use the op and clean up the generation function to handle all the 3 vote sync kinds.	2025-04-04 13:45:03 -07:00
Eugene Epshteyn	61af05fe82	[flang] Add runtime and lowering implementation for extended intrinsic PUTENV (#134412 ) Implement extended intrinsic PUTENV, both function and subroutine forms. Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and semantic unit tests.	2025-04-04 16:26:08 -04:00
Valentin Clement (バレンタインクレメン)	cd2f85a24b	[mlir][NVVM] Add ops for vote all and any sync (#134309 ) Add operations for `nvvm.vote.all.sync` and `nvvm.vote.any.sync` intrinsics similar to `nvvm.vote.ballot.sync`.	2025-04-04 11:06:10 -07:00
Peter Klausler	5942f0269e	[flang] Preserve compiler directives in -E output (#133959 ) No longer require -fopenmp or -fopenacc with -E, unless specific version number options are also required for predefined macros. This means that most source can be preprocessed with -E and then later compiled with -fopenmp, -fopenacc, or neither. This means that OpenMP conditional compilation lines (!$) are also passed through to -E output. The tricky part of this patch was dealing with the fact that those conditional lines can also contain regular Fortran line continuation, and that now has to be deferred when !$ lines are interspersed.	2025-04-04 09:49:57 -07:00
Peter Klausler	1bef59c9db	[flang][preprocessor] Further macro replacement of continued identifiers (#134302 ) The preprocessor can perform macro replacement within identifiers when they are split up with Fortran line continuation, but is failing to do macro replacement on a continued identifier when none of its parts are replaced.	2025-04-04 08:44:22 -07:00
Peter Klausler	507ce46b6f	[flang][preprocessor] Directive continuation must skip empty macros (#134149 ) When a compiler directive continuation line starts with keyword macro names that have empty expansions, skip them.	2025-04-04 08:43:56 -07:00
Peter Klausler	efd7caac2e	[flang] IEEE_SUPPORT_FLAG(..., LOCAL) in specification expression (#134270 ) The optional second argument to IEEE_SUPPORT_FLAG (and related functions from the intrinsic IEEE_ARITHMETIC module) is needed only for its type, not its value. Restrictions on local objects as arguments to function references in specification expressions shouldn't apply to it. Define a new attribute for dummy data object characteristics to distinguish such arguments, set it for the appropriate intrinsic function references, and test it during specification expression validation.	2025-04-04 08:43:25 -07:00
Peter Klausler	262b3f7615	[flang] Remove runtime dependence on C++ support for types (#134164 ) Fortran::runtime::Descriptor::BytesFor() only works for Fortran intrinsic types for which a C++ type counterpart exists, so it crashes on some types that are legitimate Fortran types like REAL(2). Move some logic from Evaluate into a new header in flang/Common, then use it to avoid this needless dependence on C++.	2025-04-04 08:42:38 -07:00
Peter Klausler	3674a5f18e	[flang] Permit unused USE association of subprogram name (#134009 ) A function or subroutine can allow an object of the same name to appear in its scope, so long as the name is not used. This is similar to the case of a name being imported from multiple distinct modules, and implemented by the same representation. It's not clear whether this is conforming behavior or a common extension.	2025-04-04 08:41:32 -07:00
Peter Klausler	c8bde44cfc	[flang] Implement FSEEK and FTELL (#133003 ) Add function and subroutine forms of FSEEK and FTELL as intrinsic procedures. Accept common aliases from legacy compilers as well. A separate patch to llvm-test-suite will enable tests for these procedures once this patch has merged. Depends on https://github.com/llvm/llvm-project/pull/132423; CI builds will likely fail until that patch is merged and this PR is rebased.	2025-04-04 08:40:51 -07:00
Asher Mancinelli	85fd83ed49	[flang][nfc] Use llvm memmove intrinsic over regular call (#134294 ) Follow up to #134170. We should be using the LLVM intrinsics instead of plain fir.calls when we can. Existing code creates a declaration for the llvm intrinsic and a regular fir.call, which makes it hard for consumers of the IR to find all the intrinsic calls.	2025-04-04 06:13:30 -07:00
Sergio Afonso	a17d49687a	[Flang][Driver][AMDGPU] Fix -mcode-object-version (#134230 ) This patch updates flang to follow clang's behavior when processing the `-mcode-object-version` option. It is now used to populate an LLVM module flag called `amdhsa_code_object_version` expected by the backend and also updates the driver to add the `--amdhsa-code-object-version` option to the frontend invocation for device compilation of AMDGPU targets.	2025-04-04 11:54:49 +01:00
Kareem Ergawy	6333f8457c	[flang][OpenMP] Move reductions from `loop` to `teams` when `loop` is mapped to `distribute` (#132920 ) Follow-up to #132003, in particular, see https://github.com/llvm/llvm-project/pull/132003#issuecomment-2739701936. This PR extends reduction support for `loop` directives. Consider the following scenario: ```fortran subroutine bar implicit none integer :: x, i !$omp teams loop reduction(+: x) DO i = 1, 5 call foo() END DO end subroutine ``` Note the following: * According to the spec, the `reduction` clause will be attached to `loop` during earlier stages in the compiler. * Additionally, `loop` cannot be mapped to `distribute parallel for` due to the call to a foreign function inside the loop's body. * Therefore, `loop` must be mapped to `distribute`. * However, `distribute` does not have `reduction` clauses. * As a result, we have to move the `reduction`s from the `loop` to its parent `teams` directive, which is what is done by this PR.	2025-04-04 06:20:51 +02:00
Andre Kuhlenschmidt	b11eece1bb	[flang][intrinsics] Implement the time intrinsic (#133823 ) This PR implements the nonstandard intrinsic time. In addition to running the unit tests, I also double checked that the example code works by manually compiling and running it.	2025-04-03 15:33:40 -07:00
Andre Kuhlenschmidt	85fdab33b0	[flang][intrinsic] add nonstandard intrinsic unlink (#134162 ) This PR adds the intrinsic `unlink` to flang. ## Test plan - Added two codegen unit tests and ensured flang-check continues to pass. - Manually compiled and ran the example from the documentation.	2025-04-03 14:33:53 -07:00
Valentin Clement (バレンタインクレメン)	fb6f60ddc5	[flang][cuda][NFC] Use NVVM VoteBallotOp (#134307 ) `llvm.nvvm.vote.ballot.sync` has its own operation so use it in lowering.	2025-04-03 14:19:31 -07:00
Valentin Clement (バレンタインクレメン)	de40f6101d	[flang][cuda][NFC] Use NVVM op for match all (#134303 )	2025-04-03 14:19:21 -07:00
Valentin Clement (バレンタインクレメン)	7288f1bc32	[flang][cuda] Use nvvm operation for match any (#134283 ) The string used for intrinsic was not the correct one "llvm.nvvm.match.any.sync.i32p". There was an extra `p` at the end. Use the NVVM operation instead so we don't duplicate it.	2025-04-03 12:08:30 -07:00
Slava Zakharin	3f6ae3f0a8	[flang] Added driver options for arrays repacking. (#134002 ) Added options: * -f[no-]repack-arrays * -f[no-]stack-repack-arrays * -frepack-arrays-contiguity=whole/innermost	2025-04-03 10:43:28 -07:00
Valentin Clement (バレンタインクレメン)	3e59ff27e5	[flang][cuda] Fix pred type for vote functions (#134166 )	2025-04-03 10:33:09 -07:00
Asher Mancinelli	d7d91500b6	[flang][nfc] Initial changes needed to use llvm intrinsics instead of regular calls (#134170 ) Flang uses `fir.call <llvm intrinsic>` in a few places. This means consumers of the IR need to strcmp every fir.call if they want to find a particular LLVM intrinsic. Emit LLVM memcpy intrinsics instead.	2025-04-03 08:37:40 -07:00
Sergio Afonso	18dd299fb1	[Flang][MLIR][OpenMP] Host-evaluation of omp.loop bounds (#133908 ) This patch updates Flang lowering and kernel flags identification in MLIR so that loop bounds on `target teams loop` constructs are evaluated on the host, making the trip count available to the corresponding `__tgt_target_kernel` call emitted for the target region. This is necessary in order to properly execute these constructs as `target teams distribute parallel do`. Co-authored-by: Kareem Ergawy <kareem.ergawy@amd.com>	2025-04-03 15:06:19 +01:00
Valentin Clement (バレンタインクレメン)	db21ae7803	[flang][cuda] Support any_sync and ballot_sync (#134135 )	2025-04-02 14:26:09 -07:00
Krzysztof Parzyszek	564e04b703	[flang][OpenMP] Use function symbol on DECLARE TARGET (#134107 ) Consider: ``` function foo() !$omp declare target(foo) ! This `foo` was a function-result symbol ... end ``` When resolving symbols, for this case use the symbol corresponding to the function instead of the symbol corresponding to the function result. Currently, this will result in an error: ``` error: A variable that appears in a DECLARE TARGET directive must be declared in the scope of a module or have the SAVE attribute, either explicitly or implicitly ```	2025-04-02 15:16:33 -05:00
Kazu Hirata	aa33c09561	[flang] Fix a warning This patch fixes: flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp:184:18: error: unused variable 'loc' [-Werror,-Wunused-variable]	2025-04-02 10:14:50 -07:00
vdonaldson	8a0f694381	[flang] Legacy ASSIGN statement target processing (#133737 ) Like other target statements, the statement associated with the label in a legacy ASSIGN statement could be inside a construct. Constructs containing such a target must therefore be marked as unstructured, fairly similar to how targets are processed in `markBranchTarget`.	2025-04-02 09:52:13 -04:00
Kareem Ergawy	de6c9096ba	[flang][OpenMP] Handle "loop-local values" in `do concurrent` nests (#127635 ) Extends `do concurrent` mapping to handle "loop-local values". A loop-local value is one that is used exclusively inside the loop but allocated outside of it. This usually corresponds to temporary values that are used inside the loop body for initializing other variables, for example. After collecting these values, the pass localizes them to the loop nest by moving their allocations. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635 (this PR)	2025-04-02 15:43:19 +02:00
مهدي شينون (Mehdi Chinoune)	666df54ea6	[flang] Fold double bessel functions on Windows. (#130253 ) There are no functions for `float`. see: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/bessel-functions-j0-j1-jn-y0-y1-yn	2025-04-02 14:43:09 +01:00
Jean-Didier PAILLEUX	c309abd925	[flang] Implement !DIR$ NOVECTOR and !DIR$ NOUNROLL[_AND_JAM] (#133885 ) This patch implements support for the following directives: - `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO LOOP. - `!DIR$ NOUNROLL` to disable unrolling on a DO LOOP. - `!DIR$ NOVECTOR` to disable vectorization on a DO LOOP.	2025-04-02 14:30:01 +02:00
Kareem Ergawy	ef56b53712	[flang][OpenMP] Extend `do concurrent` mapping to multi-range loops (#127634 ) Adds support for converting multi-range loops to OpenMP (on the host only for now). The changes here "prepare" a loop nest for collapsing by sinking iteration variables to the innermost `fir.do_loop` op in the nest. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 (this PR) - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 12:43:04 +02:00
Tom Eccles	9b2fd1a6ec	[flang][OpenMP] Bump default OpenMP version to 3.1 (#133745 ) Precise OpenMP standards support information is being documented in #132707 Flang now has good support for OpenMP Version 3.1 and earlier.	2025-04-02 10:43:48 +01:00
Kareem Ergawy	3f8bfc9f7f	[flang][OpenMP] Map simple `do concurrent` loops to OpenMP host constructs (#127633 ) Upstreams one more part of the ROCm `do concurrent` to OpenMP mapping pass. This PR adds support for converting simple loops to the equivalent OpenMP constructs on the host: `omp parallel do`. Towards that end, we have to collect more information about loop nests, for which we add new utils in the `looputils` namespace. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 (this PR) - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 11:26:58 +02:00
Kareem Ergawy	41d718b1cf	[flang][OpenMP] Upstream `do concurrent` loop-nest detection. (#127595 ) Upstreams the next part of do concurrent to OpenMP mapping pass (from AMD's ROCm implementation). See https://github.com/llvm/llvm-project/pull/126026 for more context. This PR adds loop nest detection logic. This enables us to discover multi-range do concurrent loops and then map them as "collapsed" loop nests to OpenMP. This is a follow up for https://github.com/llvm/llvm-project/pull/126026, only the latest commit is relevant. This is a replacement for https://github.com/llvm/llvm-project/pull/127478 using a `/user/<username>/<branchname>` branch. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 (this PR) - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 10:12:52 +02:00
Kareem Ergawy	5d364481e3	[flang][OpenMP] Upstream first part of `do concurrent` mapping (#126026 ) This PR starts the effort to upstream AMD's internal implementation of `do concurrent` to OpenMP mapping. This replaces #77285 since we extended this WIP quite a bit on our fork over the past year. An important part of this PR is a document that describes the current status downstream, the upstreaming status, and next steps to make this pass much more useful. In addition to this document, this PR also contains the skeleton of the pass (no useful transformations are done yet) and some testing for the added command line options. This looks like a huge PR but a lot of the added stuff is documentation. It is also worth noting that the downstream pass has been validated on https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achieved performance speed-ups that match pure OpenMP; for GPU mapping we are still working on extending our support for implicit memory mapping and locality specifiers. PR stack: - https://github.com/llvm/llvm-project/pull/126026 (this PR) - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 09:24:38 +02:00
Valentin Clement (バレンタインクレメン)	ae8dd63681	[flang][cuda] Add interface and lowering for all_sync (#134001 )	2025-04-01 17:59:11 -07:00
Andre Kuhlenschmidt	b6edd25f17	[flang][intrinsics] NFC: make comment consistent (#133972 ) Just makes this named argument comment consistent with all the others in the file.	2025-04-01 14:30:10 -07:00
Slava Zakharin	58551faaf1	[flang] Inline fir.is_contiguous_box in some cases. (#133812 ) Added inlining for `rank == 1` and `innermost` cases.	2025-04-01 08:41:11 -07:00
Jean-Didier PAILLEUX	513a91a5f1	[flang/flang-rt] Implement PERROR intrinsic from GNU Extension (#132406 ) Add the implementation of the `PERROR(STRING)` intrinsic from the GNU Extension, which prints to stderr a newline-terminated error message corresponding to the last system error, prefixed by `STRING`. (https://gcc.gnu.org/onlinedocs/gfortran/PERROR.html)	2025-04-01 15:47:54 +02:00
Tom Eccles	e17d864f55	[flang][OpenMP][Lower] lower array subscripts for task depend (#132994 ) The OpenMP standard says that all dependencies in the same set of inter-dependent tasks must be non-overlapping. This simplification means that the OpenMP runtime only needs to keep track of the base addresses of dependency variables. This can be seen in kmp_taskdeps.cpp, which stores task dependency information in a hash table, using the base address as a key. This patch generates a rebox operation to slice boxed arrays, but only the box data address is used for the task dependency. The extra box is optimized away by LLVM at O3. Vector subscripts are TODO (I will address in my next patch). This also fixes a bug for ordinary subscripts when the symbol was mapped to a box: Fixes #132647	2025-04-01 10:26:14 +01:00
Jean-Didier PAILLEUX	bae3577002	[flang] Define ERF, ERFC and ERFC_SCALED intrinsics with Q and D prefix (#125217 ) `ERF`, `ERFC` and `ERFC_SCALED` intrinsics prefixed by `Q` and `D` are missing. Codes such as `CP2K`(https://github.com/cp2k/cp2k) and `TurboRVB`(https://github.com/sissaschool/turborvb) use these intrinsics as defined in the GNU standard and here: https://www.ibm.com/docs/fr/xl-fortran-aix/16.1.0?topic=reference-intrinsic-procedures These intrinsics are based on the existing intrinsics but apply a restriction on the type kind. - `DERF`, `DERFC` and `DERFC_SCALED` are for double precision only. - `QERF`, `QERFC` and `QERFC_SCALED` are for quad precision only.	2025-04-01 08:07:26 +02:00
Thirumalai Shaktivel	091dcb8fc2	[Flang] Make a private copy for the common block variables in copyin clause (#111359 ) Fixes: https://github.com/llvm/llvm-project/issues/82949	2025-04-01 11:35:44 +05:30
Slava Zakharin	5f268d04f9	[flang] Code generation for fir.pack/unpack_array. (#132080 ) The code generation relies on `ShallowCopyDirect` runtime to copy data between the original and the temporary arrays (both directions). The allocations are done by the compiler generated code. The heap allocations could have been passed to `ShallowCopy` runtime, but I decided to expose the allocations so that the temporary descriptor passed to `ShallowCopyDirect` has `nocapture` - maybe this will be better for LLVM optimizations.	2025-03-31 11:42:17 -07:00
Slava Zakharin	0ac8cb1b3d	[flang] Recognize fir.pack_array in LoopVersioning. (#133191 ) This change enables LoopVersioning when `fir.pack_array` is met in the def-use chain. It fixes a couple of huge performance regressions caused by enabling `-frepack-arrays`.	2025-03-31 11:41:43 -07:00
Thirumalai Shaktivel	374a5bea52	[Flang][OpenMP] Add PointerAssociateScalar to Cray Pointer used in the DSA (#133232 ) Issue: Cray Pointer is not associated with Cray Pointee, leading to a segmentation fault. Fix: GetUltimate retrieves the base symbol in the current scope, which gets past all the references and returns the original symbol --------- Co-authored-by: Michael Klemm <michael.klemm@amd.com>	2025-03-29 15:39:12 +01:00
swatheesh-mcw	fe30cf18ab	Revert "Revert "[flang][openmp] Adds Parser and Semantic Support for Interop Construct, and Init and Use Clauses."" (#132343 ) Reverts llvm/llvm-project#132005	2025-03-28 15:21:52 +00:00
Nick Sarnie	48b7530273	[clang][flang][Triple][llvm] Add isOffload function to LangOpts and isGPU function to Triple (#126956 ) I'm adding support for SPIR-V, so let's consolidate these checks. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2025-03-28 14:19:20 +00:00
Joseph Huber	772173f548	[Clang][AMDGPU] Remove special handling for COV4 libraries (#132870 ) Summary: When we were first porting to COV5, this led to some ABI issues due to a change in how we looked up the work group size. Bitcode libraries relied on the builtins to emit code, but this was changed between versions. This prevented the bitcode libraries, like OpenMP or libc, from being used for both COV4 and COV5. The solution was to have this 'none' functionality which effectively emitted code that branched off of a global to resolve to either version. This isn't a great solution because it forced every TU to have this variable in it. The patch in https://github.com/llvm/llvm-project/pull/131033 removed support for COV4 from OpenMP, which was the only consumer of this functionality. Other users like HIP and OpenCL did not use this because they linked the ROCm Device Library directly which has its own handling (The name was borrowed from it after all). So, now that we don't need to worry about backward compatibility with COV4, we can remove this special handling. Users can still emit COV4 code, this simply removes the special handling used to make the OpenMP device runtime bitcode version agnostic.	2025-03-28 07:35:16 -05:00
Bruno Cardoso Lopes	7c3ecffe9b	[MLIR][LLVMIR] Add support for the full form of global_{ctor,dtor} (#133176 ) Currently only ctor/dtor list and their priorities are supported. This PR adds support for the missing data field. A few implementation notes: - The assembly printer has a fixed form because previous `attr_dict` will sort the dict by key name, making global_dtor and global_ctor differ in the order of printed arguments. - LLVM's `ptr null` is being converted to `#llvm.zero` otherwise we'd have to create a region to use the default operation conversion from `ptr null`, which is silly given that the field only supports null or a symbol.	2025-03-27 14:11:05 -07:00
Andre Kuhlenschmidt	077940621d	[flang][openacc] Make OpenACC block construct parse errors less verbose. (#131042 ) This PR reduces the verbosity of parser errors for OpenACC block constructs that do not parse correctly because they are missing their trailing end block directive by: - Removing the redundant error messages created by parsing 3 different styles of directive tokens. - Providing a general mechanism of configuring the max number of contexts printed for every syntax error. - Not printing less specific contexts that are at the same location. Prior to the changes: ``` $ flang -fc1 -fopenacc -fsyntax-only flang/test/Parser/acc-data-statement.f90 2>&1 | tee acc-data-statement.prior.log | wc -l 262 ``` [acc-data-statement.prior.log](https://github.com/user-attachments/files/19298165/acc-data-statement.prior.log) After the changes: ``` $ flang -fc1 -fopenacc -fsyntax-only flang/test/Parser/acc-data-statement.f90 2>&1 | tee acc-data-statement.prior.log | wc -l 73 ``` [acc-data-statement.post.log](https://github.com/user-attachments/files/19298181/acc-data-statement.post.log)	2025-03-26 12:36:04 -07:00
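
Note on the vote entries above: several of the `[flang][cuda]` and `[mlir][NVVM]` commits concern the three warp vote kinds (all, any, ballot). As a rough aid for readers unfamiliar with them, here is a minimal plain-Python model of their semantics as commonly documented for CUDA's `__all_sync`, `__any_sync`, and `__ballot_sync`. It is only a sketch: the names `vote_all`, `vote_any`, `vote_ballot`, the `WARP_SIZE` constant, and the list-of-booleans warp representation are illustrative assumptions, not flang, MLIR, or NVVM APIs.

```python
WARP_SIZE = 32  # assumed warp width; hypothetical constant for this model


def _participating(mask, preds):
    """Yield (lane, predicate) for each lane whose bit is set in mask."""
    if len(preds) != WARP_SIZE:
        raise ValueError("expected one predicate per lane")
    for lane, pred in enumerate(preds):
        if (mask >> lane) & 1:
            yield lane, bool(pred)


def vote_all(mask, preds):
    """True if the predicate holds on every participating lane."""
    return all(p for _, p in _participating(mask, preds))


def vote_any(mask, preds):
    """True if the predicate holds on at least one participating lane."""
    return any(p for _, p in _participating(mask, preds))


def vote_ballot(mask, preds):
    """Bit N of the result is set if lane N participates and its predicate holds."""
    result = 0
    for lane, p in _participating(mask, preds):
        if p:
            result |= 1 << lane
    return result


if __name__ == "__main__":
    # Predicate is true on even lanes only.
    preds = [lane % 2 == 0 for lane in range(WARP_SIZE)]
    full_mask = 0xFFFFFFFF
    print(vote_all(full_mask, preds))
    print(vote_any(full_mask, preds))
    print(hex(vote_ballot(full_mask, preds)))
```

The model treats the mask as the set of participating lanes; real hardware additionally requires those lanes to actually execute the instruction, which a sequential sketch like this cannot capture.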

7511 Commits