llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 04:46:07 +00:00

Author	SHA1	Message	Date
Asher Mancinelli	b2711e1526	[flang][nfc] Support volatile on ref, box, and class types (#134386 ) Part one of merging #132486. Add support for representing volatility in the type system for reference, box, and class types. Don't do anything with volatile just yet, only support and test their representation and utility functions. The naming convention is a little goofy - `fir::isa_volatile_type` and `fir::updateTypeWithVolatility` use different capitalization, but I put them near similar functions and tried to match the surrounding conventions and [the docs](https://github.com/llvm/llvm-project/blob/main/flang/docs/C%2B%2Bstyle.md#naming) best I could.	2025-04-07 06:51:02 -07:00
Tom Eccles	b9c876dd9a	[flang][test] fix sporadically failing test (#134608 ) The test is checking output from MLIR debug prints. MLIR passes can be executed in parallel, for example a pass on func.func might schedule different func.func operations in different threads. This led to intermittent test failures where debug output from different threads became mixed up. Fix by disabling mlir multithreading for this test.	2025-04-07 12:10:43 +01:00
Zhen Wang	8f0d8d28cc	Delete duplicated hlfir.declare op of induction variables of do concurrent when inside cuf kernel directive. (#134467 ) Delete duplicated creation of hlfir.declare op of do concurrent induction variables when inside cuf kernel directive. Obtain the correct hlfir.declare op generated from bindSymbol, and add it to ivValues.	2025-04-06 19:31:09 -07:00
Slava Zakharin	7001993880	[flang] Temporary include variant.h in enum-class.h. (#134460 ) I am having problems building Fortran runtime for CUDA after #134164. I need more time to investigate it, but in the meantime including variant.h (or any header that eventually includes a libcudacxx header) resolves the issue.	2025-04-04 16:39:49 -07:00
Valentin Clement (バレンタインクレメン)	18ff8df958	[flang][cuda] Register managed variables with double descriptor (#134444 ) Allocatable or pointer module variables with the CUDA managed attribute are defined with a double descriptor. One on the host and one on the device. Only the data pointed to by the descriptor will be allocated in managed memory. Allow the registration of any allocatable or pointer module variables like device or constant.	2025-04-04 14:38:01 -07:00
Valentin Clement (バレンタインクレメン)	24dfcc0c02	[flang][cuda] Use the nvvm.vote.sync op for all and any (#134433 ) NVVM operations are now available for all and any as well. Use the op and clean up the generation function to handle all the 3 vote sync kinds.	2025-04-04 13:45:03 -07:00
Eugene Epshteyn	61af05fe82	[flang] Add runtime and lowering implementation for extended intrinsic PUTENV (#134412 ) Implement extended intrinsic PUTENV, both function and subroutine forms. Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and semantic unit tests.	2025-04-04 16:26:08 -04:00
Valentin Clement (バレンタインクレメン)	cd2f85a24b	[mlir][NVVM] Add ops for vote all and any sync (#134309 ) Add operations for `nvvm.vote.all.sync` and `nvvm.vote.any.sync` intrinsics similar to `nvvm.vote.ballot.sync`.	2025-04-04 11:06:10 -07:00
Peter Klausler	5942f0269e	[flang] Preserve compiler directives in -E output (#133959 ) No longer require -fopenmp or -fopenacc with -E, unless specific version number options are also required for predefined macros. This means that most source can be preprocessed with -E and then later compiled with -fopenmp, -fopenacc, or neither. This means that OpenMP conditional compilation lines (!$) are also passed through to -E output. The tricky part of this patch was dealing with the fact that those conditional lines can also contain regular Fortran line continuation, and that now has to be deferred when !$ lines are interspersed.	2025-04-04 09:49:57 -07:00
Peter Klausler	1bef59c9db	[flang][preprocessor] Further macro replacement of continued identifiers (#134302 ) The preprocessor can perform macro replacement within identifiers when they are split up with Fortran line continuation, but is failing to do macro replacement on a continued identifier when none of its parts are replaced.	2025-04-04 08:44:22 -07:00
Peter Klausler	507ce46b6f	[flang][preprocessor] Directive continuation must skip empty macros (#134149 ) When a compiler directive continuation line starts with keyword macro names that have empty expansions, skip them.	2025-04-04 08:43:56 -07:00
Peter Klausler	efd7caac2e	[flang] IEEE_SUPPORT_FLAG(..., LOCAL) in specification expression (#134270 ) The optional second argument to IEEE_SUPPORT_FLAG (and related functions from the intrinsic IEEE_ARITHMETIC module) is needed only for its type, not its value. Restrictions on local objects as arguments to function references in specification expressions shouldn't apply to it. Define a new attribute for dummy data object characteristics to distinguish such arguments, set it for the appropriate intrinsic function references, and test it during specification expression validation.	2025-04-04 08:43:25 -07:00
Peter Klausler	262b3f7615	[flang] Remove runtime dependence on C++ support for types (#134164 ) Fortran::runtime::Descriptor::BytesFor() only works for Fortran intrinsic types for which a C++ type counterpart exists, so it crashes on some types that are legitimate Fortran types like REAL(2). Move some logic from Evaluate into a new header in flang/Common, then use it to avoid this needless dependence on C++.	2025-04-04 08:42:38 -07:00
Peter Klausler	3674a5f18e	[flang] Permit unused USE association of subprogram name (#134009 ) A function or subroutine can allow an object of the same name to appear in its scope, so long as the name is not used. This is similar to the case of a name being imported from multiple distinct modules, and implemented by the same representation. It's not clear whether this is conforming behavior or a common extension.	2025-04-04 08:41:32 -07:00
Peter Klausler	c8bde44cfc	[flang] Implement FSEEK and FTELL (#133003 ) Add function and subroutine forms of FSEEK and FTELL as intrinsic procedures. Accept common aliases from legacy compilers as well. A separate patch to llvm-test-suite will enable tests for these procedures once this patch has merged. Depends on https://github.com/llvm/llvm-project/pull/132423; CI builds will likely fail until that patch is merged and this PR is rebased.	2025-04-04 08:40:51 -07:00
Asher Mancinelli	85fd83ed49	[flang][nfc] Use llvm memmove intrinsic over regular call (#134294 ) Follow up to #134170. We should be using the LLVM intrinsics instead of plain fir.calls when we can. Existing code creates a declaration for the llvm intrinsic and a regular fir.call, which makes it hard for consumers of the IR to find all the intrinsic calls.	2025-04-04 06:13:30 -07:00
Sergio Afonso	a17d49687a	[Flang][Driver][AMDGPU] Fix -mcode-object-version (#134230 ) This patch updates flang to follow clang's behavior when processing the `-mcode-object-version` option. It is now used to populate an LLVM module flag called `amdhsa_code_object_version` expected by the backend and also updates the driver to add the `--amdhsa-code-object-version` option to the frontend invocation for device compilation of AMDGPU targets.	2025-04-04 11:54:49 +01:00
Kareem Ergawy	6333f8457c	[flang][OpenMP] Move reductions from `loop` to `teams` when `loop` is mapped to `distribute` (#132920 ) Follow-up to #132003, in particular, see https://github.com/llvm/llvm-project/pull/132003#issuecomment-2739701936. This PR extends reduction support for `loop` directives. Consider the following scenario: ```fortran subroutine bar implicit none integer :: x, i !$omp teams loop reduction(+: x) DO i = 1, 5 call foo() END DO end subroutine ``` Note the following: * According to the spec, the `reduction` clause will be attached to `loop` during earlier stages in the compiler. * Additionally, `loop` cannot be mapped to `distribute parallel for` due to the call to a foreign function inside the loop's body. * Therefore, `loop` must be mapped to `distribute`. * However, `distribute` does not have `reduction` clauses. * As a result, we have to move the `reduction`s from the `loop` to its parent `teams` directive, which is what is done by this PR.	2025-04-04 06:20:51 +02:00
Slava Zakharin	65b85bf8bc	[flang] Fixed driver link LIT test for PPC targets. (#134320 ) After #131041, the F128 libraries are not linked for PPC targets even when the driver is built with FLANG_RUNTIME_F128_MATH_LIB.	2025-04-03 16:58:11 -07:00
Andre Kuhlenschmidt	b11eece1bb	[flang][intrinsics] Implement the time intrinsic (#133823 ) This PR implements the nonstandard intrinsic time. In addition to running the unit tests, I also double checked that the example code works by manually compiling and running it.	2025-04-03 15:33:40 -07:00
Andre Kuhlenschmidt	85fdab33b0	[flang][intrinsic] add nonstandard intrinsic unlink (#134162 ) This PR adds the intrinsic `unlink` to flang. ## Test plan - Added two codegen unit tests and ensured flang-check continues to pass. - Manually compiled and ran the example from the documentation.	2025-04-03 14:33:53 -07:00
Valentin Clement (バレンタインクレメン)	fb6f60ddc5	[flang][cuda][NFC] Use NVVM VoteBallotOp (#134307 ) `llvm.nvvm.vote.ballot.sync` has its own operation so use it in lowering.	2025-04-03 14:19:31 -07:00
Valentin Clement (バレンタインクレメン)	de40f6101d	[flang][cuda][NFC] Use NVVM op for match all (#134303 )	2025-04-03 14:19:21 -07:00
Valentin Clement (バレンタインクレメン)	7288f1bc32	[flang][cuda] Use nvvm operation for match any (#134283 ) The string used for intrinsic was not the correct one "llvm.nvvm.match.any.sync.i32p". There was an extra `p` at the end. Use the NVVM operation instead so we don't duplicate it.	2025-04-03 12:08:30 -07:00
Slava Zakharin	b8b752db2b	[flang][NFC] Create required Source dir for flang-doc. (#134000 )	2025-04-03 10:43:49 -07:00
Slava Zakharin	3f6ae3f0a8	[flang] Added driver options for arrays repacking. (#134002 ) Added options: * -f[no-]repack-arrays * -f[no-]stack-repack-arrays * -frepack-arrays-contiguity=whole/innermost	2025-04-03 10:43:28 -07:00
Valentin Clement (バレンタインクレメン)	3e59ff27e5	[flang][cuda] Fix pred type for vote functions (#134166 )	2025-04-03 10:33:09 -07:00
Asher Mancinelli	d7d91500b6	[flang][nfc] Initial changes needed to use llvm intrinsics instead of regular calls (#134170 ) Flang uses `fir.call <llvm intrinsic>` in a few places. This means consumers of the IR need to strcmp every fir.call if they want to find a particular LLVM intrinsic. Emit LLVM memcpy intrinsics instead.	2025-04-03 08:37:40 -07:00
Daniel Chen	2080334574	[flang-rt] Pass the whole path of libflang_rt.runtime.a to linker on AIX and LoP (#131041 ) This PR is to improve the driver code to build `flang-rt` path by re-using the logic and code of `compiler-rt`. 1. Moved `addFortranRuntimeLibraryPath` and `addFortranRuntimeLibs` to `ToolChain.h` and made them virtual so that they can be overridden if customization is needed. The current implementation of those two procedures is moved to `ToolChain.cpp` as the base implementation to default to. 2. Both AIX and PPCLinux now override `addFortranRuntimeLibs`. The overriding function of `addFortranRuntimeLibs` for both AIX and PPCLinux calls `getCompilerRTArgString` => `getCompilerRT` => `buildCompilerRTBasename` to get the path to `flang-rt`. This code handles `LLVM_ENABLE_PER_TARGET_RUNTIME_DIR` setting. As shown in `PPCLinux.cpp`, `FT_static` is the default. If not found, it will search and build for `FT_shared`. To differentiate `flang-rt` from `clang-rt`, a boolean flag `IsFortran` is passed to the chain of functions in order to reach `buildCompilerRTBasename`.	2025-04-03 11:21:19 -04:00
Sergio Afonso	18dd299fb1	[Flang][MLIR][OpenMP] Host-evaluation of omp.loop bounds (#133908 ) This patch updates Flang lowering and kernel flags identification in MLIR so that loop bounds on `target teams loop` constructs are evaluated on the host, making the trip count available to the corresponding `__tgt_target_kernel` call emitted for the target region. This is necessary in order to properly execute these constructs as `target teams distribute parallel do`. Co-authored-by: Kareem Ergawy <kareem.ergawy@amd.com>	2025-04-03 15:06:19 +01:00
Valentin Clement (バレンタインクレメン)	db21ae7803	[flang][cuda] Support any_sync and ballot_sync (#134135 )	2025-04-02 14:26:09 -07:00
Krzysztof Parzyszek	564e04b703	[flang][OpenMP] Use function symbol on DECLARE TARGET (#134107 ) Consider: ``` function foo() !$omp declare target(foo) ! This `foo` was a function-result symbol ... end ``` When resolving symbols, for this case use the symbol corresponding to the function instead of the symbol corresponding to the function result. Currently, this will result in an error: ``` error: A variable that appears in a DECLARE TARGET directive must be declared in the scope of a module or have the SAVE attribute, either explicitly or implicitly ```	2025-04-02 15:16:33 -05:00
Juan Manuel Martinez Caamaño	beae0e9f1a	[AMDGPU] Use a target feature to enable __builtin_amdgcn_global_load_lds on gfx9/10 (#133055 ) This patch introduces the `vmem-to-lds-load-insts` target feature, which can be used to enable builtins `__builtin_amdgcn_global_load_lds` and `__builtin_amdgcn_raw_ptr_buffer_load_lds` on platforms which have this feature. This feature is only available on gfx9/10. A limitation of using a common target feature for both builtins is that we could have made `__builtin_amdgcn_raw_ptr_buffer_load_lds` available on gfx6,7,8.	2025-04-02 20:00:09 +02:00
Kazu Hirata	aa33c09561	[flang] Fix a warning This patch fixes: flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp:184:18: error: unused variable 'loc' [-Werror,-Wunused-variable]	2025-04-02 10:14:50 -07:00
vdonaldson	8a0f694381	[flang] Legacy ASSIGN statement target processing (#133737 ) Like other target statements, the statement associated with the label in a legacy ASSIGN statement could be inside a construct. Constructs containing such a target must therefore be marked as unstructured, fairly similar to how targets are processed in `markBranchTarget`.	2025-04-02 09:52:13 -04:00
Kareem Ergawy	de6c9096ba	[flang][OpenMP] Handle "loop-local values" in `do concurrent` nests (#127635 ) Extends `do concurrent` mapping to handle "loop-local values". A loop-local value is one that is used exclusively inside the loop but allocated outside of it. This usually corresponds to temporary values that are used inside the loop body, for example to initialize other variables. After collecting these values, the pass localizes them to the loop nest by moving their allocations. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635 (this PR)	2025-04-02 15:43:19 +02:00
مهدي شينون (Mehdi Chinoune)	666df54ea6	[flang] Fold double bessel functions on Windows. (#130253 ) There are no functions for `float`. see: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/bessel-functions-j0-j1-jn-y0-y1-yn	2025-04-02 14:43:09 +01:00
Jean-Didier PAILLEUX	c309abd925	[flang] Implement !DIR$ NOVECTOR and !DIR$ NOUNROLL[_AND_JAM] (#133885 ) Hi, This patch implements support for the following directives : - `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO LOOP. - `!DIR$ NOUNROLL` to disable unrolling on a DO LOOP. - `!DIR$ NOVECTOR` to disable vectorization on a DO LOOP.	2025-04-02 14:30:01 +02:00
Kareem Ergawy	ef56b53712	[flang][OpenMP] Extend `do concurrent` mapping to multi-range loops (#127634 ) Adds support for converting multi-range loops to OpenMP (on the host only for now). The changes here "prepare" a loop nest for collapsing by sinking iteration variables to the innermost `fir.do_loop` op in the nest. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 (this PR) - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 12:43:04 +02:00
Tom Eccles	9b2fd1a6ec	[flang][OpenMP] Bump default OpenMP version to 3.1 (#133745 ) Precise OpenMP standards support information is being documented in #132707 Flang now has good support for OpenMP Version 3.1 and earlier.	2025-04-02 10:43:48 +01:00
Kareem Ergawy	3f8bfc9f7f	[flang][OpenMP] Map simple `do concurrent` loops to OpenMP host constructs (#127633 ) Upstreams one more part of the ROCm `do concurrent` to OpenMP mapping pass. This PR adds support for converting simple loops to the equivalent OpenMP constructs on the host: `omp parallel do`. Towards that end, we have to collect more information about loop nests, for which we add new utils in the `looputils` namespace. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 (this PR) - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 11:26:58 +02:00
Kareem Ergawy	41d718b1cf	[flang][OpenMP] Upstream `do concurrent` loop-nest detection. (#127595 ) Upstreams the next part of the `do concurrent` to OpenMP mapping pass (from AMD's ROCm implementation). See https://github.com/llvm/llvm-project/pull/126026 for more context. This PR adds loop nest detection logic. This enables us to discover multi-range `do concurrent` loops and then map them as "collapsed" loop nests to OpenMP. This is a follow up for https://github.com/llvm/llvm-project/pull/126026, only the latest commit is relevant. This is a replacement for https://github.com/llvm/llvm-project/pull/127478 using a `/user/<username>/<branchname>` branch. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 (this PR) - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 10:12:52 +02:00
Kareem Ergawy	5d364481e3	[flang][OpenMP] Upstream first part of `do concurrent` mapping (#126026 ) This PR starts the effort to upstream AMD's internal implementation of `do concurrent` to OpenMP mapping. This replaces #77285 since we extended this WIP quite a bit on our fork over the past year. An important part of this PR is a document that describes the current status downstream, the upstreaming status, and next steps to make this pass much more useful. In addition to this document, this PR also contains the skeleton of the pass (no useful transformations are done yet) and some testing for the added command line options. This looks like a huge PR but a lot of the added stuff is documentation. It is also worth noting that the downstream pass has been validated on https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achieved performance speed-ups that match pure OpenMP; for GPU mapping we are still working on extending our support for implicit memory mapping and locality specifiers. PR stack: - https://github.com/llvm/llvm-project/pull/126026 (this PR) - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 09:24:38 +02:00
Valentin Clement (バレンタインクレメン)	ae8dd63681	[flang][cuda] Add interface and lowering for all_sync (#134001 )	2025-04-01 17:59:11 -07:00
Andre Kuhlenschmidt	b6edd25f17	[flang][intrinsics] NFC: make comment consistent (#133972 ) Just makes this named argument comment consistent with all the others in the file.	2025-04-01 14:30:10 -07:00
Valentin Clement (バレンタインクレメン)	bb179c483a	[flang][rt] Allow ReportFatalUserError to be build on device (#133979 )	2025-04-01 13:50:42 -07:00
Valentin Clement (バレンタインクレメン)	01889de8e9	[flang][device] Enable Stop functions on device build (#133803 ) Update `StopStatement` and `StopStatementText` to be built for the device.	2025-04-01 10:06:45 -07:00
Slava Zakharin	58551faaf1	[flang] Inline fir.is_contiguous_box in some cases. (#133812 ) Added inlining for `rank == 1` and `innermost` cases.	2025-04-01 08:41:11 -07:00
Jean-Didier PAILLEUX	513a91a5f1	[flang/flang-rt] Implement PERROR intrinsic form GNU Extension (#132406 ) Add the implementation of the `PERROR(STRING)` intrinsic from the GNU extension, which prints to stderr a newline-terminated error message corresponding to the last system error, prefixed by `STRING`. (https://gcc.gnu.org/onlinedocs/gfortran/PERROR.html)	2025-04-01 15:47:54 +02:00
Jeremy Morse	1ebc308bba	[DebugInfo][RemoveDIs] Remove debug-intrinsic printing cmdline options (#131855 ) During the transition from debug intrinsics to debug records, we used several different command line options to customise handling: the printing of debug records to bitcode and textual could be independent of how the debug-info was represented inside a module, whether the autoupgrader ran could be customised. This was all valuable during development, but now that totally removing debug intrinsics is coming up, this patch removes those options in favour of a single flag (experimental-debuginfo-iterators), which enables autoupgrade, in-memory debug records, and debug record printing to bitcode and textual IR. We need to do this ahead of removing the experimental-debuginfo-iterators flag, to reduce the amount of test-juggling that happens at that time. There are quite a number of weird test behaviours related to this -- some of which I simply delete in this commit. Things like print-non-instruction-debug-info.ll, the test suite now checks for debug records in all tests, and we don't want to check we can print as intrinsics. Or the update_test_checks tests -- these are duplicated with write-experimental-debuginfo=false to ensure file writing for intrinsics is correct, but that's something we're imminently going to delete. A short survey of curious test changes: * free-intrinsics.ll: we don't need to test that debug-info is a zero cost intrinsic, because we won't be using intrinsics in the future. * undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory mode while we sorted something out; it works now either way. * salvage-cast-debug-info.ll: was testing intrinsics-in-memory get salvaged, isn't necessary now * localize-constexpr-debuginfo.ll: was producing "dead metadata" intrinsics for optimised-out variable values, dbg-records takes the (correct) representation of poison/undef as an operand. Looks like we didn't update this in the past to avoid spurious test differences. * Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing that debug-info affected codegen, and we deferred updating the tests until now. This is just one of those silent gnochange issues that get fixed by RemoveDIs. Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc, that checks we can autoupgrade debug intrinsics that are in bitcode into the new debug records.	2025-04-01 14:27:11 +01:00
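
Several entries above (#134170, #134294, #134283) replace generic calls that name an LLVM intrinsic by string with dedicated operations. The motivation given there is that consumers otherwise have to string-compare every call, and a misspelled name such as the extra `p` in #134283 goes unnoticed. The following is a minimal sketch of that trade-off on a toy IR. The `Op`, `OpKind`, `countMemcpyByName`, and `countMemcpyByKind` names are hypothetical and are not the FIR or MLIR APIs.

```cpp
#include <string>
#include <vector>

// Hypothetical toy IR for illustration only; not the real FIR/MLIR classes.
enum class OpKind { Call, MemcpyIntrinsic };

struct Op {
  OpKind kind;
  std::string callee; // only meaningful when kind == OpKind::Call
};

// A consumer of "generic call + intrinsic name" IR has to inspect the callee
// string of every call. A misspelled name is silently skipped.
inline int countMemcpyByName(const std::vector<Op> &ops) {
  const std::string prefix = "llvm.memcpy";
  int n = 0;
  for (const Op &op : ops)
    if (op.kind == OpKind::Call && op.callee.rfind(prefix, 0) == 0)
      ++n; // rfind(prefix, 0) == 0 means "callee starts with prefix"
  return n;
}

// With a dedicated operation kind, the consumer filters on the kind and no
// string comparison is involved.
inline int countMemcpyByKind(const std::vector<Op> &ops) {
  int n = 0;
  for (const Op &op : ops)
    if (op.kind == OpKind::MemcpyIntrinsic)
      ++n;
  return n;
}
```

In this toy model, a call whose callee is spelled `llvm.memcpi...` is simply not counted by the name-based query, whereas a dedicated op kind cannot be misspelled at the point of use.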

10089 Commits