llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-19 00:26:52 +00:00

Author	SHA1	Message	Date
Valentin Clement (バレンタインクレメン)	49f8ccd1eb	[flang][cuda] Pass stream information to kernel launch functions (#135246 )	2025-04-10 13:50:50 -07:00
Valentin Clement (バレンタインクレメン)	ca53463137	[flang][cuda] Propagate stream information to gpu.launch_func op (#135227 ) Use the information from `cuf.kernel_launch` to `gpu.launch_func`	2025-04-10 11:58:18 -07:00
Asher Mancinelli	8f23d4296c	Reland "[flang][nfc] Support volatility in Fir ops" (#135039 ) #134858 had an extraneous include which caused the shared library builds to break.	2025-04-09 12:45:55 -07:00
David Spickett	fb73086dd2	Revert "[flang][nfc] Support volatility in Fir ops" (#135034 ) Reverts llvm/llvm-project#134858 Fails to build when shared libraries are enabled: https://lab.llvm.org/buildbot/#/builders/80/builds/12361 ``` : && /usr/local/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion -Wcovered-switch-default -Wno-nested-anon-types -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete -Wl,-rpath-link,/home/tcwg-buildbot/worker/flang-aarch64-sharedlibs/build/./lib -Wl,--gc-sections -shared -Wl,-soname,libFIRDialect.so.21.0git -o lib/libFIRDialect.so.21.0git tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIRAttr.cpp.o tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIRDialect.cpp.o tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIRType.cpp.o tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FirAliasTagOpInterface.cpp.o tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FortranVariableInterface.cpp.o tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/Inliner.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/tcwg-buildbot/worker/flang-aarch64-sharedlibs/build/lib:" lib/libCUFAttrs.so.21.0git lib/libFIRDialectSupport.so.21.0git lib/libLLVMAsmPrinter.so.21.0git lib/libMLIRBuiltinToLLVMIRTranslation.so.21.0git lib/libMLIROpenMPToLLVM.so.21.0git 
lib/libMLIRLLVMToLLVMIRTranslation.so.21.0git lib/libMLIRFuncToLLVM.so.21.0git lib/libMLIRArithToLLVM.so.21.0git lib/libMLIRArithAttrToLLVMConversion.so.21.0git lib/libMLIRArithTransforms.so.21.0git lib/libMLIRBufferizationTransforms.so.21.0git lib/libMLIRBufferizationDialect.so.21.0git lib/libMLIRSparseTensorDialect.so.21.0git lib/libMLIRSCFDialect.so.21.0git lib/libMLIRFuncTransforms.so.21.0git lib/libMLIRShardingInterface.so.21.0git lib/libMLIRMeshDialect.so.21.0git lib/libMLIRVectorDialect.so.21.0git lib/libMLIRTensorDialect.so.21.0git lib/libMLIRParallelCombiningOpInterface.so.21.0git lib/libMLIRMaskableOpInterface.so.21.0git lib/libMLIRMaskingOpInterface.so.21.0git lib/libMLIRVectorInterfaces.so.21.0git lib/libMLIRControlFlowToLLVM.so.21.0git lib/libMLIRControlFlowDialect.so.21.0git lib/libMLIRMemRefToLLVM.so.21.0git lib/libMLIRLLVMCommonConversion.so.21.0git lib/libMLIRMemRefUtils.so.21.0git lib/libMLIRAffineDialect.so.21.0git lib/libMLIRMemRefDialect.so.21.0git lib/libMLIRArithUtils.so.21.0git lib/libMLIRComplexDialect.so.21.0git lib/libMLIRArithDialect.so.21.0git lib/libMLIRCastInterfaces.so.21.0git lib/libMLIRInferIntRangeCommon.so.21.0git lib/libMLIRShapedOpInterfaces.so.21.0git lib/libMLIRDialect.so.21.0git lib/libMLIRDialectUtils.so.21.0git lib/libMLIROpenMPDialect.so.21.0git lib/libMLIROpenACCMPCommon.so.21.0git lib/libMLIRTargetLLVMIRExport.so.21.0git lib/libMLIRDLTIDialect.so.21.0git lib/libMLIRLLVMIRTransforms.so.21.0git lib/libMLIRTransforms.so.21.0git lib/libMLIRUBDialect.so.21.0git lib/libMLIRRuntimeVerifiableOpInterface.so.21.0git lib/libMLIRFuncDialect.so.21.0git lib/libMLIRNVVMDialect.so.21.0git lib/libMLIRTranslateLib.so.21.0git lib/libMLIRParser.so.21.0git lib/libMLIRBytecodeReader.so.21.0git lib/libMLIRAsmParser.so.21.0git lib/libMLIRTransformUtils.so.21.0git lib/libMLIRSubsetOpInterface.so.21.0git lib/libMLIRValueBoundsOpInterface.so.21.0git lib/libMLIRDestinationStyleOpInterface.so.21.0git lib/libMLIRRewrite.so.21.0git 
lib/libMLIRRewritePDL.so.21.0git lib/libMLIRPDLToPDLInterp.so.21.0git lib/libMLIRPass.so.21.0git lib/libMLIRAnalysis.so.21.0git lib/libMLIRInferIntRangeInterface.so.21.0git lib/libMLIRLoopLikeInterface.so.21.0git lib/libMLIRPresburger.so.21.0git lib/libMLIRViewLikeInterface.so.21.0git lib/libMLIRPDLInterpDialect.so.21.0git lib/libMLIRPDLDialect.so.21.0git lib/libLLVMFrontendOpenMP.so.21.0git lib/libLLVMTransformUtils.so.21.0git lib/libMLIRLLVMDialect.so.21.0git lib/libMLIRInferTypeOpInterface.so.21.0git lib/libMLIRControlFlowInterfaces.so.21.0git lib/libMLIRDataLayoutInterfaces.so.21.0git lib/libMLIRFunctionInterfaces.so.21.0git lib/libMLIRCallInterfaces.so.21.0git lib/libMLIRMemorySlotInterfaces.so.21.0git lib/libMLIRSideEffectInterfaces.so.21.0git lib/libMLIRIR.so.21.0git lib/libLLVMBitWriter.so.21.0git lib/libLLVMAnalysis.so.21.0git lib/libLLVMAsmParser.so.21.0git lib/libLLVMBitReader.so.21.0git lib/libMLIRSupport.so.21.0git lib/libLLVMCore.so.21.0git lib/libLLVMRemarks.so.21.0git lib/libLLVMBinaryFormat.so.21.0git lib/libLLVMTargetParser.so.21.0git lib/libLLVMSupport.so.21.0git -Wl,-rpath-link,/home/tcwg-buildbot/worker/flang-aarch64-sharedlibs/build/lib && : /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::CharBoxValue::dump() const': FIROps.cpp:(.text._ZNK3fir12CharBoxValue4dumpEv[_ZNK3fir12CharBoxValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::CharBoxValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::PolymorphicValue::dump() const': FIROps.cpp:(.text._ZNK3fir16PolymorphicValue4dumpEv[_ZNK3fir16PolymorphicValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::PolymorphicValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::ArrayBoxValue::dump() const': 
FIROps.cpp:(.text._ZNK3fir13ArrayBoxValue4dumpEv[_ZNK3fir13ArrayBoxValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::ArrayBoxValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::CharArrayBoxValue::dump() const': FIROps.cpp:(.text._ZNK3fir17CharArrayBoxValue4dumpEv[_ZNK3fir17CharArrayBoxValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::CharArrayBoxValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::ProcBoxValue::dump() const': FIROps.cpp:(.text._ZNK3fir12ProcBoxValue4dumpEv[_ZNK3fir12ProcBoxValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::ProcBoxValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::BoxValue::dump() const': FIROps.cpp:(.text._ZNK3fir8BoxValue4dumpEv[_ZNK3fir8BoxValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::BoxValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::MutableBoxValue::dump() const': FIROps.cpp:(.text._ZNK3fir15MutableBoxValue4dumpEv[_ZNK3fir15MutableBoxValue4dumpEv]+0x20): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::MutableBoxValue const&)' /usr/bin/ld: tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o: in function `fir::ExtendedValue::dump() const': FIROps.cpp:(.text._ZNK3fir13ExtendedValue4dumpEv[_ZNK3fir13ExtendedValue4dumpEv]+0x18): undefined reference to `fir::operator<<(llvm::raw_ostream&, fir::ExtendedValue const&)' clang++: error: linker command failed with exit code 1 (use -v to see invocation) ```	2025-04-09 15:41:45 +01:00
Asher Mancinelli	e42f860985	[flang][nfc] Support volatility in Fir ops (#134858 ) Part two of merging #132486. Support volatility in fir ops. * Introduce a new operation fir.volatile_cast, whose only purpose is to add or take away the volatility of an SSA value's type. The types must be otherwise identical, and any other type conversions must be handled by fir.convert. fir.convert will give an error if the volatility of the inputs does not match, such that all changes to volatility must be handled explicitly through fir.volatile_cast. * Add memory effects to ops that read from or write to memory. The precedent for this comes from the LLVM dialect (feb7beaf70) where llvm.load/store ops with the volatile attribute report read/write effects to a generic memory resource. This change is similar in spirit but different in two ways: the volatility of an operation is determined by the type of its memref, not an attribute on the op, and the memory effects of a load- or store-like operation on a volatile reference type are reported against a particular memory resource, `VolatileMemoryResource`. This is so MLIR optimizations are able to reorder operations that are not volatile around operations that are, which we believe more precisely models LLVM's volatile memory semantics. @vzakhari suggested this in #132486 citing LangRef. See https://llvm.org/docs/LangRef.html#volatile-memory-accesses Changes needed to generate IR with volatile types are not included in this change, so it should be non-functional, containing only the changes to Fir ops and op utilities that will be needed once we enable lowering to generate volatile types.	2025-04-09 05:55:24 -07:00
Asher Mancinelli	b2711e1526	[flang][nfc] Support volatile on ref, box, and class types (#134386 ) Part one of merging #132486. Add support for representing volatility in the type system for reference, box, and class types. Don't do anything with volatile just yet, only support and test their representation and utility functions. The naming convention is a little goofy - `fir::isa_volatile_type` and `fir::updateTypeWithVolatility` use different capitalization, but I put them near similar functions and tried to match the surrounding conventions and [the docs](https://github.com/llvm/llvm-project/blob/main/flang/docs/C%2B%2Bstyle.md#naming) best I could.	2025-04-07 06:51:02 -07:00
Valentin Clement (バレンタインクレメン)	18ff8df958	[flang][cuda] Register managed variables with double descriptor (#134444 ) Allocatable or pointer module variables with the CUDA managed attribute are defined with a double descriptor. One on the host and one on the device. Only the data pointed to by the descriptor will be allocated in managed memory. Allow the registration of any allocatable or pointer module variables like device or constant.	2025-04-04 14:38:01 -07:00
Slava Zakharin	5f268d04f9	[flang] Code generation for fir.pack/unpack_array. (#132080 ) The code generation relies on `ShallowCopyDirect` runtime to copy data between the original and the temporary arrays (both directions). The allocations are done by the compiler generated code. The heap allocations could have been passed to `ShallowCopy` runtime, but I decided to expose the allocations so that the temporary descriptor passed to `ShallowCopyDirect` has `nocapture` - maybe this will be better for LLVM optimizations.	2025-03-31 11:42:17 -07:00
Bruno Cardoso Lopes	7c3ecffe9b	[MLIR][LLVMIR] Add support for the full form of global_{ctor,dtor} (#133176 ) Currently only ctor/dtor list and their priorities are supported. This PR adds support for the missing data field. A few implementation notes: - The assembly printer has a fixed form because previous `attr_dict` will sort the dict by key name, making global_dtor and global_ctor differ in the order of printed arguments. - LLVM's `ptr null` is being converted to `#llvm.zero`; otherwise we'd have to create a region to use the default operation conversion from `ptr null`, which is silly given that the field only supports null or a symbol.	2025-03-27 14:11:05 -07:00
Valentin Clement (バレンタインクレメン)	e6dda9c23a	[flang][cuda] Only create shared memory global when needed (#132999 )	2025-03-26 09:26:50 -07:00
Valentin Clement (バレンタインクレメン)	5be9082fed	[flang][cuda] Carry over the dynamic shared memory size to gpu.launch_func (#132837 )	2025-03-24 18:37:19 -07:00
Sergio Afonso	b231f6f862	[MLIR][OpenMP] Improve omp.map.info verification (#132066 ) This patch makes the `map_type` and `map_capture_type` arguments of the `omp.map.info` operation required, which was already an invariant being verified by its users via `verifyMapClause()`. This makes it clearer, as getters no longer return misleading `std::optional` values. Checks for the `mapper_id` argument are moved to a verifier for the operation, rather than being checked by users. Functionally NFC, but not marked as such due to a reordering of arguments in the assembly format of `omp.map.info`.	2025-03-20 15:48:45 +00:00
Slava Zakharin	e0bcf3aa0b	[flang] Allow no type parameters for fir.pack_array. (#131662 ) Arrays with assumed-length types are represented with a box without explicit length parameters. This patch fixes the verification to allow it for `fir.pack_array`.	2025-03-18 07:59:04 -07:00
Kareem Ergawy	1094ffcafb	[flang][fir] Add MLIR op for `do concurrent` (#130893 ) Adds new MLIR ops to model `do concurrent`. In order to make `do concurrent` representation self-contained, a loop is modeled using 2 ops, one wrapper and one that contains the actual body of the loop. For example, a 2D `do concurrent` loop is modeled as follows: ```mlir fir.do_concurrent { %i = fir.alloca i32 %j = fir.alloca i32 fir.do_concurrent.loop (%i_iv, %j_iv) = (%i_lb, %j_lb) to (%i_ub, %j_ub) step (%i_st, %j_st) { %0 = fir.convert %i_iv : (index) -> i32 fir.store %0 to %i : !fir.ref<i32> %1 = fir.convert %j_iv : (index) -> i32 fir.store %1 to %j : !fir.ref<i32> } } ``` The `fir.do_concurrent` wrapper op encapsulates both the actual loop and the allocations required for the iteration variables. The `fir.do_concurrent.loop` op is a multi-dimensional op that contains the loop control and body. See the ops' docs for more info.	2025-03-18 10:53:44 +01:00
Valentin Clement (バレンタインクレメン)	e5ec7bb21b	[flang][cuda] Set correct offsets for multiple variables in dynamic shared memory (#131674 )	2025-03-17 17:13:06 -07:00
Valentin Clement (バレンタインクレメン)	74d4fc0a3e	[flang][cuda][NFC] Use ssa value for offset in shared memory op (#131661 ) Switch from attribute to a value as we need to support dynamic offset when multiple variables are used with dynamic shared memory.	2025-03-17 14:23:34 -07:00
Valentin Clement (バレンタインクレメン)	e86081b6c2	[flang][cuda] Convert cuf.shared_memory operation to LLVM ops (#131396 ) Convert the operation to `llvm.addressof` operation with `llvm.getelementptr` with the appropriate offset.	2025-03-14 19:34:55 -07:00
Valentin Clement (バレンタインクレメン)	4fb20b85fd	[flang][cuda] Compute offset on cuf.shared_memory ops (#131395 ) Add a pass to compute the size of the shared memory (static shared memory) and the offsets of each variable to be placed in shared memory. The global representing the shared memory is also created during this pass. In case of dynamic shared memory, the global has a type of `!fir.array<0xi8>` and the size of the memory is set at kernel launch.	2025-03-14 19:34:35 -07:00
Valentin Clement (バレンタインクレメン)	4818623924	[flang][cuda] Add cuf.shared_memory operation (#131392 ) Introduce `cuf.shared_memory` operation. The operation is used to get the pointer in shared memory for a specific variable. The shared memory is materialized as a global in address space 3 and the different variables point to it at different offsets. Follow-up patches will add lowering and conversion of this operation.	2025-03-14 15:43:25 -07:00
Valentin Clement (バレンタインクレメン)	a862b6deae	[flang][cuda] Lower shared global to the correct NVVM address space (#131368 ) Globals with the CUDA shared data attribute need to be lowered to LLVM globals with the correct address space (3). The address space is set from the `mlir::NVVM::NVVMMemorySpace::kSharedMemorySpace` enum from `mlir/Dialect/LLVMIR/NVVMDialect.h`.	2025-03-14 15:28:32 -07:00
Slava Zakharin	00f9c855fb	[flang] Added fir.is_contiguous_box and fir.box_total_elements ops. (#131047 ) These are helper operations to aid with expanding of fir.pack_array.	2025-03-14 08:25:05 -07:00
Kareem Ergawy	b003face11	[flang][OpenMP] Add `OutlineableOpenMPOpInterface` to `omp.teams` (#131109 ) Given the following input: ```fortran program rep_loopbind implicit none integer :: i real :: priv_val !$omp teams private(priv_val) !$omp distribute do i=1,1000 end do !$omp end teams end program ``` the `AllocaOpConversion` pattern in `FIRToLLVMLowering` would move the private allocations that belong to the `teams` directive (i.e. the allocations needed for the private copies of `priv_val` and the loop's iteration variable) from the `omp.teams` op to the outside scope. This is not correct since these allocations should eventually be emitted inside the outlined region for the `teams` directive. Without this fix, these allocations would be emitted in the parent function (or the parent scope, whatever it is).	2025-03-13 16:03:19 +01:00
Asher Mancinelli	982527eef0	[flang] Use saturated intrinsics for floating point to integer conversions (#130686 ) The saturated floating point conversion intrinsics match the semantics in the standard more closely than the fptosi/fptoui instructions. Case 2 of 16.9.100 is > INT (A [, KIND]) > If A is of type real, there are two cases: if |A| < 1, INT (A) has the value 0; if |A| ≥ 1, INT (A) is the integer whose magnitude is the largest integer that does not exceed the magnitude of A and whose sign is the same as the sign of A. Currently, a floating point value converted into an integer type too small to hold it is turned into poison in opt, leaving us with garbage: ``` > cat t.f90 program main real(kind=16) :: f integer(kind=4) :: i f=huge(f) i=f print *, i end program main # current upstream > for i in `seq 10`; do; ./a.out; done -862156992 -1497393344 -739096768 -1649494208 1761228608 -1959270592 -746244288 -1629194432 -231217344 382322496 ``` With the saturated fptoui/fptosi intrinsics, we get the appropriate values: ``` # mine > flang -O2 ./t.f90 && ./a.out 2147483647 > perl -e 'printf "%d\n", (2 ** 31) - 1' 2147483647 ``` One notable difference: NaNs being converted to ints will become zero, unlike current flang (and some other compilers). Newer versions of GCC have this behavior.	2025-03-12 08:14:46 -07:00
jeanPerier	15e335f04f	[flang] also set llvm ABI argument attributes on direct calls (#130736 ) So far, flang was not setting argument attributes on direct calls assuming that putting them on the function operation was enough. It was clarified in `38565da525` that they must be set on both call and functions, even for direct calls. Crashes have been observed because of the lack of the attribute when compiling `abs(x)` at `O2` and above on X86-64 for complex(16).	2025-03-12 09:55:05 +01:00
Slava Zakharin	74eba972ca	[flang] Definitions of fir.pack/unpack_array operations. (#130698 ) As defined in #127147.	2025-03-11 14:15:29 -07:00
jeanPerier	1ddf18057a	[flang] introduce fir.copy to avoid load store of aggregates (#130289 ) Introduce a FIR operation to do memcopy/memmove of compile time constant size types. This is to avoid requiring derived type copies to be done with load/store, which is badly supported in LLVM when the aggregate type is "big" (no threshold can easily be defined here, better to always avoid them for fir.type). This was the root cause of the regressions caused by #114002, which introduced a load/store of fir.type<> that caused hangs/asserts to fire in LLVM on several benchmarks. See https://llvm.org/docs/Frontend/PerformanceTips.html#avoid-creating-values-of-aggregate-type	2025-03-11 09:31:03 +01:00
R	1dffe8f364	Reland [flang] In AllocMemOp lowering, convert types for calling malloc on 32-bit (#130386 ) Previous PR: https://github.com/llvm/llvm-project/pull/129308 Changes: * The alloc-32.fir test is now marked as requiring the X86 target. * Drive-by fixes uncovered when fixing tests involving malloc	2025-03-11 02:01:57 +00:00
R	3121da52aa	Revert "[flang] In AllocMemOp lowering, convert types for calling malloc on 32-bit (#129308 )" This reverts commit cf1964af5a461196904b663ede04c26555fcff69. This causes breakage on all the non-x86 buildbots as they don't have the i686 target enabled. This was missed in pre-commit CI.	2025-03-08 02:42:24 +00:00
R	cf1964af5a	[flang] In AllocMemOp lowering, convert types for calling malloc on 32-bit (#129308 ) Although 32-bit targets are currently not officially supported, add a type conversion in the AllocMemOp lowering when calling the `malloc` function on 32-bit targets. This fixes a type mismatch, and this fix makes it easier to potentially support such targets in the future. This involves making sure the `LLVMTypeConverter` has the necessary information to know the target bit width. Co-authored-by: Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	2025-03-08 02:25:17 +00:00
Kelvin Li	83f8721201	[flang] handle passing bind(c) derived type by value for ppc64le and powerpc64-aix (#128780 )	2025-03-03 14:43:43 -05:00
jeanPerier	9805854699	[flang][NFC] clean-up fir.field_index legacy usages in tests (#129219 ) After #127231, fir.coordinate_of should directly carry the field. I updated the lowering and codegen tests in #12731, but not the FIR to FIR tests, which is what this patch is cleaning up.	2025-03-03 10:01:54 +01:00
jeanPerier	a8db1fb9b5	[flang] update fir.coordinate_of to carry the fields (#127231 ) This patch updates fir.coordinate_of to carry the field index as attributes instead of relying on getting it from the fir.field_index operations defining its operands. The rationale is that FIR currently has a few operations that require DAGs to be preserved in order to be able to do code generation. This is the case of fir.coordinate_of, which requires its fir.field operand producer to be visible. This makes IR transformation harder/brittle, so I want to update FIR to get rid of this. Codegen/printer/parser of fir.coordinate_of and many tests need to be updated after this change.	2025-02-28 09:50:05 +01:00
Slava Zakharin	0caa8f42be	Reland "[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093 )" This change is inspired by a case in facerec benchmark, where performance of scalar code may improve by about 6%@aarch64 due to getting rid of redundant loads from Fortran descriptors. These descriptors correspond to subroutine local ALLOCATABLE, SAVE variables. The scalar loop nest in LocalMove subroutine contains calls to Fortran runtime IO functions, and LLVM globals-aa analysis cannot prove that these calls do not modify the globalized descriptors with internal linkage. This patch sets and propagates llvm.memory_effects attribute for fir.call operations calling Fortran runtime functions. In particular, it tries to set the Other memory effect to NoModRef. The Other memory effect includes accesses to globals and captured pointers, so we cannot set it for functions taking Fortran descriptors with one exception for calls where the Fortran descriptor arguments are all null. As long as different calls to the same Fortran runtime function may have different attributes, I decided to attach the attributes to the calls rather than functions. Moreover, attaching the attributes to func.func will require propagating these attributes to llvm.func, which is not happening right now. In addition to llvm.memory_effects, the new pass sets llvm.nosync and llvm.nocallback attributes that may also help LLVM alias analysis (e.g. see #127707). These attributes are ignored currently. I will support them in LLVM IR dialect in a separate patch. I also added another pass for developers to be able to print declarations/calls of all Fortran runtime functions that are recognized by the attributes setting pass. It should help with maintenance of the LIT tests.	2025-02-24 14:18:17 -08:00
Slava Zakharin	69cc16fb55	Revert "[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093 )" This reverts commit 36fdeb2aded08a776fcffefa73cb7667e7fc6c2d.	2025-02-24 10:52:53 -08:00
Slava Zakharin	36fdeb2ade	[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093 ) This change is inspired by a case in facerec benchmark, where performance of scalar code may improve by about 6%@aarch64 due to getting rid of redundant loads from Fortran descriptors. These descriptors correspond to subroutine local ALLOCATABLE, SAVE variables. The scalar loop nest in LocalMove subroutine contains calls to Fortran runtime IO functions, and LLVM globals-aa analysis cannot prove that these calls do not modify the globalized descriptors with internal linkage. This patch sets and propagates llvm.memory_effects attribute for fir.call operations calling Fortran runtime functions. In particular, it tries to set the Other memory effect to NoModRef. The Other memory effect includes accesses to globals and captured pointers, so we cannot set it for functions taking Fortran descriptors with one exception for calls where the Fortran descriptor arguments are all null. As long as different calls to the same Fortran runtime function may have different attributes, I decided to attach the attributes to the calls rather than functions. Moreover, attaching the attributes to func.func will require propagating these attributes to llvm.func, which is not happening right now. In addition to llvm.memory_effects, the new pass sets llvm.nosync and llvm.nocallback attributes that may also help LLVM alias analysis (e.g. see #127707). These attributes are ignored currently. I will support them in LLVM IR dialect in a separate patch. I also added another pass for developers to be able to print declarations/calls of all Fortran runtime functions that are recognized by the attributes setting pass. It should help with maintenance of the LIT tests.	2025-02-24 09:27:48 -08:00
Valentin Clement (バレンタインクレメン)	93b2e47f12	[flang][cuda] Avoid assign element mismatch when doing data transfer from a constant (#128252 ) Currently when we do a CUDA data transfer from a constant, we embox it and delegate the assignment to the runtime. When the type of the constant is not exactly the same as the destination descriptor, the runtime will emit an assignment mismatch error. Convert the constant when necessary so the assignment is fine.	2025-02-21 17:46:46 -08:00
David Truby	449f84fea6	[flang] fix AArch64 PCS for struct following pointer (#127802 ) Pointers are already handled as taking up a register in the ABI handling, but the handling for structs was not taking this into account. This patch changes the struct handling to acknowledge that pointer arguments take up an integer register. Fixes #123075	2025-02-21 10:50:52 -08:00
Akash Banerjee	d6ab12c7cc	[MLIR][OpenMP] Add conversion support from FIR to LLVM Dialect for OMP DeclareMapper (#121005 ) Add conversion support from FIR to LLVM Dialect for OMP DeclareMapper. Depends on #121001	2025-02-18 17:47:32 +00:00
Nikita Popov	75c356c488	[MLIR][LLVMIR] Always use TargetFolder in IRBuilder (#126929 ) This is a followup to https://github.com/llvm/llvm-project/pull/126745, generalizing it to always use TargetFolder, including inside function bodies. This avoids generating non-canonical constant expressions that can be folded away.	2025-02-13 08:53:59 +01:00
Razvan Lupusoru	7b473dfe84	[flang][acc] Implement type categorization for FIR types (#126964 ) The OpenACC type interfaces have been updated to require that a type self-identify which type category it belongs to. Ensure that FIR types are able to provide this self identification. In addition to implementing the new API, the PointerLikeType interface attachment was moved to FIROpenACCSupport library like MappableType to ensure all type interfaces and their implementation are now in the same spot.	2025-02-12 21:09:59 -08:00
jeanPerier	5836d91845	[flang] add ABI argument attributes in indirect calls (#126896 ) Last piece that implements the TODO for sret and byval setting on indirect calls. This includes a fix to the last codegen patch. I thought types in type attributes were automatically converted in dialect conversion passes, but that is not the case. The sret and byval types need to be converted to llvm types in codegen (mlir FuncOp conversion is doing a similar conversion).	2025-02-12 17:31:34 +01:00
Nikita Popov	c03325cead	[MLIR][LLVMIR] Use TargetFolder when creating globals (#126745 ) The LLVM dialect lowers globals using IRBuilder, relying on it creating constant expressions where possible. As we remove support for more constant expressions (per https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179), this can cause issues for cases where the constant expression is no longer supported, and the operation cannot be constant folded without DataLayout being available. In particular, I ran into this issue with flang and the removal of mul constant expressions. Address this by using TargetFolder when creating globals, which will perform DL-aware constant folding. I think it would make sense to also do this in general, but I'm starting with globals where not doing this can result in translation failures. Ideally, globals with these problematic expressions would never be generated in the first place, but there has been little movement on fixing this (https://github.com/llvm/llvm-project/issues/96047).	2025-02-12 10:14:00 +01:00
jeanPerier	65075a863b	[flang][FIR] handle argument attributes in fir.call (#126711 ) Add pretty printer/parser for fir.call argument/result attributes and propagate them to llvm.call. This will allow implementing the TODO about ABI relevant argument attribute in indirect calls.	2025-02-12 09:49:52 +01:00
klensy	c491cbfe75	[flang][test] Fix filecheck annotation typos (#92387 )	2025-02-05 18:24:47 +00:00
Valentin Clement (バレンタインクレメン)	f1b075df2e	[flang][cuda] Pass the pinned variable in allocate calls (#125310 )	2025-02-02 18:05:59 -08:00
Tom Eccles	aeaafce464	[mlir][OpenMP][flang] make private variable allocation implicit in omp.private (#124019 ) The intention of this work is to give MLIR->LLVMIR conversion freedom to control how the private variable is allocated so that it can be allocated on the stack in ordinary cases or as part of a structure used to give closure context for tasks which might outlive the current stack frame. See RFC: https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084 For example, a privatizer for an integer used to look like ```mlir omp.private {type = private} @x.privatizer : !fir.ref<i32> alloc { ^bb0(%arg0: !fir.ref<i32>): %0 = ... allocate proper memory for the private clone ... omp.yield(%0 : !fir.ref<i32>) } ``` After this change, allocation becomes implicit in the operation: ```mlir omp.private {type = private} @x.privatizer : i32 ``` For more complex types that require initialization after allocation, an init region can be used: ```mlir omp.private {type = private} @x.privatizer : !some.type init { ^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>): // initialize %arg1, using %arg0 as a mold for allocations omp.yield(%arg1 : !some.pointer<!some.type>) } dealloc { ^bb0(%arg0: !some.pointer<!some.type>): ... deallocate memory allocated by the init region ... omp.yield } ``` This patch lays the groundwork for delayed task execution but is not enough on its own. After this patch all gfortran tests which previously passed still pass. There are the following changes to the Fujitsu test suite: - 0380_0009 and 0435_0009 are fixed - 0688_0041 now fails at runtime. This patch is testing firstprivate variables with tasks. Previously we got lucky with the undefined behavior and won the race. After these changes we no longer get lucky. This patch lays the groundwork for a proper fix for this issue. In flang the lowering re-uses the existing lowering used for reduction init and dealloc regions. 
In flang, before this patch we hit a TODO with the same wording when generating the copy region for firstprivate polymorphic variables. After this patch the box-like fir.class is passed by reference into the copy region, leading to a different path that didn't hit that old TODO but the generated code still didn't work so I added a new TODO in DataSharingProcessor.	2025-01-31 09:35:26 +00:00
Valentin Clement (バレンタインクレメン)	382d3599c2	[flang][cuda] Propagate the data attribute on the converted calls (#124877 )	2025-01-29 08:04:22 -08:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Slava Zakharin	0b80491cd5	[flang] Support non-index shape/shift/slice for CG box operations. (#124625 ) That is another problem uncovered during hlfir.reshape inlining, where the shape bits could be any integer type. This patch adds explicit conversions to `index` type where needed.	2025-01-28 09:38:33 -08:00
ssijaric-nv	16e9601e19	[Flang] Adjust the trampoline size for AArch64 and PPC (#118678 ) Set the trampoline size to match that in compiler-rt/lib/builtins/trampoline_setup.c and AArch64 and PPC lowering.	2025-01-27 08:02:18 -08:00

657 Commits