Compute the result types and bail out before modifying any IR. That is
more efficient when type conversion fails, because no modifications
have to be rolled back.
Note: This is in preparation for the One-Shot Dialect Conversion
refactoring.
After https://github.com/llvm/llvm-project/pull/133220, some empty
complex literals (`tensor<0xcomplex<f32>>`) failed to parse.
This was largely due to the ambiguity between `shape.empty()` meaning a
splat (`dense<1>`) or an empty literal (`dense<>`). We now use the
type's element count to disambiguate during verification.
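For illustration, a minimal sketch of the two forms being disambiguated (hedged; the exact regression tests may differ):
```mlir
// Zero-element literal: `dense<>` now parses, since the type's element
// count (0) rules out the splat reading.
%empty = arith.constant dense<> : tensor<0xcomplex<f32>>
// Splat literal for comparison: one (real, imag) pair broadcast to all
// elements.
%splat = arith.constant dense<(1.0, 2.0)> : tensor<4xcomplex<f32>>
```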
This PR fixes a bug in a canonicalization pattern for the
`shape.shape_of` operation (shape of reshape):
```
// Before
func.func @f(%arg0: tensor<?x1xf32>, %arg1: tensor<3xi32>) -> tensor<3xindex> {
%reshape = tensor.reshape %arg0(%arg1) : (tensor<?x1xf32>, tensor<3xi32>) -> tensor<?x1x1xf32>
%0 = shape.shape_of %reshape : tensor<?x1x1xf32> -> tensor<3xindex>
return %0 : tensor<3xindex>
}
// This previously errored out as follows:
error: 'tensor.cast' op operand type 'tensor<3xi32>' and result type 'tensor<3xindex>' are cast incompatible
%0 = shape.shape_of %reshape : tensor<?x1x1xf32> -> tensor<3xindex>
^
note: see current operation: %0 = "tensor.cast"(%arg1) : (tensor<3xi32>) -> tensor<3xindex>
```
```
// After
func.func @f(%arg0: tensor<?x1xf32>, %arg1: tensor<3xi32>) -> tensor<3xindex> {
%0 = arith.index_cast %arg1 : tensor<3xi32> to tensor<3xindex>
return %0 : tensor<3xindex>
}
```
See the file canonicalize.mlir in the change list for an example.
For context, this bug was found while running a test on Keras 3: the
canonicalizer errored out due to an invalid tensor.cast operation when
the batch size is dynamic. The operands of that op are a tensor<3xi32>
cast to tensor<3xindex>.
This change is related to a previous PR:
https://github.com/llvm/llvm-project/pull/98531
---------
Co-authored-by: Alaa Ali <alaaali@ah-alaaali-l.dhcp.mathworks.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
PTX source files are expected to contain only ASCII text
(https://docs.nvidia.com/cuda/parallel-thread-execution/#source-format) and no null terminators.
`ptxas` has so far not enforced this but is moving towards doing so.
This revealed a problem where the null terminator was printed into the
output file on the MLIR path when outputting PTX directly. Only add the null terminator on the assembly output path for JIT, instead of in the output of `moduleToObject`.
`tensor.insert_slice` needs to have read semantics on its destination
operand. Since it has a return value, its semantics are:
- Copy dest to result.
- Copy source to a subview of the destination.
`tensor.parallel_insert_slice`, though, has no result, so it does not
need read semantics. The op description
[here](a3ac318e5f/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td (L1524))
also says that it is expected to lower to a `memref.subview`, which does
not have read semantics on the destination (it's just a view).
This patch drops the read semantics for the destination of
`tensor.parallel_insert_slice`, but also gives the `shared_outs`
operands of `scf.forall` read semantics. Earlier, the read semantics of
`shared_outs` were propagated indirectly from the read semantics of the
destination operand of `tensor.parallel_insert_slice`; now they are
specified directly.
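As a minimal sketch (hypothetical shapes and names, not the actual test), the structure in question looks like this; the destination `%out` is only written inside `in_parallel`, and its read semantics now come from `shared_outs` itself:
```mlir
func.func @tile_copy(%src: tensor<4xf32>, %dest: tensor<16xf32>) -> tensor<16xf32> {
  %res = scf.forall (%i) in (4) shared_outs(%out = %dest) -> (tensor<16xf32>) {
    %off = affine.apply affine_map<(d0) -> (d0 * 4)>(%i)
    scf.forall.in_parallel {
      // Write-only use of the destination %out.
      tensor.parallel_insert_slice %src into %out[%off] [4] [1]
          : tensor<4xf32> into tensor<16xf32>
    }
  }
  return %res : tensor<16xf32>
}
```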
Fixes #133964
---------
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
This patch updates the handling of target regions to set trip counts and
kernel execution modes properly, based on clang's behavior. This fixes a
race condition on `target teams distribute` constructs with no `parallel
do` loop inside.
This is how kernels are classified, after changes introduced in this
patch:
```f90
! Exec mode: SPMD.
! Trip count: Set.
!$omp target teams distribute parallel do
do i=...
end do

! Exec mode: Generic-SPMD.
! Trip count: Set (outer loop).
!$omp target teams distribute
do i=...
  !$omp parallel do private(idx, y)
  do j=...
  end do
end do

! Exec mode: Generic-SPMD.
! Trip count: Set (outer loop).
!$omp target teams distribute
do i=...
  !$omp parallel
  ...
  !$omp end parallel
end do

! Exec mode: Generic.
! Trip count: Set.
!$omp target teams distribute
do i=...
end do

! Exec mode: SPMD.
! Trip count: Not set.
!$omp target parallel do
do i=...
end do

! Exec mode: Generic.
! Trip count: Not set.
!$omp target
...
!$omp end target
```
For the split `target teams distribute + parallel do` case, clang
produces a Generic kernel which gets promoted to Generic-SPMD by the
openmp-opt pass. We can't currently replicate that behavior in flang
because our codegen for these constructs introduces calls to the
`kmpc_distribute_static_loop` family of functions instead of
`kmpc_distribute_static_init`, which currently prevents promotion of
the kernel to Generic-SPMD.
For the time being, instead of relying on the openmp-opt pass, we look
at the MLIR representation to find the Generic-SPMD pattern and directly
tag the kernel as such during codegen. This is what we were already
doing, but we were previously also matching other kinds of kernels as
Generic-SPMD in the process.
This patch updates Flang lowering and kernel flags identification in
MLIR so that loop bounds on `target teams loop` constructs are evaluated
on the host, making the trip count available to the corresponding
`__tgt_target_kernel` call emitted for the target region.
This is necessary in order to properly execute these constructs as
`target teams distribute parallel do`.
Co-authored-by: Kareem Ergawy <kareem.ergawy@amd.com>
Remove the check in the convolution verifier that requires the input and
output element types of convolution operations to conform to the
constraints imposed by the TOSA 1.0 specification.
These checks are too strict for users of the TOSA dialect who wish to
allow more types than those allowed by the spec, and they cause
compatibility issues with earlier TOSA implementations that allowed more
type combinations.
Users who do wish to constrain the convolution type combinations to only
those allowed by the TOSA 1.0 spec should run the TOSA validation pass,
which already performs these checks.
Signed-off-by: Jack Frankland <jack.frankland@arm.com>
We currently do not have masked vectorization support for tensor.pad
with low padding. However, we can allow it in the special case where the
result dimension after padding is a unit dim. The reason is that when we
actually have a low pad on a unit dim, the input size of that dimension
will be (or, for correct IR, should be) dynamically zero, and hence we
will create a zero mask, which is correct. If the low pad is dynamically
zero, the lowering is correct as well.
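A hedged sketch of the special case (illustrative shapes and names):
```mlir
func.func @pad_unit_dim(%src: tensor<?x8xf32>, %low: index) -> tensor<1x8xf32> {
  %cst = arith.constant 0.0 : f32
  // The padded dim 0 of the result is a unit dim. For correct IR, a
  // nonzero %low implies the dynamic size of dim 0 of %src is zero, so
  // masked vectorization generates a zero mask that reads no elements.
  %padded = tensor.pad %src low[%low, 0] high[0, 0] {
  ^bb0(%i: index, %j: index):
    tensor.yield %cst : f32
  } : tensor<?x8xf32> to tensor<1x8xf32>
  return %padded : tensor<1x8xf32>
}
```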
---------
Signed-off-by: Nirvedh <nirvedh@gmail.com>
This patch corrects an invalid condition in `getEffectsOnResource` used
to identify relevant "resources":
```cpp
return it.getResource() != resource;
```
The current implementation assumes that only one instance of each
resource will exist, so comparing raw pointers is both safe and
sufficient. This assumption stems from constructs like:
```cpp
static DerivedResource *get() {
static DerivedResource instance;
return &instance;
}
```
i.e., resource instances returned via static singleton methods.
However, as discussed in
https://github.com/llvm/llvm-project/issues/129216, this assumption
breaks in practice, notably on macOS (Apple Silicon) when built with
`-DBUILD_SHARED_LIBS=On`.
In such cases, multiple instances of the same logical resource may exist
across shared library boundaries, leading to incorrect behavior and
causing failures in tests like:
* test/Dialect/Transform/check-use-after-free.mlir
This patch replaces the pointer comparison with a comparison based on
resource identity:
```cpp
return it.getResource()->getResourceID() != resource->getResourceID();
```
This approach aligns better with the intent of `getEffectsOnResource`,
which is to:
```cpp
/// Collect all of the effect instances that operate on the provided
/// resource (...)
```
Fixes #129216
Add a pattern that bubbles up tensor.extract_slice through
tensor.collapse_shape.
The pattern is registered in a pattern population function that is used
by the transform op
transform.apply_patterns.tensor.bubble_up_extract_slice and by the
transform op transform.structured.fuse as a cleanup pattern.
This pattern enables tiling and fusing op chains that contain
tensor.collapse_shape when added as a cleanup pattern of the
tile-and-fuse utility. Without this pattern that would not be possible,
as tensor.collapse_shape does not implement the tiling interface. This
is an additional pattern to the one added in PR #126898.
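As a rough sketch of what bubbling up achieves (illustrative shapes; the actual pattern handles more general cases):
```mlir
// Before: the slice consumes the collapse, which cannot be tiled.
func.func @before(%src: tensor<4x8xf32>) -> tensor<8xf32> {
  %c = tensor.collapse_shape %src [[0, 1]] : tensor<4x8xf32> into tensor<32xf32>
  %s = tensor.extract_slice %c[0] [8] [1] : tensor<32xf32> to tensor<8xf32>
  return %s : tensor<8xf32>
}

// After: slice first, then collapse the smaller tensor.
func.func @after(%src: tensor<4x8xf32>) -> tensor<8xf32> {
  %s = tensor.extract_slice %src[0, 0] [1, 8] [1, 1]
      : tensor<4x8xf32> to tensor<1x8xf32>
  %c = tensor.collapse_shape %s [[0, 1]] : tensor<1x8xf32> into tensor<8xf32>
  return %c : tensor<8xf32>
}
```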
This fixes the lowering of `arith.ceildivsi` in the arith-expand pass,
which was previously incorrect. The new version is based on the lowering
of `arith.floordivsi` and will not introduce new undefined behavior or
poison during the lowering. It also replaces one division with a
multiplication.
The previous lowering of `ceildivsi(n, m)` was the following:
```
x = (m > 0) ? -1 : 1
(n*m>0) ? ((n+x) / m) + 1 : - (-n / m)
```
This caused two problems:
* In the case where `n` is INT_MIN and `m` is positive, the result would
be poison instead of an actual value.
* In the case where `n` is INT_MAX and `m` is `-1`, this would trigger
undefined behavior, while the original code wouldn't. This is because
`n+x` would be equal to `INT_MIN` (`INT_MAX + 1`), so the `(n+x) / m`
division would overflow and trigger UB.
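Tracing the old lowering on the second case makes the issue concrete (the expansion computes both select arms eagerly, so the division executes even though its branch is not taken):
```
n = INT_MAX, m = -1
x = (m > 0) ? -1 : 1    // x = 1
n + x                   // INT_MAX + 1 wraps to INT_MIN
(n + x) / m             // INT_MIN / -1 overflows: UB
```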
There are cases in SPIR-V shaders where values need to be yielded from
the selection region to make valid MLIR. For example (part of a SPIR-V
shader decompiled to GLSL):
```
bool _115;
if (_107)
{
// ...
float _200 = fma(...);
// ...
_115 = _200 < _174;
}
else
{
_115 = _107;
}
bool _123;
if (_115)
{
// ...
float _213 = fma(...);
// ...
_123 = _213 < _174;
}
else
{
_123 = _115;
}
```
This patch extends `mlir.selection` so it can return values, with
`mlir.merge` used as a "yield" operation. This maintains compatibility
with code that does not yield any values, as well as the assumption that
`mlir.merge` is the only operation in the merge block of the selection
region.
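A hedged sketch of the extended op, written with the full `spirv.` prefix (the exact result syntax is assumed from this summary, and the surrounding values are presumed defined elsewhere):
```mlir
%r = spirv.mlir.selection -> i32 {
  spirv.BranchConditional %cond, ^then, ^merge(%b : i32)
^then:
  spirv.Branch ^merge(%a : i32)
^merge(%v : i32):
  // The merge op acts as the "yield" for the selection region.
  spirv.mlir.merge %v : i32
}
```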
This patch fixes the following bugs:
- In SparseBackwardAnalysis, the setToExitState function should
propagate changes if it modifies the lattice. Previously, this issue was
masked because multi-block scenarios were not tested and the traversal
order of backward dataflow analysis starts from the end of the program.
- The liveness analysis method for determining whether the non-forwarded
operand in branch/region-branch operations is live was incorrect, which
could cause variables that are actually live to be marked as not live.
The `match` and `rewrite` functions have been deprecated in #130031.
This commit deletes them entirely.
Note for LLVM integration: Update your patterns to use `matchAndRewrite`
instead of separate `match` / `rewrite`.
Canonicalizes a chain of `linalg.unpack -> tensor.extract_slice` into a
`linalg.unpack` with reduced destination sizes. This only happens when
the unpack op's only user is a non-rank-reducing slice with zero offsets
and unit strides.
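Roughly, the rewrite looks like this (illustrative shapes; the real pattern handles general tiles and permutations):
```mlir
// Before: unpack to the full destination, then slice.
%full = linalg.unpack %src inner_dims_pos = [0] inner_tiles = [8]
    into %dest : tensor<4x8xf32> -> tensor<32xf32>
%out = tensor.extract_slice %full[0] [30] [1] : tensor<32xf32> to tensor<30xf32>

// After: unpack directly into a reduced destination.
%small = tensor.empty() : tensor<30xf32>
%out2 = linalg.unpack %src inner_dims_pos = [0] inner_tiles = [8]
    into %small : tensor<4x8xf32> -> tensor<30xf32>
```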
---------
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Signed-off-by: Max Dawkins <maxdawkins19@gmail.com>
Co-authored-by: Max Dawkins <maxdawkins19@gmail.com>
This commit extends the lowering of amdgpu.mfma to handle the new
double-rate MFMAs in gfx950 and adds tests for these operations.
It also adds support for MFMAs on small floats (f6 and f4), which are
implemented using the "scaled" MFMA intrinsic with a scale value of 0 in
order to have an unscaled MFMA.
This commit does not add an `amdgpu.scaled_mfma` operation, as that is
future work.
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
Add `no_inline` and `always_inline` attributes for CallOps in MLIR, so
that inlining of an individual call can be forced or forbidden without
putting the attribute on the `FuncOp`.
These attributes will be used in a future PR in Flang (the `[NO]INLINE`
directive).
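For instance, something like the following (a sketch; the exact attribute spelling and the ops that accept it are assumed from this summary, not verified):
```mlir
// Hypothetical call sites: forbid or force inlining per call.
%0 = llvm.call @foo() {no_inline} : () -> i32
%1 = llvm.call @bar() {always_inline} : () -> i32
```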
During the transition from debug intrinsics to debug records, we used
several different command line options to customise handling: the
printing of debug records to bitcode and textual IR could be independent
of how the debug-info was represented inside a module, and whether the
autoupgrader ran could be customised. This was all valuable during
development, but now that totally removing debug intrinsics is coming
up, this patch removes those options in favour of a single flag
(experimental-debuginfo-iterators), which enables autoupgrade, in-memory
debug records, and debug record printing to bitcode and textual IR.
We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.
There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. Things like
print-non-instruction-debug-info.ll: the test suite now checks for
debug records in all tests, and we don't want to check that we can print
as intrinsics. Or the update_test_checks tests -- these are duplicated
with write-experimental-debuginfo=false to ensure file writing for
intrinsics is correct, but that's something we're imminently going to
delete.
A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a
zero-cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it works now either way.
* salvage-cast-debug-info.ll: was testing that intrinsics-in-memory get
salvaged; that isn't necessary now.
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values; dbg-records take the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent codegen-change issues that
get fixed by RemoveDIs.
Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
Targeted rewrite of a linalg.copy on memrefs to a memref.copy.
This is useful when bufferizing copies to a linalg.copy, applying some
transformations, and then rewriting the copy into a memref.copy.
If the element types of the source and destination differ, or if the
source is a scalar, the transform produces a silenceable failure.
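That is, roughly (illustrative types):
```mlir
// The transform rewrites this copy on memrefs ...
linalg.copy ins(%src : memref<8xf32>) outs(%dst : memref<8xf32>)
// ... into this:
memref.copy %src, %dst : memref<8xf32> to memref<8xf32>
```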
- For splat dense attributes, the number of parsed elements must be 2.
- For non-splat dense attributes, the number of parsed elements must be
twice the number of elements in the type.
Fixes #132859.
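Reading the counts, this rule appears aimed at complex element types (cf. the referenced issue); for example (illustrative):
```mlir
// Splat: exactly 2 parsed values, the (real, imag) pair.
%s = arith.constant dense<(1.0, 2.0)> : tensor<4xcomplex<f32>>
// Non-splat: 2 * numel values; here 2 * 2 = 4.
%n = arith.constant dense<[(1.0, 2.0), (3.0, 4.0)]> : tensor<2xcomplex<f32>>
```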
Minor non-functional change of the dialect to better align with the
operator order from the TOSA specification:
https://www.mlplatform.org/tosa/tosa_spec.html
Signed-off-by: Jerry Ge <jerry.ge@arm.com>
This replaces #125361.
- communicator is mandatory
- new mpi.comm_world
- new mpi.comm_split
- lowering and test
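A heavily hedged sketch of the new surface (result and type spellings are assumed from this summary, not checked against the dialect):
```mlir
// Obtain the world communicator, then split it by color/key.
%world = mpi.comm_world : !mpi.comm
%new = mpi.comm_split(%world, %color, %key) : !mpi.comm
```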
---------
Co-authored-by: Sergio Sánchez Ramírez <sergio.sanchez.ramirez+git@bsc.es>
Drop the small size to make the vector types match the generic helper
`getMixedValues` in `StaticValueUtils.h`.
This saves some needless vector copies. I didn't find any local
variables that need updating.