llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 23:36:35 +00:00

Author	SHA1	Message	Date
lorenzo chelini	57dc71352c	[MLIR][Bufferization] Retire `enforce-aliasing-invariants` (#130929 ) Why? This option can lead to incorrect IR if used in isolation, for example, consider the IR below: ```mlir func.func @loop_with_aliasing(%arg0: tensor<5xf32>, %arg1: index, %arg2: index) -> tensor<5xf32> { %c1 = arith.constant 1 : index %cst = arith.constant 1.000000e+00 : f32 %0 = tensor.empty() : tensor<5xf32> %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<5xf32>) -> tensor<5xf32> // The BufferizableOpInterface says that %2 alias with %arg0 or be a newly // allocated buffer %2 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (tensor<5xf32>) { scf.yield %1 : tensor<5xf32> } %cst_0 = arith.constant 1.000000e+00 : f32 %inserted = tensor.insert %cst_0 into %1[%c1] : tensor<5xf32> return %2 : tensor<5xf32> } ``` If we bufferize with: enforce-aliasing-invariants=false, we get: ``` func.func @loop_with_aliasing(%arg0: memref<5xf32, strided<[?], offset: ?>>, %arg1: index, %arg2: index) -> memref<5xf32, strided<[?], offset: ?>> { %c1 = arith.constant 1 : index %cst = arith.constant 1.000000e+00 : f32 %alloc = memref.alloc() {alignment = 64 : i64} : memref<5xf32> linalg.fill ins(%cst : f32) outs(%alloc : memref<5xf32>) %0 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (memref<5xf32, strided<[?], offset: ?>>) { %cast = memref.cast %alloc : memref<5xf32> to memref<5xf32, strided<[?], offset: ?>> scf.yield %cast : memref<5xf32, strided<[?], offset: ?>> } %cst_0 = arith.constant 1.000000e+00 : f32 memref.store %cst_0, %alloc[%c1] : memref<5xf32> return %0 : memref<5xf32, strided<[?], offset: ?>> } ``` Which is not correct IR since the loop yields the allocation. I am using this option. What do I need to do now? If you are using this option in isolation, you are possibly generating incorrect IR, so you need to revisit your bufferization strategy. If you are using it together with `copyBeforeWrite,` you simply need to retire the `enforceAliasingInvariants` option. Co-authored-by: Matthias Springer <mspringer@nvidia.com>	2025-03-18 08:42:43 +01:00
Longsheng Mou	4cb1430c1c	[mlir][spirv] Fix a crash in `spirv::ISubOp::fold` (#131570 ) This PR fixes a crash if `spirv.ISub` is not integer type. Fixes #131283.	2025-03-18 09:18:49 +08:00
William Moses	d9c65af626	[MLIR][GPUToNVVM] Support 32-bit isfinite (#131699 ) Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp>	2025-03-18 02:11:38 +01:00
Johannes de Fine Licht	c3f750250a	[MLIR][LLVM] Handle floats in Mem2Reg of memset intrinsics (#131621 ) This was lacking a bitcast from the shifted integer type into a float. Other non-struct types than integers and floats will still not be Mem2Reg'ed. Also adds special handling for constants to be emitted as a constant directly rather than relying on followup canonicalization patterns (`memset` of zero is a case that can appear in the wild).	2025-03-17 22:31:28 +01:00
Christian Ulmann	800593a014	[MLIR][LLVM] Avoid duplicated module flags in the export (#131627 ) This commit resolves an issue in the LLVMIR export that caused the duplication of the "Debug Info Version" module flag, when it was already in MLIR.	2025-03-17 17:43:15 +01:00
Zhuoran Yin	1e89a76a04	[MLIR] Refactor to create vectorization convOp precondition check (#130181 ) In corner situations, the vectorization pass may face to lower a conv2d op and assert in a completely irrelevant location in vectorizeConvolution() subroutine. ~~This PR rejects the conv2d op early and make the asserted routine to return failure as a defensive workaround.~~ In addressing this, the PR moved all condition check away from the `Conv1dGenerator` into the `convOpPreconditionCheck()` function. This makes the unsupported ops such as conv2d to be rejected early and leave a cleaner `Conv1dGenerator` constructor.	2025-03-17 09:32:45 -04:00
Luke Hutton	0c34d7a9e7	[mlir][tosa] Require operand/result tensors of at least rank 1 for some operations (#131335 ) This commit updates the following operations (operands/results) to be of at least rank 1 such that it aligns with the expectations of the specification: - ARGMAX (input) - REDUCE_ALL (input/output) - REDUCE_ANY (input/output) - REDUCE_MAX (input/output) - REDUCE_MIN (input/output) - REDUCE_PRODUCT (input/output) - REDUCE_SUM (input/output) - CONCAT (each input in input1/output) - PAD (input1/output) - REVERSE (input1/output) - SLICE (input1/output) - TILE (input1/output) - TRANSPOSE (input1/output) In addition to this change, PAD has been updated to allow unranked tensors for input1/output, inline with other operations.	2025-03-17 10:22:52 +00:00
Matthias Springer	6c867e27a7	[mlir] Use `getSingleElement`/`hasSingleElement` in various places (#131460 ) This is a code cleanup. Update a few places in MLIR that should use `hasSingleElement`/`getSingleElement`. Note: `hasSingleElement` is faster than `.getSize() == 1` when it is used with linked lists etc. Depends on #131508.	2025-03-17 07:43:18 +01:00
Ivan Butygin	7c98cddc5a	[mlir] Expose `AffineExpr.shift_dims/shift_symbols` through C and Python bindings (#131521 )	2025-03-16 19:57:56 +03:00
Andrzej Warzyński	d928a671b8	[mlir][Vector] Refactor VectorEmulateNarrowType.cpp (#123529 ) This is PR refactors `alignedConversionPrecondition` from VectorEmulateNarrowType.cpp and adds new helper hooks. Update `alignedConversionPrecondition` (1) This method doesn't require the vector type for the "container" argument. The underlying element type is sufficient. The corresponding argument has been renamed as `containerTy` - this is meant as the multi-byte container element type (`i8`, `i16`, `i32`, etc). With this change, the updated invocations of `alignedConversionPrecondition` (in e.g. `RewriteAlignedSubByteIntExt`) make it clear that the container element type is assumed to be `i8`. Update alignedConversionPrecondition (2): The final check in `alignedConversionPrecondition` has been replaced with a new helper method, `isSubByteVecFittable`. This helper hook is now also re-used in `ConvertVectorTransferRead` (to improve code re-use). Other updates Extended + unified comments. Implements: https://github.com/llvm/llvm-project/issues/123630	2025-03-16 12:22:46 +00:00
Matthias Springer	6c2f8476e7	[mlir][Transforms] Dialect Conversion: Add 1:N support to `remapInput` (#131454 ) This commit adds 1:N support to `SignatureConversion::remapInputs`. This API allows users to replace a block argument with multiple replacement values. (And the block argument is dropped.) The API already supported "bbarg --> multiple bbargs" mappings, but "bbarg --> multiple SSA values" was missing. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>	2025-03-15 18:33:06 +01:00
Bangtian Liu	d52ec1e9dd	[MLIR][NFC] fix msvc debug build errors (#131393 ) We found the build broken using msvc debug build as below: ``` C:\Users\bangtliu\iree\third_party\llvm-project\llvm\include\llvm/ADT/SmallVector.h(1162): error C2338: static_assert failed: 'You are trying to use a default number of inlined elements for `SmallVector<T>` but `sizeof(T)` is really big! Please use an explicit number of inlined elements with `SmallVector<T, N>` to make sure you really want that much inline storage.' C:\Users\bangtliu\iree\third_party\llvm-project\llvm\include\llvm/ADT/SmallVector.h(1162): note: the template instantiation context (the oldest one first) is C:\Users\bangtliu\iree\third_party\llvm-project\llvm\include\llvm/ADT/SmallVector.h(1194): note: see reference to class template instantiation 'llvm::CalculateSmallVectorDefaultInlinedElements<T>' being compiled with [ T=`anonymous-namespace'::LinalgOperandDef ] C:\Users\bangtliu\iree\third_party\llvm-project\mlir\tools\mlir-linalg-ods-gen\mlir-linalg-ods-yaml-gen.cpp(120): error C2976: 'llvm::SmallVector': too few template arguments C:\Users\bangtliu\iree\third_party\llvm-project\llvm\include\llvm/ADT/SmallVector.h(1195): note: see declaration of 'llvm::SmallVector' [862/7776] Building CXX object llvm-project\lib\DebugInfo\DWARF\CMakeFiles\LLVMDebugInfoDWARF.dir\DWARFDebugLine.cpp.obj ninja: build stopped: subcommand failed. ``` This PR is added to address this error.	2025-03-14 21:36:53 -04:00
Bruno Cardoso Lopes	5265412c13	[MLIR][LLVMIR] Import: add flag to prefer using unregistered intrinsics (#130685 ) Currently, there is no common mechanism for supported intrinsics to be generically annotated with arg and ret attributes. Since there are many supported intrinsics around different dialects, the amount of work to teach all them about these attributes is not trivial (though it would be nice in the long term). This PR adds a new flag `-prefer-unregistered-intrinsics` that can be used alongside `--import-llvm` to always use `llvm.intrinsic_call` during import time (ignoring dialect hooks for custom intrinsic support). Using this flag allow us to roundtrip the LLVM IR while eliminating a whole set of differences coming from lack of arg/ret attributes on supported intrinsics. Note `convertIntrinsic` has to be moved to an implementation file because it queries into `moduleImport` state, which is a fwd declaration in `LLVMImportInterface.h`	2025-03-14 18:04:32 -07:00
Bruno Cardoso Lopes	29a000023c	[MLIR][LLVMIR] Add module flags support (#130679 ) Import and translation support. Note that existing support (prior to this PR) already covers enough in translation specifically to emit "Debug Info Version". Also, the debug info version metadata is being emitted even though the imported IR has no information and is showing up in some tests (will fix that in another PR). --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com> Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz>	2025-03-14 18:03:36 -07:00
Kazu Hirata	456963de96	[mlir] Fix warnings This patch fixes: mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp:54:3: error: definition of implicit copy assignment operator for 'Layout' is deprecated because it has a user-declared copy constructor [-Werror,-Wdeprecated-copy] mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp:103:3: error: definition of implicit copy assignment operator for 'SGMap' is deprecated because it has a user-declared copy constructor [-Werror,-Wdeprecated-copy]	2025-03-14 13:28:58 -07:00
Luke Hutton	955c02dc9c	[mlir][tosa] Check for compile time constants in the validation pass (#131123 ) This commit adds a concept of the 'dynamic' extension in the Dialect and checks that compile time constant (CTC) operands for each operator are constant if the dynamic extension is not loaded. Operands labeled as CTC in the specification that are of tosa.shape (shape_t in the specification) type are not checked as they are always expected to be constant. This requirement is checked elsewhere in the dialect. Signed-off-by: Luke Hutton <luke.hutton@arm.com>	2025-03-14 12:45:01 -07:00
MaheshRavishankar	2490f7f076	[mlir][Linalg] Allow expand shape propagation across linalg ops with dynamic shapes. (#127943 ) With `tensor.expand_shape` allowing expanding dynamic dimension into multiple dynamic dimension, adapt the reshape propagation through expansion to handle cases where one dynamic dimension is expanded into multiple dynamic dimension. --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-03-14 12:42:42 -07:00
Charitha Saumya	fd24805c8e	Reapply [mlir][xegpu] Add XeGPU subgroup map propagation analysis for XeGPU SIMT distribution. (#131380 ) Originally introduced in #130240 and reverted in #131364 Reproduced the issue locally in Linux by doing a shared lib build. Fixes including adding the missing LINK_LIBS. Original commit message: This PR adds the SG map propagation step of the XeGPU SIMT distribution. SG map propagation is a sparse backward dataflow analysis that propagate the sg_map backward starting from the operands of certain operations (DPAS, store etc.). This is the first step of XeGPU subgroup distribution. This analysis result is used to attach layout information to each XeGPU SIMD subgroup op. The lowering patterns in XeGPUSubgroupDistribute will consume these layout info to distribute SIMD ops into SIMT ops that work on work-item level data fragments. Summary of Lowering XeGPU SIMD -> SIMT Subgroup map propagation (This PR) Attach sg_map to each op in move all ops inside gpu.warp_execute_on_lane0 region. Distribute each op using sg_map Additional legalization steps to align more with Xe HW.	2025-03-14 12:38:36 -07:00
Charitha Saumya	3fcd921aa4	Revert "[mlir][xegpu] Add XeGPU subgroup map propagation analysis for XeGPU SIMT distribution." (#131364 ) Reverts llvm/llvm-project#130240	2025-03-14 10:36:58 -07:00
Charitha Saumya	5eb557774d	[mlir][xegpu] Add XeGPU subgroup map propagation analysis for XeGPU SIMT distribution. (#130240 ) This PR adds the SG map propagation step of the XeGPU SIMT distribution. SG map propagation is a sparse backward dataflow analysis that propagate the sg_map backward starting from the operands of certain operations (DPAS, store etc.). This is the first step of XeGPU subgroup distribution. This analysis result is used to attach layout information to each XeGPU SIMD subgroup op. The lowering patterns in XeGPUSubgroupDistribute will consume these layout info to distribute SIMD ops into SIMT ops that work on work-item level data fragments. ### Summary of Lowering XeGPU SIMD -> SIMT 1. Subgroup map propagation (This PR) 2. Attach `sg_map` to each op in move all ops inside `gpu.warp_execute_on_lane0` region. 3. Distribute each op using `sg_map` 4. Additional legalization steps to align more with Xe HW.	2025-03-14 10:21:22 -07:00
Nikolay Panchenko	3ac5d8da61	[mlir-lsp] Abstract input and output of the `JSONTransport` (#129320 ) The patch abstracts sending and receiving json messages of `JSONTransport` to allow custom implementation of them. For example, one concrete implementation can use pipes without a need to convert file descriptor to a `FILE` object.	2025-03-14 12:31:28 -04:00
Sergio Afonso	72b8744aa5	[MLIR][OpenMP] Reduce overhead of target compilation (#130945 ) This patch avoids calling `TargetOp::getInnermostCapturedOmpOp` multiple times during initialization of default and runtime target attributes in MLIR to LLVM IR translation of `omp.target` operations. This is a potentially expensive operation, so this change should help keep compile times lower.	2025-03-14 15:18:32 +00:00
Peter Hawkins	244cf89f14	[mlir][python] Small optimization to mlirApiObjectToCapsule. (#131160 ) Call nb::getattr(...) rather than using nb::hasattr() and .attr(). Saves a Python string allocation and a dictionary lookup when using a recent nanobind. Optimization only, no changes in behavior expected.	2025-03-14 08:10:42 -07:00
mihailo-stojanovic	fc8b2bf2f8	[MLIR][LLVM] Import dereferenceable metadata from LLVM IR (#130974 ) Add support for importing `dereferenceable` and `dereferenceable_or_null` metadata into LLVM dialect. Add a new attribute which models these two metadata nodes and a new OpInterface.	2025-03-14 09:30:47 +01:00
Kai Sasaki	befa037c13	[mlir][affine] Guard invalid dim attribute in the test-reify-bound pass (#129013 ) Computing the bound of affine op (ValueBoundsConstraintSet::computeBound) crashes due to the invalid dim value given to the op. It is necessary for the pass to check the dim attribute not to be greater than the rank of the input type. Fixes https://github.com/llvm/llvm-project/issues/128807	2025-03-14 08:09:01 +09:00
Luke Hutton	1c45514748	[mlir][tosa] Fix bug causing quantized pad const creation crash (#131125 ) This commit ensures the storage type is retrieved correctly which fixes a crash when creating a quantized pad const tensor. Testing is completed via the `tosa-optional-decompositions` pass which makes use of the `createPadConstTensor` function. Also includes some cleanup.	2025-03-13 13:17:47 -07:00
Daniel Hernandez-Juarez	64f67f870d	[mlir][AMDGPU] Enable emulating vector buffer_atomic_fadd for bf16 on gfx942 (#129029 ) - Change to make sure architectures < gfx950 emulate bf16 buffer_atomic_fadd - Add tests for bf16 buffer_atomic_fadd and architectures: gfx12, gfx942 and gfx950 --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2025-03-13 14:30:45 -05:00
Nirvedh Meshram	ca4399036f	[mlir][linalg] Add FoldReshapeWithGenericOpByCollapsing pattern (#131029 ) This pattern to bubble up collapse shapes was missing in `populateFoldReshapeOpsByCollapsingPatterns` . Signed-off-by: Nirvedh Meshram <nirvedh@gmail.com>	2025-03-13 14:22:50 -05:00
Christopher Di Bella	933ecf5f30	[mlir] adds `[[maybe_unused]]` to variables that might not be used (#131184 ) This should suppress an unused variable warning that was seemingly pervasive.	2025-03-13 11:44:53 -07:00
Adam Siemieniuk	a16c225b40	[mlir][xegpu] Convert Vector contraction to XeGPU (#122115 ) Adds pattern to lower vector.contract to XeGPU operation.	2025-03-13 19:41:53 +01:00
Uday Bondhugula	0ed5f9b22b	[MLIR] NFC. Fix unused warning in affine loop utils	2025-03-13 20:56:35 +05:30
Kareem Ergawy	b003face11	[flang][OpenMP] Add `OutlineableOpenMPOpInterface` to `omp.teams` (#131109 ) Given the following input: ```fortran program rep_loopbind implicit none integer :: i real :: priv_val !$omp teams private(priv_val) !$omp distribute do i=1,1000 end do !$omp end teams end program ``` the `AllocaOpConversion` pattern in `FIRToLLVMLowering` would move the private allocations that belong to the `teams` directive (i.e. the allocations needed for the private copies of `priv_val` and the loop's iteration variable) from the the `omp.teams` op to the outside scope. This is not correct since these allocations should be eventually emitted inside the outlined region for the `teams` directive. Without this fix, these allocation would be emitted in the parent function (or the parent scope whatever it is).	2025-03-13 16:03:19 +01:00
Michael Klemm	28ffa7f6a4	[flang][OpenMP] Fix missing missing inode issue (#130798 ) When outlining an offload region, Flang creates a unique name by querying an inode ID. However, when the name of the actual source file does not match the logical file in a `#line` preprocessor directive, code-gen was failing as it could not determine the inode ID. This PR checks for this condition and if the logical file name does not exist, the inode is replaced with a hash value created from the source code itself.	2025-03-13 15:50:37 +01:00
Sergio Afonso	237a910819	[MLIR][OpenMP] Remove the ReductionClauseInterface, NFC (#130978 ) This patch removes the `ReductionClauseInterface` and all definitions of its associated `getAllReductionVars` method. The method mandated by this interface is not used anywhere and the conflicts its definition produces when multiple reduction clauses are present in an operation result in a more convoluted operation definition, so it seems better to remove it and only add something like this if there's a clear advantage to it.	2025-03-13 14:50:23 +00:00
Sergio Afonso	6ff33edf4d	[MLIR][OpenMP] Minor improvements to BlockArgOpenMPOpInterface, NFC (#130789 ) This patch introduces a use for the new `getBlockArgsPairs` to avoid having to manually list each applicable clause. Also, the `numClauseBlockArgs()` function is introduced, which simplifies the implementation of the interface's verifier and enables better memory handling within `getBlockArgsPairs`.	2025-03-13 14:48:19 +00:00
Artemiy Bulavin	0aa5ba43a0	[mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled (#128566 ) Currently, `DistinctAttr` uses an allocator wrapped in a `ThreadLocalCache` to manage attribute storage allocations. This ensures all allocations are freed when the allocator is destroyed. However, this setup can cause use-after-free errors when `mlir::PassManager` runs its passes on a separate thread as a result of crash reproduction being enabled. Distinct attribute storages are created in the child thread's local storage and freed once the thread joins. Attempting to access these attributes after this can result in segmentation faults, such as during printing or alias analysis. Example: This invocation of `mlir-opt` demonstrates the segfault issue due to distinct attributes being created in a child thread and their storage being freed once the thread joins: ``` mlir-opt --mlir-pass-pipeline-crash-reproducer=. --test-distinct-attrs mlir/test/IR/test-builtin-distinct-attrs.mlir ``` This pull request changes the distinct attribute allocator to use different allocators depending on whether or not threading is enabled and whether or not the pass manager is running its passes in a separate thread. If multithreading is disabled, a non thread-local allocator is used. If threading remains enabled and the pass manager invokes its pass pipelines in a child thread, then a non-thread local but synchronised allocator is used. This ensures that the lifetime of allocated storage persists beyond the lifetime of the child thread. I have added two tests for the `-test-distinct-attrs` pass and the `-enable-debug-info-on-llvm-scope` passes that run them with crash reproduction enabled.	2025-03-13 15:00:39 +01:00
Ivan Butygin	02fae68a45	[mlir][vector] VectorLinearize: `ub.poison` support (#128612 ) Unify `arith.constant` and `up.poison` using `OpTraitConversionPattern<OpTrait::ConstantLike>`.	2025-03-13 14:18:21 +03:00
Ivan Butygin	786e70ff10	[mlir] Change `TypeOrValueSemanticsContainer` base from `TypeConstraint` to `Type` (#129433 ) `Type` is derived from `TypeConstraint`. Using `Type` as base allows to use `SignlessIntegerLike` and friends in `Variadic<>`.	2025-03-13 14:17:42 +03:00
Uday Bondhugula	55b806c2af	[MLIR][Affine] Fix affine data copy generation copy placement for missing memref definition check (#130750 ) This was exposed with the test case previously added but when performing generation with limited memory capacity.	2025-03-13 16:08:53 +05:30
Matthias Springer	59fd2878fc	[mlir][memref] Clean up `load`/`store` documentation (#130569 ) Remove references to the Affine dialect. The documentation is outdated. Separate `affine.load/store` ops have been added. Also add documentation for `nontemporal`.	2025-03-13 09:42:14 +01:00
Pradeep Kumar	21cef8aa1c	[MLIR][NVVM] Add support for tcgen05.{ld, st} (#130728 ) This commit adds support for tcgen05.{ld, st} to the NVVM Dialect with tests under tcgen05-ld.mlir and tcgen05-st.mlir respectively	2025-03-13 12:07:04 +05:30
Christopher Bate	3438dfc7ff	[mlir][tensor] Fix bufferization interface for 'tensor.reshape' (#128590 ) Previously, the BufferizableOpInterface implementation for 'tensor.reshape' listed the 'shape' operand as an alias for the result tensor, causing unnecessary conflicts with ops that "write" to the shape operand.	2025-03-12 22:19:01 -06:00
Javed Absar	ecf4d995f6	[mlir][linalg][elementwise] Fold transpose into new elementwise (#130207 ) Fold transpose into new elementwise Op which has affine-map attached. Will add broadcast folding in next diff.	2025-03-12 23:04:44 +00:00
Luke Hutton	c44c905174	[mlir][tosa] Add error if verification to pooling operators (#130052 ) This commit adds the following checks to avg_pool2d and max_pool2d TOSA operations: - check kernel values are >= 1 - check stride values are >= 1 - check padding values are >= 0 - check padding values are less than kernel sizes - check output shape matches the expected output shape Signed-off-by: Luke Hutton <luke.hutton@arm.com>	2025-03-12 10:48:33 -07:00
Artemiy Bulavin	fc127ff53d	[mlir] Extract RHS rows once when lowering vector.contract to dot (#130130 ) The `vector.contract` op on two matrices A and B will be lowered to individual dot products of each row and column of A and B respectively. The existing lowering will extract each column of B for each row of A, which leads to multiple values in the IR representing the same columns of B. This PR makes changes to the `ContractOpToDotLowering` to make sure that the columns of B are only ever extracted once, so then the SSA values representing the extracted columns are then re-used in the IR for later dot products. I have updated the existing vector-contract-to-dot-transforms test.	2025-03-12 17:16:49 +00:00
Nikita Popov	f137c3d592	[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940 ) This avoids doing a Triple -> std::string -> Triple round trip in lots of places, now that the Module stores a Triple.	2025-03-12 17:35:09 +01:00
MaheshRavishankar	665299eb3e	[mlir][Transforms] Add a utility method to move value definitions. (#130874 ) `205c5325b3` added a transform utility that moved all SSA dependences of an operation before an insertion point. Similar to that, this PR adds a transform utility function, `moveValueDefinitions` to move the slice of operations that define all values in a `ValueRange` before the insertion point. While very similar to `moveOperationDependencies`, this method differs in a few ways 1. When computing the backward slice since the start of the slice is value, the slice computed needs to be inclusive. 2. The combined backward slice needs to be sorted topologically before moving them to avoid SSA use-def violations while moving individual ops. The PR also adds a new transform op to test this new utility function. --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-03-12 08:30:43 -07:00
Sergio Afonso	032f83b743	[MLIR][OpenMP] Enable BlockArgOpenMPOpInterface accessing operands (#130769 ) This patch makes additions to the `BlockArgOpenMPOpInterface` to simplify its use by letting it handle the matching between operands and their associated entry block arguments. Most significantly, the following is now possible: ```c++ SmallVector<std::pair<Value, BlockArgument>> pairs; cast<BlockArgOpenMPOpInterface>(op).getBlockArgsPairs(pairs); for (auto [var, arg] : pairs) { // var points to the operand (outside value) and arg points to the entry // block argument associated to that value. } ``` This is achieved by making the interface define and use `getXyzVars()` methods, which by default return empty `OperandRange`s and are overriden by getters automatically produced for the `Variadic<...> $xyz_vars` tablegen argument of the corresponding clause. These definitions can then be simplified, since they no longer need to manually define `numXyzBlockArgs` functions as a result. A side-effect of this is that all ops implementing this interface will now publicly define `getXyzVars()` functions for all entry block argument-generating clauses, even if they don't actually accept all clauses. However, these would just return empty ranges, so it shouldn't cause issues. This change uncovered some incorrect definitions of class declarations related to the `ReductionClauseInterface`, and the `OpenMP_DetachClause` incorrectly implementing the `BlockArgOpenMPOpInterface`, so these issues are also addressed.	2025-03-12 11:50:12 +00:00
Matthias Springer	418e07b7e6	[mlir][Tensor] Check for out-of-bounds slice in `insert/extract_slice` verifier (#130487 ) Also fix test cases that had invalid ops.	2025-03-12 08:34:21 +01:00
Oleksandr "Alex" Zinenko	6981f7e92a	[mlir] account for explicit affine.parallel in parallelization (#130812 ) Affine parallelization should take explicitly parallel loops into account when computing loop depth for dependency analysis purposes. This was previously not the case, potentially leading to loops incorrectly being marked as parallel due to depth mismatch.	2025-03-11 20:53:50 -05:00

1 2 3 4 5 ...

22391 Commits