llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 23:56:06 +00:00

Author	SHA1	Message	Date
Kareem Ergawy	49b8d8472f	[OpenMP][MLIR] Support LLVM translation for `distribute` with delayed privatization (#131564 ) Adds support for translating delayed privatization (`private` and `firstprivate`) for `omp.distribute` ops.	2025-03-18 10:14:42 +01:00
Matthias Springer	e614e840bc	[mlir][memref] Add runtime verification for `memref.dim` (#130410 ) Add runtime verification for `memref.dim`: check that the index is in bounds. Also simplify the pass pipeline for all memref runtime verification checks.	2025-03-18 09:10:49 +01:00
Longsheng Mou	4cb1430c1c	[mlir][spirv] Fix a crash in `spirv::ISubOp::fold` (#131570 ) This PR fixes a crash if `spirv.ISub` is not integer type. Fixes #131283.	2025-03-18 09:18:49 +08:00
William Moses	d9c65af626	[MLIR][GPUToNVVM] Support 32-bit isfinite (#131699 ) Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp>	2025-03-18 02:11:38 +01:00
Johannes de Fine Licht	c3f750250a	[MLIR][LLVM] Handle floats in Mem2Reg of memset intrinsics (#131621 ) This was lacking a bitcast from the shifted integer type into a float. Other non-struct types than integers and floats will still not be Mem2Reg'ed. Also adds special handling for constants to be emitted as a constant directly rather than relying on followup canonicalization patterns (`memset` of zero is a case that can appear in the wild).	2025-03-17 22:31:28 +01:00
Christian Ulmann	800593a014	[MLIR][LLVM] Avoid duplicated module flags in the export (#131627 ) This commit resolves an issue in the LLVMIR export that caused the duplication of the "Debug Info Version" module flag, when it was already in MLIR.	2025-03-17 17:43:15 +01:00
Zhuoran Yin	1e89a76a04	[MLIR] Refactor to create vectorization convOp precondition check (#130181 ) In corner cases, the vectorization pass may attempt to lower a conv2d op and assert in a completely unrelated location in the vectorizeConvolution() subroutine. ~~This PR rejects the conv2d op early and make the asserted routine to return failure as a defensive workaround.~~ To address this, the PR moves all condition checks out of the `Conv1dGenerator` and into the `convOpPreconditionCheck()` function. This causes unsupported ops such as conv2d to be rejected early and leaves a cleaner `Conv1dGenerator` constructor.	2025-03-17 09:32:45 -04:00
Luke Hutton	0c34d7a9e7	[mlir][tosa] Require operand/result tensors of at least rank 1 for some operations (#131335 ) This commit updates the following operations (operands/results) to be of at least rank 1 such that it aligns with the expectations of the specification: - ARGMAX (input) - REDUCE_ALL (input/output) - REDUCE_ANY (input/output) - REDUCE_MAX (input/output) - REDUCE_MIN (input/output) - REDUCE_PRODUCT (input/output) - REDUCE_SUM (input/output) - CONCAT (each input in input1/output) - PAD (input1/output) - REVERSE (input1/output) - SLICE (input1/output) - TILE (input1/output) - TRANSPOSE (input1/output) In addition to this change, PAD has been updated to allow unranked tensors for input1/output, in line with other operations.	2025-03-17 10:22:52 +00:00
Matthias Springer	6c867e27a7	[mlir] Use `getSingleElement`/`hasSingleElement` in various places (#131460 ) This is a code cleanup. Update a few places in MLIR that should use `hasSingleElement`/`getSingleElement`. Note: `hasSingleElement` is faster than `.getSize() == 1` when it is used with linked lists etc. Depends on #131508.	2025-03-17 07:43:18 +01:00
Ivan Butygin	7c98cddc5a	[mlir] Expose `AffineExpr.shift_dims/shift_symbols` through C and Python bindings (#131521 )	2025-03-16 19:57:56 +03:00
Matthias Springer	6c2f8476e7	[mlir][Transforms] Dialect Conversion: Add 1:N support to `remapInput` (#131454 ) This commit adds 1:N support to `SignatureConversion::remapInputs`. This API allows users to replace a block argument with multiple replacement values. (And the block argument is dropped.) The API already supported "bbarg --> multiple bbargs" mappings, but "bbarg --> multiple SSA values" was missing. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>	2025-03-15 18:33:06 +01:00
Bruno Cardoso Lopes	5265412c13	[MLIR][LLVMIR] Import: add flag to prefer using unregistered intrinsics (#130685 ) Currently, there is no common mechanism for supported intrinsics to be generically annotated with arg and ret attributes. Since there are many supported intrinsics around different dialects, the amount of work to teach all of them about these attributes is not trivial (though it would be nice in the long term). This PR adds a new flag `-prefer-unregistered-intrinsics` that can be used alongside `--import-llvm` to always use `llvm.intrinsic_call` during import time (ignoring dialect hooks for custom intrinsic support). Using this flag allows us to roundtrip the LLVM IR while eliminating a whole set of differences coming from lack of arg/ret attributes on supported intrinsics. Note `convertIntrinsic` has to be moved to an implementation file because it queries into `moduleImport` state, which is a fwd declaration in `LLVMImportInterface.h`	2025-03-14 18:04:32 -07:00
Bruno Cardoso Lopes	29a000023c	[MLIR][LLVMIR] Add module flags support (#130679 ) Import and translation support. Note that existing support (prior to this PR) already covers enough in translation specifically to emit "Debug Info Version". Also, the debug info version metadata is being emitted even though the imported IR has no information and is showing up in some tests (will fix that in another PR). --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com> Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz>	2025-03-14 18:03:36 -07:00
Luke Hutton	955c02dc9c	[mlir][tosa] Check for compile time constants in the validation pass (#131123 ) This commit adds a concept of the 'dynamic' extension in the Dialect and checks that compile time constant (CTC) operands for each operator are constant if the dynamic extension is not loaded. Operands labeled as CTC in the specification that are of tosa.shape (shape_t in the specification) type are not checked as they are always expected to be constant. This requirement is checked elsewhere in the dialect. Signed-off-by: Luke Hutton <luke.hutton@arm.com>	2025-03-14 12:45:01 -07:00
MaheshRavishankar	2490f7f076	[mlir][Linalg] Allow expand shape propagation across linalg ops with dynamic shapes. (#127943 ) With `tensor.expand_shape` allowing expanding dynamic dimension into multiple dynamic dimension, adapt the reshape propagation through expansion to handle cases where one dynamic dimension is expanded into multiple dynamic dimension. --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-03-14 12:42:42 -07:00
Charitha Saumya	fd24805c8e	Reapply [mlir][xegpu] Add XeGPU subgroup map propagation analysis for XeGPU SIMT distribution. (#131380 ) Originally introduced in #130240 and reverted in #131364 Reproduced the issue locally in Linux by doing a shared lib build. Fixes including adding the missing LINK_LIBS. Original commit message: This PR adds the SG map propagation step of the XeGPU SIMT distribution. SG map propagation is a sparse backward dataflow analysis that propagate the sg_map backward starting from the operands of certain operations (DPAS, store etc.). This is the first step of XeGPU subgroup distribution. This analysis result is used to attach layout information to each XeGPU SIMD subgroup op. The lowering patterns in XeGPUSubgroupDistribute will consume these layout info to distribute SIMD ops into SIMT ops that work on work-item level data fragments. Summary of Lowering XeGPU SIMD -> SIMT Subgroup map propagation (This PR) Attach sg_map to each op in move all ops inside gpu.warp_execute_on_lane0 region. Distribute each op using sg_map Additional legalization steps to align more with Xe HW.	2025-03-14 12:38:36 -07:00
Charitha Saumya	3fcd921aa4	Revert "[mlir][xegpu] Add XeGPU subgroup map propagation analysis for XeGPU SIMT distribution." (#131364 ) Reverts llvm/llvm-project#130240	2025-03-14 10:36:58 -07:00
Charitha Saumya	5eb557774d	[mlir][xegpu] Add XeGPU subgroup map propagation analysis for XeGPU SIMT distribution. (#130240 ) This PR adds the SG map propagation step of the XeGPU SIMT distribution. SG map propagation is a sparse backward dataflow analysis that propagate the sg_map backward starting from the operands of certain operations (DPAS, store etc.). This is the first step of XeGPU subgroup distribution. This analysis result is used to attach layout information to each XeGPU SIMD subgroup op. The lowering patterns in XeGPUSubgroupDistribute will consume these layout info to distribute SIMD ops into SIMT ops that work on work-item level data fragments. ### Summary of Lowering XeGPU SIMD -> SIMT 1. Subgroup map propagation (This PR) 2. Attach `sg_map` to each op in move all ops inside `gpu.warp_execute_on_lane0` region. 3. Distribute each op using `sg_map` 4. Additional legalization steps to align more with Xe HW.	2025-03-14 10:21:22 -07:00
mihailo-stojanovic	fc8b2bf2f8	[MLIR][LLVM] Import dereferenceable metadata from LLVM IR (#130974 ) Add support for importing `dereferenceable` and `dereferenceable_or_null` metadata into LLVM dialect. Add a new attribute which models these two metadata nodes and a new OpInterface.	2025-03-14 09:30:47 +01:00
Kai Sasaki	befa037c13	[mlir][affine] Guard invalid dim attribute in the test-reify-bound pass (#129013 ) Computing the bound of affine op (ValueBoundsConstraintSet::computeBound) crashes due to the invalid dim value given to the op. It is necessary for the pass to check the dim attribute not to be greater than the rank of the input type. Fixes https://github.com/llvm/llvm-project/issues/128807	2025-03-14 08:09:01 +09:00
Luke Hutton	1c45514748	[mlir][tosa] Fix bug causing quantized pad const creation crash (#131125 ) This commit ensures the storage type is retrieved correctly which fixes a crash when creating a quantized pad const tensor. Testing is completed via the `tosa-optional-decompositions` pass which makes use of the `createPadConstTensor` function. Also includes some cleanup.	2025-03-13 13:17:47 -07:00
Daniel Hernandez-Juarez	64f67f870d	[mlir][AMDGPU] Enable emulating vector buffer_atomic_fadd for bf16 on gfx942 (#129029 ) - Change to make sure architectures < gfx950 emulate bf16 buffer_atomic_fadd - Add tests for bf16 buffer_atomic_fadd and architectures: gfx12, gfx942 and gfx950 --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2025-03-13 14:30:45 -05:00
Nirvedh Meshram	ca4399036f	[mlir][linalg] Add FoldReshapeWithGenericOpByCollapsing pattern (#131029 ) This pattern to bubble up collapse shapes was missing in `populateFoldReshapeOpsByCollapsingPatterns` . Signed-off-by: Nirvedh Meshram <nirvedh@gmail.com>	2025-03-13 14:22:50 -05:00
Adam Siemieniuk	a16c225b40	[mlir][xegpu] Convert Vector contraction to XeGPU (#122115 ) Adds pattern to lower vector.contract to XeGPU operation.	2025-03-13 19:41:53 +01:00
Artemiy Bulavin	0aa5ba43a0	[mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled (#128566 ) Currently, `DistinctAttr` uses an allocator wrapped in a `ThreadLocalCache` to manage attribute storage allocations. This ensures all allocations are freed when the allocator is destroyed. However, this setup can cause use-after-free errors when `mlir::PassManager` runs its passes on a separate thread as a result of crash reproduction being enabled. Distinct attribute storages are created in the child thread's local storage and freed once the thread joins. Attempting to access these attributes after this can result in segmentation faults, such as during printing or alias analysis. Example: This invocation of `mlir-opt` demonstrates the segfault issue due to distinct attributes being created in a child thread and their storage being freed once the thread joins: ``` mlir-opt --mlir-pass-pipeline-crash-reproducer=. --test-distinct-attrs mlir/test/IR/test-builtin-distinct-attrs.mlir ``` This pull request changes the distinct attribute allocator to use different allocators depending on whether or not threading is enabled and whether or not the pass manager is running its passes in a separate thread. If multithreading is disabled, a non thread-local allocator is used. If threading remains enabled and the pass manager invokes its pass pipelines in a child thread, then a non-thread local but synchronised allocator is used. This ensures that the lifetime of allocated storage persists beyond the lifetime of the child thread. I have added two tests for the `-test-distinct-attrs` pass and the `-enable-debug-info-on-llvm-scope` passes that run them with crash reproduction enabled.	2025-03-13 15:00:39 +01:00
Ivan Butygin	02fae68a45	[mlir][vector] VectorLinearize: `ub.poison` support (#128612 ) Unify `arith.constant` and `ub.poison` using `OpTraitConversionPattern<OpTrait::ConstantLike>`.	2025-03-13 14:18:21 +03:00
Ivan Butygin	786e70ff10	[mlir] Change `TypeOrValueSemanticsContainer` base from `TypeConstraint` to `Type` (#129433 ) `Type` is derived from `TypeConstraint`. Using `Type` as base allows to use `SignlessIntegerLike` and friends in `Variadic<>`.	2025-03-13 14:17:42 +03:00
Uday Bondhugula	55b806c2af	[MLIR][Affine] Fix affine data copy generation copy placement for missing memref definition check (#130750 ) This was exposed with the test case previously added but when performing generation with limited memory capacity.	2025-03-13 16:08:53 +05:30
Pradeep Kumar	21cef8aa1c	[MLIR][NVVM] Add support for tcgen05.{ld, st} (#130728 ) This commit adds support for tcgen05.{ld, st} to the NVVM Dialect with tests under tcgen05-ld.mlir and tcgen05-st.mlir respectively	2025-03-13 12:07:04 +05:30
Christopher Bate	3438dfc7ff	[mlir][tensor] Fix bufferization interface for 'tensor.reshape' (#128590 ) Previously, the BufferizableOpInterface implementation for 'tensor.reshape' listed the 'shape' operand as an alias for the result tensor, causing unnecessary conflicts with ops that "write" to the shape operand.	2025-03-12 22:19:01 -06:00
Javed Absar	ecf4d995f6	[mlir][linalg][elementwise] Fold transpose into new elementwise (#130207 ) Fold transpose into new elementwise Op which has affine-map attached. Will add broadcast folding in next diff.	2025-03-12 23:04:44 +00:00
Luke Hutton	c44c905174	[mlir][tosa] Add error if verification to pooling operators (#130052 ) This commit adds the following checks to avg_pool2d and max_pool2d TOSA operations: - check kernel values are >= 1 - check stride values are >= 1 - check padding values are >= 0 - check padding values are less than kernel sizes - check output shape matches the expected output shape Signed-off-by: Luke Hutton <luke.hutton@arm.com>	2025-03-12 10:48:33 -07:00
Artemiy Bulavin	fc127ff53d	[mlir] Extract RHS rows once when lowering vector.contract to dot (#130130 ) The `vector.contract` op on two matrices A and B will be lowered to individual dot products of each row and column of A and B respectively. The existing lowering will extract each column of B for each row of A, which leads to multiple values in the IR representing the same columns of B. This PR makes changes to the `ContractOpToDotLowering` to make sure that the columns of B are only ever extracted once, so then the SSA values representing the extracted columns are then re-used in the IR for later dot products. I have updated the existing vector-contract-to-dot-transforms test.	2025-03-12 17:16:49 +00:00
MaheshRavishankar	665299eb3e	[mlir][Transforms] Add a utility method to move value definitions. (#130874 ) `205c5325b3` added a transform utility that moved all SSA dependences of an operation before an insertion point. Similar to that, this PR adds a transform utility function, `moveValueDefinitions` to move the slice of operations that define all values in a `ValueRange` before the insertion point. While very similar to `moveOperationDependencies`, this method differs in a few ways 1. When computing the backward slice since the start of the slice is value, the slice computed needs to be inclusive. 2. The combined backward slice needs to be sorted topologically before moving them to avoid SSA use-def violations while moving individual ops. The PR also adds a new transform op to test this new utility function. --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-03-12 08:30:43 -07:00
Matthias Springer	418e07b7e6	[mlir][Tensor] Check for out-of-bounds slice in `insert/extract_slice` verifier (#130487 ) Also fix test cases that had invalid ops.	2025-03-12 08:34:21 +01:00
Oleksandr "Alex" Zinenko	6981f7e92a	[mlir] account for explicit affine.parallel in parallelization (#130812 ) Affine parallelization should take explicitly parallel loops into account when computing loop depth for dependency analysis purposes. This was previously not the case, potentially leading to loops incorrectly being marked as parallel due to depth mismatch.	2025-03-11 20:53:50 -05:00
Evan Liu	634e25319e	[mlir] Add special case for 0-D tensor when fusing expand from collapse (#130838 ) One fusion pattern for collapse_shape -> expand_shape was added in `a95ad2da36`, however if the intermediate tensor between a collapse and expand is a 0-D tensor, then the `reassociation_map` for these two are special cases and can't be generally fused in this function `BubbleUpExpandThroughParallelCollapse`.	2025-03-11 15:55:55 -07:00
Fabian Mora	8c97ddff53	[mlir][DataLayout] Add a default memory space entry to the data layout. (#127416 ) This patch adds a default memory space attribute to the DL and adds methods to query the attribute. This is required as MLIR has no well defined default memory space unlike LLVM which has 0. While `nullptr` is a candidate for default memory space, the `ptr` dialect will remove the possibility for `nullptr` memory spaces to avoid undefined semantics. This patch also modifies the `DataLayoutTypeInterface::areCompatible` to include the new DL spec and merged entries, as it is needed to query the default memory space. --------- Co-authored-by: Christian Ulmann <christianulmann@gmail.com>	2025-03-11 17:39:20 -04:00
Krzysztof Drewniak	f3e55944a9	[mlir][ODS] Switch declarative rewrite rules to properties structs (#124876 ) Now that we have collective builders that take `const [RelevantOp]::Properties &` arguments, we don't need to serialize all the attributes that'll be set during an output pattern into a dictionary attribute. Similarly, we can use the properties struct to get the attributes instead of needing to go through the big if statement in getAttrOfType<>(). This also enables us to have declarative rewrite rules that match non-attribute properties in a future PR. This commit also adds a basic test for the generated matchers since there didn't seem to already be one.	2025-03-11 10:21:33 -05:00
Tai Ly	913d077f29	[mlir][tosa] Change Rescale zero points to be inputs (#130340 ) Update RescaleOp to use zero-point as operands instead of attributes. Check input_zp data type against the input and output_zp data type against the output. Signed-off-by: Peng Sun <peng.sun@arm.com> Co-authored-by: Peng Sun <peng.sun@arm.com>	2025-03-11 13:19:49 +00:00
Nirvedh Meshram	849abd8c05	[mlir][linalg] Add transpose support for reshape as consumer fusion (#130344 ) While adding reshape as consumer fusion in https://github.com/llvm/llvm-project/pull/129128, handling of linalg.transpose was missed. This PR adds that. Also, the transpose reshape as producer fusion test is updated to static sizes, as that is more likely to catch any issues with the permutation vector in the verifier if the shapes don't match up. --------- Signed-off-by: Nirvedh Meshram <nirvedh@gmail.com>	2025-03-11 07:48:38 -05:00
Matthias Springer	1b455df780	[mlir][memref] Add runtime verification for `memref.copy` (#130437 ) Implement runtime op verification for `memref.copy`. Only ranked memrefs are verified at the moment.	2025-03-11 13:20:48 +01:00
Uday Bondhugula	4d17ae7776	[MLIR][Affine] Fix affine-loop-tile zero cache size corner case crash (#130526 ) Fixes: https://github.com/llvm/llvm-project/issues/64979	2025-03-11 16:28:22 +05:30
MaheshRavishankar	205c5325b3	[mlir] Add a utility method to move operation dependencies. (#129975 ) The added utility method moves all SSA values that an operation depends upon before an insertion point. This is useful during transformations where such movements might make transformations (like fusion) more powerful. To test the operation add a transform dialect op that calls the move operation. To be able to capture the `notifyMatchFailure` messages from the transformation and to report/check these in the test modify the `ErrorCheckingTrackingListener` to capture the last match failure notification. --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-03-10 20:23:08 -07:00
Nikhil Kalra	b15ccd436a	[mlir] Better Python diagnostics (#128581 ) Updated the Python diagnostics handler to emit notes (in addition to errors) into the output stream so that users have more context as to where in the IR the error is occurring.	2025-03-10 15:59:47 -07:00
TatWai Chong	3fb8cb6740	[mlir][tosa] Add support for EXT-DOUBLEROUND and EXT-INEXACTROUND (#130337 )	2025-03-10 19:32:30 +00:00
Tai Ly	483c23f6e8	[mlir][tosa] Switch zero point of negate to input variable type (#129758 ) This commit changes the zero point attribute to an input to align with the 1.0 spec. Signed-off-by: Tai Ly <tai.ly@arm.com> Co-authored-by: Luke Hutton <luke.hutton@arm.com>	2025-03-10 18:34:16 +00:00
Thomas Preud'homme	967ab7e08e	[mlir][TOSA] Fix linalg lowering of depthwise conv2d (#130293 ) Current lowering for tosa.depthwise_conv2d assumes if both zero points are zero then it's a floating-point operation by hardcoding the use of a arith.addf in the lowered code. Fix code to check for the element type to decide what add operation to use.	2025-03-10 14:49:05 +00:00
Krzysztof Parzyszek	d67947162f	[flang][OpenMP] Implement HAS_DEVICE_ADDR clause (#128568 ) The HAS_DEVICE_ADDR indicates that the object(s) listed exists at an address that is a valid device address. Specifically, `has_device_addr(x)` means that (in C/C++ terms) `&x` is a device address. When entering a target region, `x` does not need to be allocated on the device, or have its contents copied over (in the absence of additional mapping clauses). Passing its address verbatim to the region for use is sufficient, and is the intended goal of the clause. Some Fortran objects use descriptors in their in-memory representation. If `x` had a descriptor, both the descriptor and the contents of `x` would be located in the device memory. However, the descriptors are managed by the compiler, and can be regenerated at various points as needed. The address of the effective descriptor may change, hence it's not safe to pass the address of the descriptor to the target region. Instead, the descriptor itself is always copied, but for objects like `x`, no further mapping takes place (as this keeps the storage pointer in the descriptor unchanged). --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>	2025-03-10 08:11:01 -05:00
Artemiy Bulavin	f3dcc0fe22	[mlir] Refactor ConvertVectorToLLVMPass options (#128219 ) The `VectorTransformsOptions` on the `ConvertVectorToLLVMPass` is currently represented as a struct, which makes it not serialisable. This means a pass pipeline that contains this pass cannot be represented as textual form, which breaks reproducer generation and options such as `--dump-pass-pipeline`. This PR expands the `VectorTransformsOptions` struct into the two options that are actually used by the Pass' patterns: `vector-contract-lowering` and `vector-transpose-lowering` . The other options present in VectorTransformOptions are not used by any patterns in this pass. Additionally, I have changed some interfaces to only take these specific options over the full options struct as, again, the vector contract and transpose lowering patterns only need one of their respective options. Finally, I have added a simple lit test that just prints the pass pipeline using `--dump-pass-pipeline` to ensure the options on this pass remain serialisable. Fixes #129046	2025-03-10 10:32:03 +00:00

12744 Commits