- Add packed conversions fp8/bf8->bf16 for gfx950 and fp8/bf8->fp32 for
gfx942 in ROCDL dialect
- Update the amdgpu.ext_packed_fp8 lowering to use ROCDL packed fp8/bf8->f32
conversions for vector target types and ROCDL scalar fp8/bf8->f32 conversions
for the scalar target type.
---------
Co-authored-by: Jungwook Park <jungwook.park@amd.com>
When inlining a `callee` with a call site debug location, the inlining
infrastructure was trivially combining the `callee` and the `caller`
locations, forming a "tree" of call stacks. Because of this, the remarks
were printing an incomplete inlining stack.
This commit handles this case and appends the `caller` location at the
end of the `callee`'s stack, extending the chain.
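A rough illustration using MLIR location aliases (file names and line numbers
are hypothetical): if the callee op already carries a call-stack location, the
caller's call site should be appended at the innermost end of that chain so the
remark sees one linear stack.
```mlir
#leaf   = loc("leaf.mlir":1:5)
#callee = loc("callee.mlir":5:3)
#caller = loc("caller.mlir":10:7)

// What the callee op carries before inlining: an existing call stack.
#callee_stack = loc(callsite(#leaf at #callee))

// After inlining, the caller's call site is appended at the end of the
// callee's chain instead of wrapping the whole stack as a sibling.
#callee_at_caller = loc(callsite(#callee at #caller))
#combined         = loc(callsite(#leaf at #callee_at_caller))
```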
This PR adds the Vector transfer_read-to-load rewrite pattern. The
pattern lowers a vector transfer_read op to a combination of
`vector.load`, `arith.select` and `vector.broadcast` if:
- The transfer op is masked.
- The memref is in the buffer address space.
- The other conditions introduced by `TransferReadToVectorLoadLowering` hold.
The motivation for this PR is the lack of masked-load support in the AMDGPU
backend: `llvm.intr.masked.load` lowers to a series of conditional scalar
loads (see the `scalarize-masked-mem-intrin` pass). This PR makes it possible
for a masked transfer_read to be lowered to a buffer load with a bounds check,
giving a more optimized global load access pattern than the existing lowering
of `llvm.intr.masked.load` on vectors (see the sketch below).
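A rough before/after sketch of the rewrite (types, values and the buffer
address-space attribute are illustrative only, not taken from the actual
tests):
```mlir
// Before: a masked transfer_read from a memref in the buffer address space.
%v = vector.transfer_read %mem[%i], %pad, %mask {in_bounds = [true]}
    : memref<64xf32, #amdgpu.address_space<fat_raw_buffer>>, vector<4xf32>

// After (conceptually): an unmasked load plus a select against the broadcast
// padding value; out-of-bounds lanes are handled by the buffer load's
// hardware bounds check.
%loaded = vector.load %mem[%i]
    : memref<64xf32, #amdgpu.address_space<fat_raw_buffer>>, vector<4xf32>
%padv   = vector.broadcast %pad : f32 to vector<4xf32>
%v2     = arith.select %mask, %loaded, %padv : vector<4xi1>, vector<4xf32>
```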
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin functions such as succ_begin for now.
* `PyRegionList` is now sliceable. The dialect bindings generator seems
to assume it is sliceable already (!), yet accessing e.g. `cases` on
`scf.IndexedSwitchOp` raises a `TypeError` at runtime.
* `PyBlockList` and `PyOperationList` support negative indexing. It is
common for containers to do that in Python, and most containers in the
MLIR Python bindings already allow the index to be negative.
Fixes a bug introduced by
https://github.com/llvm/llvm-project/pull/130078.
For non-BlockArgOpenMPOpInterface ops, we also want to map their entry
block arguments to their operands, if any. For the current support in
the OpenMP dialect, the table below lists all ops that have arguments
(SSA operands and/or attributes) and are not target-related. Of all these
ops, we only need to process `omp.atomic.update`, since it is the only op
that has both SSA operands and an attached region. Therefore, the region's
entry block arguments must be mapped to the op's operands in case they
are referenced inside the region (see the sketch after the table).
| op                | operands? | region(s)? | parent is func? | processed? |
|-------------------|-----------|------------|-----------------|------------|
| atomic.read       | yes       | no         | yes             | no         |
| atomic.write      | yes       | no         | yes             | no         |
| atomic.update     | yes       | yes        | yes             | yes        |
| critical          | no        | no         | yes             | no         |
| declare_mapper    | no        | yes        | no              | no         |
| declare_reduction | no        | yes        | no              | no         |
| flush             | yes       | no         | yes             | no         |
| private           | no        | yes        | yes             | no         |
| threadprivate     | yes       | no         | yes             | no         |
| yield             | yes       | no         | yes             | no         |
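As a minimal sketch of the one op that needs processing (the pointer type and
the update computation are arbitrary, and `%x`/`%incr` are assumed to be
defined earlier): the region's entry block argument corresponds to the op's
SSA operand and may be referenced inside the region, so it must be mapped.
```mlir
omp.atomic.update %x : !llvm.ptr {
^bb0(%old: i32):  // %old represents the current value of the operand %x.
  %new = llvm.add %old, %incr : i32
  omp.yield(%new : i32)
}
```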
This patch makes the `map_type` and `map_capture_type` arguments of the
`omp.map.info` operation required, which was already an invariant being
verified by its users via `verifyMapClause()`. This makes it clearer, as
getters no longer return misleading `std::optional` values.
Checks for the `mapper_id` argument are moved to a verifier for the
operation, rather than being checked by users.
Functionally NFC, but not marked as such due to a reordering of
arguments in the assembly format of `omp.map.info`.
This patch makes a few changes to unify the conversion process from the
'omp' to the 'llvm' dialect. The main goal of this change is to
consolidate the logic used to identify legal and illegal ops, and to
consolidate the conversion logic into a single class.
Changes introduced are the following:
- Removal of `getNumVariableOperands()` and `getVariableOperand()` extra
class declarations from OpenMP operations. These are redundant, as they
are equivalent to `mlir::Operation::getNumOperands()` and
`mlir::Operation::getOperands()`, respectively.
- Consolidation of `RegionOpConversion`,
`RegionLessOpWithVarOperandsConversion`,
`RegionOpWithVarOperandsConversion`, `RegionLessOpConversion`,
`AtomicReadOpConversion`, `MapInfoOpConversion`,
`DeclMapperOpConversion` and `MultiRegionOpConversion` into a single
`OpenMPOpConversion` class. This is possible because all of the previous
classes performed parts of the same set of operations, depending on whether
the op defined any regions, took operands, type attributes, etc.
- Update of `mlir::configureOpenMPToLLVMConversionLegality` to use a
single generic set of checks for all operations, removing the need to
list every operation manually.
- Update of `mlir::populateOpenMPToLLVMConversionPatterns` to
automatically populate the list of patterns to include all dialect operations.
This PR changes the type of the command-line arguments representing
`LayoutMapOption` from `std::string` to the enum with the same name.
This allows the values of programmatic uses of the corresponding options to
be checked at compile time.
- When the operand type of an operation changes to a profile-dependent
type, the compliance metadata must be updated. Update the compliance checks
for the following:
- CONV2D, CONV3D, DEPTHWISE_CONV2D, and TRANSPOSE_CONV2D, as zero points
have changed to variable inputs.
- PAD, because pad_const has been changed to a variable input.
- GATHER and SCATTER, as the indices operand has changed to index_t.
- Add an int16 extension check for CONCAT.
- Add a compliance check for COND_IF, WHILE_LOOP, VARIABLE,
VARIABLE_READ, and VARIABLE_WRITE.
- Correct the profile requirements for IDENTITY, TABLE, MATMUL and
LOGICAL-like operations.
- Remove unnecessary checks for non-v1.0 operations.
- Add condition requirements (anyOf and allOf) to the type mode of
metadata for modes that have multiple profile/extension considerations.
This patch introduces the `MemorySpaceAttrInterface` interface. This
interface is responsible for handling the semantics of `ptr` operations.
For example, this interface can be used to create read-only memory
spaces, making any operation other than a load a verification error;
see `TestConstMemorySpaceAttr` for a possible implementation of
this concept.
This patch also introduces the enum dependencies `AtomicOrdering` and
`AtomicBinOp`; both enumerations are clones of the enums with the same
name in the LLVM dialect.
Also, see:
- [[RFC] `ptr` dialect & modularizing ptr ops in the LLVM
dialect](https://discourse.llvm.org/t/rfc-ptr-dialect-modularizing-ptr-ops-in-the-llvm-dialect/75142)
for rationale.
- https://github.com/llvm/llvm-project/pull/73057 for a prototype
implementation of the full change.
**Note: Ignore the first commit, that's being reviewed in
https://github.com/llvm/llvm-project/pull/86860 .**
If a `target` directive is nested in a host OpenMP directive (e.g.
parallel, task, or a worksharing loop), flang currently crashes if the
target directive-related MLIR ops (e.g. `omp.map.bounds` and
`omp.map.info`) depend on SSA values defined inside the parent host
OpenMP directives/ops.
This PR tries to solve this problem by treating these parent OpenMP ops
as "SSA scopes". Whenever we are translating for the device, instead of
completely translating host ops, we just translate their MLIR ops as pure
SSA values.
Adds checks in `isPermutationVector` for indices that are out of
bounds and removes the assert.
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
This patch fixes:
mlir/lib/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.cpp:121:3:
error: default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
Add runtime verification for `memref.dim`: check that the index is in
bounds.
Also simplify the pass pipeline for all memref runtime verification
checks.
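A minimal sketch of the kind of guard this adds (the exact ops and the error
message emitted by the pass may differ):
```mlir
// Op being verified: %idx must be smaller than the rank of %m.
%d = memref.dim %m, %idx : memref<?x?xf32>

// Conceptual check inserted before it and evaluated at runtime:
%rank = arith.constant 2 : index
%ok   = arith.cmpi ult, %idx, %rank : index
cf.assert %ok, "memref.dim: index is out of bounds"
```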
Why? This option can lead to incorrect IR if used in isolation, for
example, consider the IR below:
```mlir
func.func @loop_with_aliasing(%arg0: tensor<5xf32>, %arg1: index, %arg2: index) -> tensor<5xf32> {
  %c1 = arith.constant 1 : index
  %cst = arith.constant 1.000000e+00 : f32
  %0 = tensor.empty() : tensor<5xf32>
  %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<5xf32>) -> tensor<5xf32>
  // The BufferizableOpInterface says that %2 must alias with %arg0 or be a
  // newly allocated buffer.
  %2 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (tensor<5xf32>) {
    scf.yield %1 : tensor<5xf32>
  }
  %cst_0 = arith.constant 1.000000e+00 : f32
  %inserted = tensor.insert %cst_0 into %1[%c1] : tensor<5xf32>
  return %2 : tensor<5xf32>
}
```
If we bufferize with `enforce-aliasing-invariants=false`, we get:
```mlir
func.func @loop_with_aliasing(%arg0: memref<5xf32, strided<[?], offset: ?>>, %arg1: index, %arg2: index) -> memref<5xf32, strided<[?], offset: ?>> {
  %c1 = arith.constant 1 : index
  %cst = arith.constant 1.000000e+00 : f32
  %alloc = memref.alloc() {alignment = 64 : i64} : memref<5xf32>
  linalg.fill ins(%cst : f32) outs(%alloc : memref<5xf32>)
  %0 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (memref<5xf32, strided<[?], offset: ?>>) {
    %cast = memref.cast %alloc : memref<5xf32> to memref<5xf32, strided<[?], offset: ?>>
    scf.yield %cast : memref<5xf32, strided<[?], offset: ?>>
  }
  %cst_0 = arith.constant 1.000000e+00 : f32
  memref.store %cst_0, %alloc[%c1] : memref<5xf32>
  return %0 : memref<5xf32, strided<[?], offset: ?>>
}
```
Which is not correct IR since the loop yields the allocation.
I am using this option. What do I need to do now?
If you are using this option in isolation, you are possibly generating
incorrect IR, so you need to revisit your bufferization strategy. If you
are using it together with `copyBeforeWrite`, you simply need to retire
the `enforceAliasingInvariants` option.
Co-authored-by: Matthias Springer <mspringer@nvidia.com>
This was lacking a bitcast from the shifted integer type into a float.
Non-struct types other than integers and floats will still not be
Mem2Reg'ed.
Also adds special handling for constants so that they are emitted directly
as constants rather than relying on follow-up canonicalization patterns
(a `memset` of zero is a case that can appear in the wild).
In corner cases, the vectorization pass may attempt to lower a conv2d
op and hit an assertion in a completely unrelated location in the
`vectorizeConvolution()` subroutine.
~~This PR rejects the conv2d op early and makes the asserting routine
return failure as a defensive workaround.~~
In addressing this, the PR moves all condition checks out of the
`Conv1dGenerator` and into the `convOpPreconditionCheck()` function. This
makes unsupported ops such as conv2d be rejected early and leaves a
cleaner `Conv1dGenerator` constructor.
This commit updates the following operations (operands/results) to be of
at least rank 1 such that it aligns with the expectations of the
specification:
- ARGMAX (input)
- REDUCE_ALL (input/output)
- REDUCE_ANY (input/output)
- REDUCE_MAX (input/output)
- REDUCE_MIN (input/output)
- REDUCE_PRODUCT (input/output)
- REDUCE_SUM (input/output)
- CONCAT (each input in input1/output)
- PAD (input1/output)
- REVERSE (input1/output)
- SLICE (input1/output)
- TILE (input1/output)
- TRANSPOSE (input1/output)
In addition to this change, PAD has been updated to allow unranked
tensors for input1/output, in line with other operations.
This is a code cleanup. Update a few places in MLIR that should use
`hasSingleElement`/`getSingleElement`.
Note: `hasSingleElement` is faster than `.getSize() == 1` when it is
used with linked lists etc.
Depends on #131508.
This PR refactors `alignedConversionPrecondition` from
VectorEmulateNarrowType.cpp and adds new helper hooks.
**Update `alignedConversionPrecondition` (1)**
This method doesn't require the vector type for the "container" argument. The
underlying element type is sufficient. The corresponding argument has been
renamed to `containerTy`; it is meant to be the multi-byte container element
type (`i8`, `i16`, `i32`, etc.). With this change, the updated invocations of
`alignedConversionPrecondition` (e.g. in `RewriteAlignedSubByteIntExt`) make it
clear that the container element type is assumed to be `i8`.
**Update `alignedConversionPrecondition` (2)**
The final check in `alignedConversionPrecondition` has been replaced with a new
helper method, `isSubByteVecFittable`. This helper hook is now also re-used in
`ConvertVectorTransferRead` (to improve code re-use); see the sketch below.
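For intuition about the "fitting" condition only (this is not the rewrite the
pass performs): a sub-byte vector fits an `i8` container element when its
total bit width is a multiple of 8, which is what makes a plain bitcast such
as the one below legal.
```mlir
// 4 x i2 = 8 bits, so the vector fits exactly into one i8 element.
%packed = vector.bitcast %v : vector<4xi2> to vector<1xi8>
```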
**Other updates**
Extended + unified comments.
**Implements**: https://github.com/llvm/llvm-project/issues/123630
This commit adds 1:N support to `SignatureConversion::remapInputs`. This
API allows users to replace a block argument with multiple replacement
values. (And the block argument is dropped.) The API already supported
"bbarg --> multiple bbargs" mappings, but "bbarg --> multiple SSA
values" was missing.
---------
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
Currently, there is no common mechanism for supported intrinsics to be
generically annotated with arg and ret attributes. Since there are many
supported intrinsics across different dialects, the amount of work to
teach them all about these attributes is not trivial (though it would be
nice in the long term).
This PR adds a new flag `-prefer-unregistered-intrinsics` that can be
used alongside `--import-llvm` to always use `llvm.intrinsic_call`
during import time (ignoring dialect hooks for custom intrinsic
support).
Using this flag allows us to round-trip the LLVM IR while eliminating a
whole set of differences coming from the lack of arg/ret attributes on
supported intrinsics.
Note that `convertIntrinsic` has to be moved to an implementation file
because it queries `moduleImport` state, which is only a forward declaration
in `LLVMImportInterface.h`.
Import and translation support.
Note that existing support (prior to this PR) already covers enough in
translation specifically to emit "Debug Info Version". Also, the debug
info version metadata is emitted even though the imported IR has no debug
information, and it shows up in some tests (that will be fixed in another
PR).
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz>
This patch fixes:
mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp:54:3:
error: definition of implicit copy assignment operator for 'Layout'
is deprecated because it has a user-declared copy constructor
[-Werror,-Wdeprecated-copy]
mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp:103:3:
error: definition of implicit copy assignment operator for 'SGMap'
is deprecated because it has a user-declared copy constructor
[-Werror,-Wdeprecated-copy]
This commit adds the concept of the 'dynamic' extension to the dialect and
checks that compile-time constant (CTC) operands for each operator are
constant if the dynamic extension is not loaded.
Operands labeled as CTC in the specification that are of tosa.shape
(shape_t in the specification) type are not checked as they are always
expected to be constant. This requirement is checked elsewhere in the
dialect.
Signed-off-by: Luke Hutton <luke.hutton@arm.com>
With `tensor.expand_shape` allowing the expansion of a dynamic dimension
into multiple dynamic dimensions, adapt the reshape propagation through
expansion to handle cases where one dynamic dimension is expanded into
multiple dynamic dimensions, as shown below.
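For example (shapes and value names hypothetical), a single dynamic dimension
can now be expanded into two dynamic dimensions, with the result sizes passed
as SSA values:
```mlir
%expanded = tensor.expand_shape %t [[0, 1]] output_shape [%d0, %d1]
    : tensor<?xf32> into tensor<?x?xf32>
```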
---------
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Originally introduced in #130240 and reverted in #131364
Reproduced the issue locally on Linux by doing a shared-library build. The
fixes include adding the missing LINK_LIBS.
**Original commit message:**
This PR adds the SG map propagation step of the XeGPU SIMT distribution.
SG map propagation is a sparse backward dataflow analysis that propagates
the sg_map backward, starting from the operands of certain operations
(DPAS, store, etc.).
This is the first step of XeGPU subgroup distribution. This analysis
result is used to attach layout information to each XeGPU SIMD subgroup
op. The lowering patterns in XeGPUSubgroupDistribute will consume this
layout information to distribute SIMD ops into SIMT ops that work on
work-item-level data fragments.
### Summary of Lowering XeGPU SIMD -> SIMT
1. Subgroup map propagation (This PR)
2. Attach `sg_map` to each op and move all ops inside a
`gpu.warp_execute_on_lane0` region.
3. Distribute each op using `sg_map`
4. Additional legalization steps to align more with Xe HW.
The patch abstracts the sending and receiving of JSON messages in
`JSONTransport` to allow custom implementations. For example, one
concrete implementation can use pipes without needing to convert a file
descriptor to a `FILE` object.
This patch avoids calling `TargetOp::getInnermostCapturedOmpOp` multiple
times during initialization of default and runtime target attributes in
MLIR to LLVM IR translation of `omp.target` operations. This is a
potentially expensive operation, so this change should help keep compile
times lower.
Add support for importing `dereferenceable` and `dereferenceable_or_null` metadata into the LLVM dialect. Add a new attribute that models these two metadata nodes, and a new OpInterface.