16898 Commits

James Newling
409def2867
[mlir][vector] shape_cast(broadcast) -> broadcast canonicalization (#134939)
Add additional cases of this canonicalization by using the 'source of
truth' function `isBroadcastableTo` to check when it is possible to
broadcast directly to the shape resulting from the shape_cast.
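
A minimal sketch of the kind of rewrite this enables (shapes are
illustrative, not taken from the patch's tests):

```
// Before: broadcast, then shape_cast to a different shape.
%b = vector.broadcast %arg0 : vector<3xf32> to vector<1x2x3xf32>
%r = vector.shape_cast %b : vector<1x2x3xf32> to vector<2x3xf32>

// After canonicalization: broadcast directly to the shape_cast result type,
// since isBroadcastableTo reports vector<3xf32> -> vector<2x3xf32> as valid.
%r = vector.broadcast %arg0 : vector<3xf32> to vector<2x3xf32>
```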

---------

Signed-off-by: James Newling <james.newling@gmail.com>
2025-04-11 15:15:03 +01:00
James Newling
cd85f5dbdf
[mlir] canonicalizer: shape_cast(poison) -> poison (#133988)
Based on the ShapeCastConstantFolder, this pattern replaces

```
%0 = ub.poison : vector<2x3xf32>
%1 = vector.shape_cast %0 : vector<2x3xf32> to vector<6xf32>
```

with

```
%1 = ub.poison : vector<6xf32>
```

---------

Signed-off-by: James Newling <james.newling@gmail.com>
2025-04-11 15:13:03 +01:00
Vivek Khandelwal
e377a5d168
[MLIR][Tensor] Remove tensor.dim canonicalization patterns registered on tensor.expand_shape/tensor.collapse_shape (#134219)
These are problematic because the iterative application that locally
resolves the tensor.dim operation introduces an intermediate floor_div,
which loses the information about the exact division that was carried
out in the original IR, and the iterative algorithm cannot converge
towards the simplest form.
Information loss is not acceptable for canonicalization.

Resolving the tensor.dim op can instead be achieved through the
resolve-ranked-shaped-type-result-dims and
resolve-shaped-type-result-dims passes.
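
For illustration, IR of the kind the removed patterns used to rewrite
locally (types are hypothetical); such tensor.dim ops are now left to the
passes listed above:

```
%collapsed = tensor.collapse_shape %t [[0, 1]] : tensor<?x6xf32> into tensor<?xf32>
%c0 = arith.constant 0 : index
// Previously canonicalized in place; now resolved by
// resolve-ranked-shaped-type-result-dims / resolve-shaped-type-result-dims.
%dim = tensor.dim %collapsed, %c0 : tensor<?xf32>
```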

---------

Signed-off-by: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2025-04-11 06:57:34 -04:00
Jean-Didier PAILLEUX
aeb06c6152
[MLIR] Adding 'inline_hint' attribute on LLVM::CallOp (#134582)
Adds `inlinehint` attributes for CallOps in MLIR so that a call site can
indicate that inlining is desirable without having the attribute on the
FuncOp.
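
A hypothetical call site carrying the hint. This assumes the attribute
surfaces as a unit attribute in the call's attribute dictionary; the exact
assembly syntax may differ:

```
llvm.func @callee()

llvm.func @caller() {
  // Marks this particular call as desirable to inline, independently of
  // any attribute on @callee itself.
  llvm.call @callee() {inline_hint} : () -> ()
  llvm.return
}
```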
2025-04-11 09:31:18 +02:00
Bangtian Liu
9466cbdf29
[mlir][CAPI][python] expose the python bindings for linalg::isaConvolutionOpInterface and linalg::inferConvolutionDims (#135253)
This PR is mainly about exposing the python bindings for
`linalg::isaConvolutionOpInterface` and `linalg::inferConvolutionDims`.

---------

Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
2025-04-10 20:22:15 -04:00
Luke Hutton
b39ab7a620
[mlir][tosa] Add error_if checks to clamp op verifier (#134224)
Specifically, it introduces checks for the following (see the sketch after the list):
- ERROR_IF(max_val < min_val)
- ERROR_IF(isNaN(min_val) || isNaN(max_val))
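
A sketch of IR the new checks should reject (attribute spelling follows
the min_val/max_val names above; exact assembly syntax may differ):

```
// Invalid: max_val (0.0) is smaller than min_val (1.0).
%0 = tosa.clamp %arg0 {min_val = 1.0 : f32, max_val = 0.0 : f32} : (tensor<4xf32>) -> tensor<4xf32>
```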

Signed-off-by: Luke Hutton <luke.hutton@arm.com>
2025-04-10 16:03:42 -07:00
Maksim Levental
1cec5fffd8
[mlir] implement -verify-diagnostics=only-expected (#135131)
This PR implements `verify-diagnostics=only-expected`, which is a "best
effort" verification, i.e., `unexpected`s and `near-misses` will not be
considered failures. The purpose is to enable narrowly scoped checking
of verification remarks (just as we have for lit, where only a subset of
lines get `CHECK`ed).
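
A sketch of the intended usage; the RUN line, the pass name, and the
remark text are illustrative rather than taken from the patch:

```
// RUN: mlir-opt %s --some-remark-emitting-pass --verify-diagnostics=only-expected

// With only-expected, only this directive has to match; unexpected
// diagnostics and near-misses elsewhere in the file are not failures.
// expected-remark @below {{remark of interest}}
func.func @foo() {
  return
}
```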
2025-04-10 18:50:00 -04:00
darkbuck
9188288581
[mlir][DataLayout] Keep consistent input/output order (#135185)
- Use 'MapVector' instead of 'DenseMap' to keep a consistent order when
importing/printing entries to prevent run-by-run differences.
2025-04-10 15:55:37 -04:00
Matthias Springer
53ae2bdceb
[mlir][NVVM] Remove commented out code (#135144)
This addresses a comment on #135051.
2025-04-10 21:51:32 +02:00
Guray Ozen
d7cb24e10d
[MLIR][NVVM] Run clang-tidy (#135006) 2025-04-10 21:12:14 +02:00
Alan Li
959b8aaeac
[MLIR][NFC] Expose computeProduct function. (#135192)
Make it non-static, as its functionality is quite generic.
2025-04-10 08:29:32 -07:00
Matthias Springer
85742f7642
[mlir][LLVM] Delete getFixedVectorType and getScalableVectorType (#135051)
The LLVM dialect no longer has its own vector types. It uses
`mlir::VectorType` everywhere. Remove
`LLVM::getFixedVectorType/getScalableVectorType` and use
`VectorType::get` instead. This commit addresses a
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
on the PR that deleted the LLVM vector types.
2025-04-10 10:36:21 +02:00
Dominik Adamski
adfc577895
[OpenMP][CodeExtractor]Add align metadata to load instructions (#131131)
Moving code to another function can lead to missed optimization
opportunities, because function passes operate on smaller chunks of
code, and they cannot figure out all details.

One example of missed optimization opportunities after code extraction
is information about pointer alignment. The instruction combine pass
adds information about pointer alignment to LLVM intrinsic memcpy calls
if it can deduce it from the code or if align metadata is added. If this
information is not present, then further optimization passes can
generate inefficient code.

If we add align metadata to extracted pointers, then the instruction
combine pass can add the align attribute to the LLVM intrinsic memcpy
call and unblock further optimization.

Scope of changes:
1. Analyze MLIR map operations. Add information about the alignment of
objects that are passed by reference to OpenMP GPU kernels.
2. Propagate alignment information to the helper functions outlined by
`CodeExtractor`.
2025-04-10 09:45:30 +02:00
Bangtian Liu
c359f7625f
[mlir][CAPI][python] expose the python bindings for linalg::isaContractionOpInterface and linalg::inferContractionDims (#134935)
This PR is mainly about exposing the python bindings for
`linalg::isaContractionOpInterface` and `linalg::inferContractionDims`.

---------

Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
2025-04-09 20:01:38 -04:00
Maksim Levental
9b50167ed9
[mlir][python] add use_name_loc_as_prefix to value.get_name() (#135052)
Add `use_name_loc_as_prefix` to `value.get_name()`.
2025-04-09 19:28:59 -04:00
Nachi G
2f7e685e3d
[MLIR] Ensure deterministic parallel verification (#134963)
`failableParallelForEach` will non-deterministically early terminate
upon failure, leading to inconsistent and potentially missing
diagnostics.

This PR uses `parallelForEach` to ensure all operations are verified and
all diagnostics are handled, while tracking the failure state
separately.

Other potential fixes include:
- Making `failableParallelForEach` have deterministic early-exit
  behavior (or have an option for it)
  - I didn't want to change more than what was required (and potentially
    incur perf hits for unrelated code), but if this is a better fix I'm
    happy to submit a patch.
  - I think all diagnostics that can be detected from verification
    failures should be reported, so I don't even think this would be
    correct behavior anyway.
- Adding an option for `failableParallelForEach` to still execute on
  every element of the range while still returning `LogicalResult`
2025-04-09 15:43:26 -07:00
Matthias Springer
a0d449016b
[mlir][LLVM] Delete getVectorElementType (#134981)
The LLVM dialect no longer has its own vector types. It uses
`mlir::VectorType` everywhere. Remove `LLVM::getVectorElementType` and
use `cast<VectorType>(ty).getElementType()` instead. This commit
addresses a
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
on the PR that deleted the LLVM vector types.

Also improve vector type constraints by specifying the
`mlir::VectorType` C++ class, so that explicit casts to `VectorType` can
be avoided in some places.
2025-04-09 21:35:32 +02:00
Adam Siemieniuk
0c2a6f2d62
[mlir][x86vector] Simplify intrinsic generation (#133692)
Replaces separate x86vector named intrinsic operations with direct calls
to LLVM intrinsic functions.
    
This rework reduces the number of named ops, leaving only high-level MLIR
equivalents of whole intrinsic classes, e.g., variants of AVX512 dot on
BF16 inputs. Dialect conversion applies LLVM intrinsic name mangling,
further simplifying lowering logic.
    
The separate conversion step translating x86vector intrinsics into LLVM
IR is also eliminated. Instead, this step is now performed by the
existing llvm dialect infrastructure.

RFC:
https://discourse.llvm.org/t/rfc-simplify-x86-intrinsic-generation/85581
2025-04-09 19:59:37 +02:00
Jan Leyonberg
1aed6ad906
[MLIR][OpenMP] Enable multiple variables for target teams reductions (#134903)
This patch enables multiple reductions to be used in a reduction clause
inside target regions for GPU offloading.

---------

Co-authored-by: Sergio Afonso <safonsof@amd.com>
2025-04-09 13:01:53 -04:00
Jerry-Ge
751c3f51eb
[mlir][tosa] Update TileOp infer shape (#134732)
Update to use getConstShapeValues in TileOp's shape inference.

Signed-off-by: Tai Ly <tai.ly@arm.com>
Co-authored-by: Tai Ly <tai.ly@arm.com>
2025-04-09 09:29:29 -07:00
ivangarcia44
5083e80c14
Folding extract_strided_metadata input into reinterpret_cast (#134845)
We can always fold the input of an extract_strided_metadata operation into
the input of a reinterpret_cast operation, because they point to the same
memory. Note that the reinterpret_cast does not use the layout of its
input memref, only its base memory pointer, which is the same as the base
pointer returned by the extract_strided_metadata operation and the base
pointer of the extract_strided_metadata memref input.

Operations like expand_shape, collapse_shape, and subview are lowered to
a pair of extract_strided_metadata and reinterpret_cast like this:

```
%base_buffer, %offset, %sizes:2, %strides:2 =
  memref.extract_strided_metadata %input_memref :
  memref<ID1x...xIDNxBaseType> -> memref<f32>, index, index, index, index, index

%reinterpret_cast = memref.reinterpret_cast %base_buffer to
  offset: [%o1], sizes: [%d1,...,%dN], strides: [%s1,...,%N] :
  memref<f32> to memref<OD1x...xODNxBaseType>
```

In many cases the input of the extract_strided_metadata can be passed
directly into the input of the reinterpret_cast operation, like this
(compared with the IR above, %base_buffer is replaced by %input_memref in
the reinterpret_cast and the input type is updated):

```
%base_buffer, %offset, %sizes:2, %strides:2 =
  memref.extract_strided_metadata %input_memref :
  memref<ID1x...xIDNxBaseType> -> memref<f32>, index, index, index, index, index

%reinterpret_cast = memref.reinterpret_cast %input_memref to
  offset: [%o1], sizes: [%d1,...,%dN], strides: [%s1,...,%N] :
  memref<ID1x...xIDNxBaseType> to memref<OD1x...xODNxBaseType>
```

When dealing with static dimensions, the extract_strided_metadata will
become dead code and we end up with only a reinterpret_cast:

```
%reinterpret_cast = memref.reinterpret_cast %input_memref to
  offset: [%o1], sizes: [%d1,...,%dN], strides: [%s1,...,%N] :
  memref<ID1x...xIDNxBaseType> to memref<OD1x...xODNxBaseType>
```

Note that reinterpret_cast only reads the base memory pointer from the
input memref (%input_memref above), which is equivalent to the
%base_buffer returned by the extract_strided_metadata operation. Hence
it is always legal to use the extract_strided_metadata input memref
directly in the reinterpret_cast. Note that since this is a pointer,
this operation is legal even when the base pointer values are modified
between the operation pair.

@matthias-springer 
@joker-eph 
@sahas3
@Hanumanth04
@dixinzhou
@rafaelubalmw

---------

Co-authored-by: Ivan Garcia <igarcia@vdi-ah2ddp-178.dhcp.mathworks.com>
2025-04-09 16:50:16 +02:00
Sergio Afonso
0de48de36e
[MLIR][OpenMP] Improve loop wrapper op verifiers (#134833)
This patch revisits op verifiers for `LoopWrapperInterface` operations
to improve consistency across operations and to properly cover some
previously misreported cases.

Checks that should be done for these kinds of operations are documented
in the interface description.
2025-04-09 12:36:07 +01:00
NimishMishra
53fa92dcad
[mlir][llvm][OpenMP] Hoist __atomic_load alloca (#132888)
The current implementation of `__atomic_compare_exchange` uses an alloca
for `__atomic_load`, leading to issues like
https://github.com/llvm/llvm-project/issues/120724. This PR hoists this
alloca to `AllocaIP`.


Fixes: https://github.com/llvm/llvm-project/issues/120724
2025-04-09 03:01:44 -07:00
Chao Chen
34d586fdd5
[MLIR][XeGPU] Extend SGMapAttr and Add ConvertLayoutOp (#132425)
This PR improves the SGMapAttr to enable workgroup-level programming, representing the first step in expanding the XeGPU dialect from subgroup to workgroup level, and renames it to LayoutAttr.
2025-04-08 19:46:05 -05:00
Georgios Pinitas
9c38b2e513
[mlir][tosa] Fold PadOp to tensor operations (#132700) 2025-04-09 00:01:54 +01:00
Matthias Springer
234d30e36b
[mlir][LLVM] Delete LLVMFixedVectorType and LLVMScalableVectorType (#133286)
Since #125690, the MLIR vector type supports `!llvm.ptr` as an element
type. The only remaining element type for `LLVMFixedVectorType` is now
`LLVMPPCFP128Type`.

This commit turns `LLVMPPCFP128Type` into a proper FP type (by
implementing `FloatTypeInterface`), so that the MLIR vector type accepts
it as an element type. This makes `LLVMFixedVectorType` obsolete.
`LLVMScalableVectorType` is also obsolete. This commit deletes
`LLVMFixedVectorType` and `LLVMScalableVectorType`.

Note for LLVM integration: Use `VectorType` instead of
`LLVMFixedVectorType` and `LLVMScalableVectorType`.
2025-04-08 20:28:24 +02:00
Matthias Springer
b7b3758e88
[mlir][IR] Add VectorTypeElementInterface with !llvm.ptr (#133455)
This commit extends the MLIR vector type to support pointer-like types
such as `!llvm.ptr` and `!ptr.ptr`, as indicated by the newly added
`VectorTypeElementInterface`. This makes the LLVM dialect closer to LLVM
IR. LLVM IR already supports pointers as a vector element type.
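
For instance, a builtin vector with an LLVM pointer element type is now
expressible; a minimal illustrative function:

```
llvm.func @ptr_vector_example(%v: vector<4x!llvm.ptr>) -> vector<4x!llvm.ptr> {
  // vector<4x!llvm.ptr> is accepted as an LLVM-compatible type.
  llvm.return %v : vector<4x!llvm.ptr>
}
```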

Only integers, floats, pointers and index are valid vector element types
for now. Additional vector element types may be added in the future
after further discussions. The interface is still evolving and may
eventually turn into one of the alternatives that were discussed on the
RFC.

This commit also disallows `!llvm.ptr` as an element type of
`!llvm.vec`. This type exists due to limitations of the MLIR vector
type.

RFC:
https://discourse.llvm.org/t/rfc-allow-pointers-as-element-type-of-vector/85360
2025-04-08 19:21:45 +02:00
Christopher McGirr
ae3faea1f2
[MLIR][mlir-opt] move action debugger hook flag (#134842)
Currently if a developer uses the flag `--mlir-enable-debugger-hook` the
debugger hook is not actually enabled. It seems the DebugConfig and the
MainMLIROptConfig are not connected.

To fix this we can move the `enableDebuggerHook` CL Option to the
DebugConfigCLOptions struct so that it can get registered and enabled
along with the other debugger flags. AFAICS there are no other uses of
the flag so this should be safe.

This also adds a small LIT test to check that the hook is enabled by
checking the std::cerr output for the log message.
2025-04-08 16:54:11 +02:00
Michael Liao
4f77e50042 [MLIR][AMDGPU] Fix shared build. NFC 2025-04-08 10:46:15 -04:00
Christian Sigg
3a6b9b3a87 [mlir][bazel] Fix after dae0ef53a0b99c6c2b74143baee5896e8bc5c8e7
Remove unnecessary include.
2025-04-08 15:47:14 +02:00
Alan Li
dae0ef53a0
[MLIR][AMDGPU] Add a wrapper for global LDS load intrinsics in AMDGPU (#133498)
Defines a new `amdgpu.global_load` op, which is a thin wrapper around the
ROCDL `global_load_lds` intrinsic, along with its lowering logic to
`rocdl.global.load.lds`.
2025-04-08 09:18:30 -04:00
lorenzo chelini
e7365d3143
[MLIR][NFC] Retire let constructor for Reducer (#134786)
`let constructor` is legacy (do not use in tree!) since the TableGen
backend emits most of the glue logic to build a pass.
2025-04-08 10:31:15 +02:00
lorenzo chelini
87a187cedf
[MLIR][NFC] Retire let constructor for Tosa (#134784)
`let constructor` is legacy (do not use in tree!) since the TableGen
backend emits most of the glue logic to build a pass.
2025-04-08 10:30:47 +02:00
Bruno Cardoso Lopes
a168ddc470
[MLIR][LLVM] Block address support (#134335)
Add support for import and translate.

MLIR does not support using basic block references outside a function
(like LLVM does). This PR does not consider changes to MLIR in that
respect. It instead introduces two new ops: `llvm.blockaddress` and
`llvm.blocktag`. Here's an example:

```
llvm.func @ba() -> !llvm.ptr {
  %0 = llvm.blockaddress <function = @ba, tag = <id = 1>> : !llvm.ptr
  llvm.br ^bb1
^bb1:  // pred: ^bb0
  llvm.blocktag <id = 1>
  llvm.return %0 : !llvm.ptr
}
```

Value `%0` holds the address of the block tagged `id = 1` in function
`@ba`. Block tags need to be unique within a function, and a use of
`llvm.blockaddress` requires a matching tag in an `llvm.blocktag`.
2025-04-07 17:53:18 -07:00
Jerry-Ge
a38ad6e2a2
[mlir][tosa] Check empty Concat input for inferReturnTypeComponents (#134728)
Signed-off-by: Jerry Ge <jerry.ge@arm.com>
2025-04-07 16:27:08 -07:00
Jan Leyonberg
fbc8335311
[MLIR][OpenMP] Add codegen for teams reductions (#133310)
This patch adds the lowering of teams reductions from the omp dialect to
LLVM-IR. Some minor cleanup was done in clang to remove an unused
parameter.
2025-04-07 12:47:16 -04:00
Tai Ly
cb9afe53bf
[mlir][tosa] Fix validation pass assert (#134445)
This fixes a validation pass assert when processing ops with quantized
element types.
The failure case is added to invalid.mlir.
The fix is to re-order the validation checking so that only ops with
int/float operands and results pass the first stage of the validation
pass, so that the remaining checks do not need to handle quantized data
types.

Signed-off-by: Tai Ly <tai.ly@arm.com>
2025-04-07 09:47:07 -07:00
Jerry-Ge
112af87964
[mlir][tosa] Add parenthesis for the weight padding calculation (#134420)
Minor change.

Signed-off-by: Jerry Ge <jerry.ge@arm.com>
2025-04-07 09:43:46 -07:00
Igor Wodiany
cff65657d9
[mlir][spirv] Fix incorrect argument erasure in deserializer (#134610)
The current implementation iterates and modifies the list of arguments
at the same time. Depending on the number of arguments this will trigger
an assert: `assert(index < arguments.size())`. This change replaces the
loop with a range-based erasure.
2025-04-07 15:00:33 +01:00
Andrzej Warzyński
2f6bc47a18
[mlir][vector] Standardise valueToStore Naming Across Vector Ops (NFC) (#134206)
This change standardises the naming convention for the argument
representing the value to store in various vector operations.
Specifically, it ensures that all vector ops storing a value (whether
into memory, a tensor, or another vector) use `valueToStore` for the
corresponding argument name.

Updated operations:
* `vector.transfer_write`, `vector.insert`, `vector.scalable_insert`,
  `vector.insert_strided_slice`.

For reference, here are operations that currently use `valueToStore`:
* `vector.store`, `vector.scatter`, `vector.compressstore`,
  `vector.maskedstore`.

This change is non-functional (NFC) and does not affect the
functionality of these operations.

Implements #131602
2025-04-07 13:56:54 +01:00
Matthias Springer
bafa2f4442
[mlir][memref] Check memory space before lowering alloc ops (#134427)
Check the memory space before lowering allocation ops, instead of
starting the lowering and then rolling back the pattern when the memory
space was found to be incompatible with LLVM.

Note: This is in preparation of the One-Shot Dialect Conversion
refactoring.

Note: `isConvertibleAndHasIdentityMaps` now also checks the memory
space.
2025-04-07 14:38:18 +02:00
Uday Bondhugula
37deb09593
[MLIR][Affine] Fix signatures of normalize memref utilities (#134466)
These methods were passing derived op types by pointer, which deviates
from the usual style. While here, fix obsolete comments on those methods.
2025-04-07 17:36:28 +05:30
Jorn Tuyls
96de8435b9
[mlir][NVVM] Fix default label unreachable warning in getVoteSyncIntrinsicId (#134600)
Fixes the following warning after the changes in
https://github.com/llvm/llvm-project/pull/134309:

```
llvm-project/mlir/lib/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.cpp:134:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
  default:
  ^
1 warning generated.
```
2025-04-07 13:13:02 +02:00
Jack Frankland
7b007c092d
[mlir][tosa-to-linalg] Add acc_type lowering Support (#134267)
Add support for lowering of convolution operations where the `acc_type`
attribute differs from the result type of the operation. The only case
of this for convolutions in the TOSA-v1.0 specification is an fp16
convolution which internally uses an fp32 accumulator; all other
operations have accumulator types that match their output/result types.

Add lit tests for the fp16 convolution with fp32 accumulator operators
described above.

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2025-04-07 11:37:26 +01:00
Matthias Springer
431c8dd073
[mlir][IR] Add support for UnknownLoc to verify-diagnostics (#134421)
Diagnostics at unknown locations can now be verified with
`-verify-diagnostics`.

Example:
```
// expected-error@unknown {{something went wrong}}
```

Also clean up some MemRefToLLVM conversion tests that had to redirect
all errors to stdout in order to FileCheck them. All of those tests can
now be stored in a single `invalid.mlir`. That was not possible before.
2025-04-07 12:03:04 +02:00
Mike
f20cb3f8d2
[mlir][bufferization] Drop the assumption for alloc result index (#134503)
Relax the assumption that an alloc op always has its allocation at
`getResult(0)`, allowing the `optimize-allocation-liveness` pass to be
used for custom ops with more than one result. Ops with multiple
allocations are not handled here yet.
2025-04-07 11:54:44 +03:00
Tobias Gysi
d9ccfd7568
[mlir][llvm] Respect call noinline attr in inliner (#134493)
This commit extends the LLVM dialect inliner interface to respect the
call op's noinline attribute. This is a follow-up to
https://github.com/llvm/llvm-project/pull/133726
which added the noinline attribute to the LLVM dialect call op.
2025-04-07 08:12:46 +02:00
Benjamin Kramer
3e08dcd767 [mlir][inliner] Move callback types from InlinerConfig -> InlinerInterface. NFC.
The proper layering here is that Inliner depends on InliningUtils, and
not the other way round. Maybe it's time to give InliningUtils a less
terrible file name.
2025-04-06 13:02:42 +02:00
junfengd-nv
aeec94500a
[mlir][inliner] Add doClone and canHandleMultipleBlocks callbacks to Inliner Config (#131226)
The current inliner disables inlining when the caller is in a region with
the SingleBlock trait, while the callee function contains multiple blocks.
The SingleBlock trait is used in operations such as do/while loops, for
example fir.do_loop, fir.iterate_while and fir.if. Typically, calls within
loops are good candidates for inlining. However, functions with multiple
blocks are also common. For example, any function with "if () then
return" will result in multiple blocks in MLIR.

This change gives a customized inliner the flexibility to handle such
cases:
- doClone: clones instructions and other information from the callee
  function into the caller function.
- canHandleMultipleBlocks: checks whether functions with multiple blocks
  can be inlined into a region with the SingleBlock trait.

The default behavior of the inliner remains unchanged.

---------

Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-04-05 22:56:55 +02:00
Matthias Springer
68ce637872
[mlir][Transforms][NFC] Dialect Conversion: Replace after legalizing constants (#134384)
When folding an op during a conversion, first try to legalize all
generated constants, then replace the original operation. This is
slightly more efficient because fewer rewrites must be rolled back in
case a generated constant could not be legalized.

Note: This is in preparation of the One-Shot Dialect Conversion
refactoring.
2025-04-05 06:20:25 -07:00