21743 Commits

Author SHA1 Message Date
Nikita Popov
160e6ace3e [mlir][cmake] Do not export MLIR_MAIN_SRC_DIR and MLIR_INCLUDE_DIR (#125842)
MLIR_MAIN_SRC_DIR and MLIR_INCLUDE_DIR point to the source directory,
which is not installed. As such, the installed MLIRConfig.cmake also
should not reference it.

The comment indicates that these are needed for mlir_tablegen(), but I
don't see any related uses.

The motivation for this is the use in flang, where we end up inheriting
a meaningless MLIR_MAIN_SRC_DIR from a previous MLIR build, whose source
directory doesn't exist anymore, and that cannot be overridden with the
correct path, because it's not a cached variable.

Instead do what all the other projects do for LLVM_MAIN_SRC_DIR and
initialize MLIR_MAIN_SRC_DIR to CMAKE_CURRENT_SOURCE_DIR/../mlir.

For MLIR_INCLUDE_DIR there already is an exported MLIR_INCLUDE_DIRS,
which can be used instead.

(cherry picked from commit 82bd148a3f25439d7f52a32422dc1bcd2da03803)
2025-02-25 08:05:37 -08:00
Nikita Popov
88f8956711 [mlir] Fix MLIRTestDialect dependency in MLIRTestIR
This is a test library which is not part of libMLIR, so it should
use normal LINK_LIBS instead of mlir_target_link_libraries.

This fixes an issue introduced in #123910 and follows up on the
fix in #125004, which added the library to DEPENDS, which is not
sufficient.
2025-02-11 14:59:41 +01:00
Diego Caballero
04d55131ce [mlir][cmake] Add missing MLIRTestDialect dependencies
This cherry picks
[mlir] Fix build race condition in Pass Manager tests (d906da5ead2764579395e5006c517f2ec9afd46f)
to the 20.x release branch.

This addresses issues that started with
https://github.com/llvm/llvm-project/pull/123910, which is already on the 20.x branch.

Linaro noticed this on our flang dylib (shared library) build bot.

In file included from /home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/mlir/test/lib/Pass/TestPassManager.cpp:10:
/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/mlir/test/lib/Pass/../Dialect/Test/TestOps.h:148:10: fatal error: 'TestOps.h.inc' file not found
  148 | #include "TestOps.h.inc"
      |          ^~~~~~~~~~~~~~~

We have tested these changes on the buildbot for the last 2 days and had no problems.
Whereas before it was failing maybe 1 in 10 builds, enough that multiple people
in the community noticed it.

Reported in https://github.com/llvm/llvm-project/issues/124485.
2025-02-10 13:25:08 -08:00
David Spickett
898089b76e [mlir][CMake] Fix dependency on MLIRTestDialect in Transforms tests (#125894)
Another follow up fix to
https://github.com/llvm/llvm-project/pull/123910 to fix a build failure
that sometimes happens in shared library builds:
https://lab.llvm.org/buildbot/#/builders/50/builds/9724

In file included from
/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/mlir/test/lib/Transforms/TestInlining.cpp:16:
/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/mlir/test/lib/Transforms/../Dialect/Test/TestOps.h:148:10:
fatal error: 'TestOps.h.inc' file not found
  148 | #include "TestOps.h.inc"
      |          ^~~~~~~~~~~~~~~
1 error generated.

(cherry picked from commit ebd23f25c8936db3dd917567737a067d6878e2f4)
2025-02-07 18:19:57 -08:00
Fabian Tschopp
28507ac629
[MLIR] Fix thread safety of the deleter in PyDenseResourceElementsAttribute (#124832)
In general, `PyDenseResourceElementsAttribute` can get deleted at any
time and any thread, where unlike the `getFromBuffer` call, the Python
interpreter may not be initialized and the GIL may not be held.

This PR fixes segfaults caused by `PyBuffer_Release` when the GIL is not
being held by the thread calling the deleter.
2025-01-28 18:56:00 -05:00
Diego Caballero
35df525fd0
[mlir][Vector] Add support for poison indices to Extract/IndexOp (#123488)
Following up on #122188, this PR adds support for poison indices to
`ExtractOp` and `InsertOp`. It also includes canonicalization patterns
to turn extract/insert ops with poison indices into `ub.poison`.
2025-01-28 13:51:50 -08:00
Yi Zhang
bfefa15cc1
[mlir][bufferization] Use original type when convert arg for users (#124826)
This change will keep the memory space information for the tensor if
there is any.
2025-01-28 15:30:16 -05:00
mgcsysinfcat
589bef333e
[emacs][lsp][tblgen] add tblgen-lsp-server support for emacs lsp-mode (#76337)
Co-authored-by: mgcsysinfcat <p779yqwdf@mozmail.com>
Co-authored-by: Ronan Keryell <ronan.keryell@amd.com>
2025-01-28 20:26:14 +01:00
Matthias Gehre
1b729c3d70 Revert "[mlir][python] allow DenseIntElementsAttr for index type (#118947)"
This reverts commit 9dd762e8b10586e749b0ddf3542e5dccf8392395.
2025-01-28 18:35:50 +01:00
Matthias Gehre
9dd762e8b1
[mlir][python] allow DenseIntElementsAttr for index type (#118947)
Model the `IndexType` as `uint64_t` when converting to a python integer. 

With the python bindings, 
```python
DenseIntElementsAttr(op.attributes["attr"])
```
used to `assert` when `attr` had `index` type like `dense<[1, 2, 3, 4]>
: vector<4xindex>`.

---------

Co-authored-by: Christopher McGirr <christopher.mcgirr@amd.com>
Co-authored-by: Tiago Trevisan Jost <tiago.trevisanjost@amd.com>
2025-01-28 18:31:58 +01:00
Maksim Levental
1bc5fe669f
[mlir][python] implement GenericOp bindings (#124496)
2025-01-28 12:02:26 -05:00
Jack Frankland
a58e774fba
[mlir][tosa] Make TOSA MUL's Shift an Input (#121953)
The TOSA-v1.0 specification makes the shift attribute of the MUL
(Hadamard product) operator an input. Move the `shift` parameter of the
MUL operator in the MLIR TOSA dialect from an attribute to an input and
update any lit tests appropriately.

Expand the verifier of the `tosa::MulOp` operation to check the various
constraints defined in the TOSA-v1.0 specification. Specifically, ensure
that all input operands (excluding the optional shift) are of the same
rank. This means that broadcasting tests which previously checked rank-0
tensors would be broadcast are no longer valid and are removed.

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
Co-authored-by: TatWai Chong <tatwai.chong@arm.com>
2025-01-28 16:25:22 +00:00
Mehdi Amini
75622e3f8d [MLIR] Define getArgument() for Toy tutorial passes
This is important during debugging to be able to dump a pass pipeline.
It is also what is used by `--mlir-print-ir-tree-dir` to compute filenames
during dumps.
2025-01-28 16:52:23 +01:00
Luohao Wang
e84f6b6a88
[mlir] Fix conflict of user defined reserved functions with internal prototypes (#123378)
On lowering from `memref` to LLVM, `malloc` and other intrinsic
functions from `libc` will be declared in the current module. User's
redefinition of these reserved functions will poison the internal
analysis with the wrong prototype. This patch adds an assertion on the found
function's type and reports if it mismatches the intended type.

Related to #120950


---------

Co-authored-by: Luohao Wang <Luohaothu@users.noreply.github.com>
2025-01-28 14:40:47 +01:00
Joseph Huber
13dcc95dcd
[Offload] Rework offloading entry type to be more generic (#124018)
Summary:
The previous offloading entry type did not fit the current use-cases
very well. This widens it and adds a version to prevent further
annoyances. It also includes the kind to better sort who's using it.

The first 64-bytes are reserved as zero so the OpenMP runtime can detect
the old format for binary compatibility.
2025-01-28 07:26:13 -06:00
Kareem Ergawy
83433d9361
[OpenMP][IRBuilder] Handle target ... nowait when codegen targets host (#124720)
Fixes https://github.com/llvm/llvm-project/issues/124578

Handles the `nowait` clause for `omp.target` ops when the actual target
is the host (i.e. there is no target device). Rather than only checking
for the `HasNoWait` boolean, we also check for the presence/absence of a
`DeviceID` value. We only emit the target task if both are present.
2025-01-28 12:57:40 +01:00
Adam Siemieniuk
458542f454
[mlir][linalg] Relax structured op region filler check (#123741)
Removes assert on output type from structured op region filler to allow
more graceful error handling.
2025-01-28 09:55:59 +01:00
Hongren Zheng
3a439e2caf
[mlir][dataflow] disallow outside use of propagateIfChanged for DataFlowSolver (#120885)
Detailed writeup is in https://github.com/google/heir/issues/1153. See
also https://github.com/llvm/llvm-project/pull/120881. In short,
`propagateIfChanged` is used outside of the `DataFlowAnalysis` scope,
because it is public, but it does not propagate as expected as the
`DataFlowSolver` has stopped running.

To solve such misuse, `propagateIfChanged` should be made
protected/private.

For downstream users affected by this, to correctly propagate the
change, the Analysis should be re-run (check #120881) instead of just
calling `propagateIfChanged`.

The change to `IntegerRangeAnalysis` is just an expansion of the
`solver->propagateIfChanged`. The `Lattice` has already been updated by
the `join`. Propagation is done by `onUpdate`.

Cc @Mogball for review
2025-01-28 13:32:28 +08:00
Hongren Zheng
3c64f86314
[mlir] Add OpAsmTypeInterface for pretty-print (#121187)
See
https://discourse.llvm.org/t/rfc-introduce-opasm-type-attr-interface-for-pretty-print-in-asmprinter/83792
for detailed introduction.

This PR acts as the first part of it
* Add `OpAsmTypeInterface` and `getAsmName` API for deducing ASM name
from type
* Add default impl in `OpAsmOpInterface` to respect this API when
available.

The `OpAsmAttrInterface` / hooking into Alias system part should be
another PR, using a `getAlias` API.

### Discussion

* Instead of using `StringRef getAsmName()` as the API, I use `void
getAsmName(OpAsmSetNameFn)`, as returning StringRef might be unsafe
(a std::string constructed inside and then returned as a _ref_), and this
aligns with the design of `getAsmResultNames`.
* On the result packing of an op, the current approach is that when not
all of the result types are `OpAsmTypeInterface`, then do nothing (old
default impl)

### Review 

Cc @j2kun and @Alexanderviand-intel for downstream; Cc @River707 and
@joker-eph for relevant commit history; Cc @ftynse for discourse.
2025-01-28 13:31:41 +08:00
Longsheng Mou
8900c09ebf
[mlir][nvgpu] Fix crash when handling 0D memref in OptimizeSharedMemoryPass (#124517)
This PR adds a check for 0D memref types to prevent a crash. Fixes
#119855.
2025-01-28 09:19:51 +08:00
Diego Caballero
a7a4c16c67
[mlir][Vector] Support efficient shape cast lowering for n-D vectors (#123497)
This PR implements a generalization of the existing more efficient
lowering of shape casts from 2-D to 1-D and 1-D to 2-D vectors. This
significantly reduces code size and generates more performant code for
n-D shape casts that make their way to LLVM/SPIR-V.
2025-01-27 14:36:19 -08:00
Rahul Joshi
aca08a8515
[TableGen] Add assert to validate Objects list for HwModeSelect (#123794)
- Bail out of TableGen if any asserts fail before running the backend. 
- Add asserts to validate that the `Objects` and `Modes` lists for
various `HwModeSelect` subclasses are of same length.
- Eliminate equivalent check in CodeGenHWModes.cpp
2025-01-27 13:44:44 -08:00
Chao Chen
bd5d361c05
[mlir][vector] add support for linearizing vector.bitcast in VectorLinearize (#123110)
This PR adds support for converting Vector::BitCastOp working on N-D
(N > 1) vectors into the same op working on linearized (1-D) vectors.
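
As a rough illustration of why this linearization is sound, the sketch below models a bitcast on the innermost dimension with Python's `struct` module and checks that casting row by row and then flattening matches flattening first and casting once. The helper names, the 2-D list-of-rows representation, and the little-endian layout are assumptions made for this example, not part of the patch.

```python
import struct

def bitcast_1d(values, src_fmt, dst_fmt):
    """Reinterpret the bytes of a 1-D vector as another element type.

    src_fmt/dst_fmt are struct format characters, e.g. "h" (i16), "i" (i32).
    Little-endian layout is assumed for this sketch.
    """
    raw = struct.pack(f"<{len(values)}{src_fmt}", *values)
    count = len(raw) // struct.calcsize(f"<{dst_fmt}")
    return list(struct.unpack(f"<{count}{dst_fmt}", raw))

def bitcast_nd_rows(rows, src_fmt, dst_fmt):
    """Bitcast a 2-D vector by casting each innermost row separately."""
    return [bitcast_1d(row, src_fmt, dst_fmt) for row in rows]

def flatten(rows):
    """Linearize a 2-D vector into a 1-D vector in row-major order."""
    return [value for row in rows for value in row]

# A 2x4 vector of i16 viewed as a 2x2 vector of i32.
rows = [[1, 2, 3, 4], [5, 6, 7, 8]]
nd_then_flatten = flatten(bitcast_nd_rows(rows, "h", "i"))
flatten_then_1d = bitcast_1d(flatten(rows), "h", "i")
print(nd_then_flatten == flatten_then_1d)
```

Because a row-major flatten keeps each row's bytes contiguous, the two orders of operations see the same byte stream.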
2025-01-27 14:41:33 -06:00
Jeremy Morse
749443a307
[NFC][DebugInfo] Mop up final instruction-insertion call sites (#124289)
These are the final places in the monorepo that make use of instruction
insertion for methods like insertBefore and moveBefore. As part of the
RemoveDIs project, instead use iterators for insertion. (see:
https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
).
2025-01-27 16:07:27 +00:00
Scott Manley
e492083f55
[OpenACC] Add AutomaticAllocationScope to recipe ops (#124337)
The recipe operations should have AutomaticAllocationScope so recipes can
be converted using operators that require parent ops to have
AutomaticAllocationScope
2025-01-27 07:47:45 -08:00
MaheshRavishankar
092372da15
[mlir][Tensor] Rework ReifyRankedShapedTypeInterface implementation for tensor.expand_shape op. (#113501)
The op carries the output-shape directly. This can be used directly.
Also adds a method to get the shape as a `SmallVector<OpFoldResult>`.

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-01-27 07:05:34 -08:00
MaheshRavishankar
1f5335c1db
Make index computation used divsi/remsi (#124390)
The index computation is meant to be signed. Using unsigned could lead
to subtle errors. Fix places where some index math was using unsigned
operations.
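
To make the kind of error concrete, here is a minimal sketch that models signed and unsigned division on 64-bit bit patterns using plain Python integers. The helper names and the 64-bit width are assumptions for illustration; this is not the MLIR implementation.

```python
BITS = 64  # assumed index width for this sketch
MASK = (1 << BITS) - 1

def as_unsigned(x):
    """Reinterpret x as an unsigned BITS-wide integer."""
    return x & MASK

def as_signed(x):
    """Reinterpret the low BITS bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x >> (BITS - 1) else x

def divsi(a, b):
    """Signed division that rounds toward zero."""
    a, b = as_signed(a), as_signed(b)
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient

def divui(a, b):
    """Unsigned division applied to the same bit patterns."""
    return as_unsigned(a) // as_unsigned(b)

offset = -8
print(divsi(offset, 4))  # -2
print(divui(offset, 4))  # 4611686018427387902
```

For non-negative operands the two agree, which is why the mistake can go unnoticed until an intermediate index value becomes negative.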

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-01-27 07:04:50 -08:00
Ivan Butygin
ac87d6b036
[mlir][arith] Fold arith.cmpi eq, %val, %one : i1 -> %val and arith.cmpi ne, %val, %zero : i1 -> %val (#124436)
https://alive2.llvm.org/ce/z/dNZMdC
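
Since `i1` has only two values, the fold can also be checked exhaustively. The sketch below does so with small Python stand-ins for the two comparison predicates; the function names are hypothetical and only model the comparison results as 0 or 1.

```python
def cmpi_eq(lhs, rhs):
    """Model of an i1-producing equality comparison."""
    return int(lhs == rhs)

def cmpi_ne(lhs, rhs):
    """Model of an i1-producing inequality comparison."""
    return int(lhs != rhs)

I1_VALUES = (0, 1)

def folds_hold():
    """Check both folds for every possible i1 value."""
    return all(
        cmpi_eq(val, 1) == val and cmpi_ne(val, 0) == val
        for val in I1_VALUES
    )

print(folds_hold())  # True
```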
2025-01-27 14:28:09 +03:00
Samuel Ginzburg
43a50deb63
[MLIR][ROCDL] Add GFX940 SMFMAC (2:4 sparsity) instructions to the ROCDL dialect (#124435)
# Overview

This PR adds 2:4 structured sparsity (sparse A, dense B) matrix multiply
instructions to ROCDL.

# Testing

I've added tests to Dialect/mlir and Target/mlir
2025-01-27 11:58:26 +01:00
Longsheng Mou
8f17f51deb
[mlir][tosa] Fix comments format(NFC) (#124520)
This PR corrects the formatting of comments in Markdown. The previous
format was as follows:
https://mlir.llvm.org/docs/Dialects/TOSA/#tosaerf-mlirtosaerfop

![image](https://github.com/user-attachments/assets/1d1d10d5-c960-4724-9fb4-29c17ea39b11)

https://mlir.llvm.org/docs/Dialects/TOSA/#tosarescale-mlirtosarescaleop

![image](https://github.com/user-attachments/assets/fb23cbf6-be10-4a60-8b43-b28dc2db6918)
2025-01-27 10:50:53 +00:00
Jakub Kuderski
2655ae54db
[mlir] Fix deprecated pointer union casts in toy example (#124422)
2025-01-25 13:52:07 -05:00
Adam Paszke
21f04b1458
Hold a queue of iterator ranges (not operations) in wouldOpBeTriviallyDead (#123642)
Ranges let us push the whole blocks onto the queue in constant time. If
one of the first ops in the block is side-effecting we'll be able to
provide the answer quickly. The previous implementation had to walk the
block and queue all the operations only to start traversing them again,
which was a considerable slowdown for compile times of large MLIR
programs in our benchmarks.
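
The idea can be sketched in a few lines of Python, where an iterator over a block plays the role of an iterator range. This is a simplified model under stated assumptions: the `Op` class, its `has_side_effects` flag, and the function below are hypothetical and ignore everything the real check considers beyond side effects.

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    """Toy operation: a side-effect flag plus nested blocks (lists of Op)."""
    has_side_effects: bool = False
    blocks: list = field(default_factory=list)

def would_op_be_trivially_dead(root):
    """Return False as soon as any nested op has side effects."""
    # Each entry is an iterator over a run of ops, not a single op, so
    # queueing a whole block costs constant time regardless of its size.
    pending = [iter([root])]
    while pending:
        op = next(pending[-1], None)
        if op is None:
            pending.pop()  # this range is exhausted
            continue
        if op.has_side_effects:
            return False  # early exit without touching the rest
        for block in op.blocks:
            pending.append(iter(block))
    return True

big_block = [Op(has_side_effects=True)] + [Op() for _ in range(1000)]
print(would_op_be_trivially_dead(Op(blocks=[big_block])))  # False
```

In the last call only the first op of the large block is inspected; the remaining thousand are never queued individually.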

---------

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
2025-01-25 07:28:21 -08:00
Jacques Pienaar
3b35b4c7f9
[mlir] Allow fallback from file line col range to loc (#124321)
This was discussed during the original review but I made it stricter
than discussed. Making it a pure view but adding a helper for bytecode
serialization (I could avoid the helper, but it ends up with more logic
and stronger coupling).
2025-01-24 18:08:44 -08:00
Henrich Lauko
95d993a838
[MLIR] Fix import of calls with mismatched variadic types (#124286)
Previously, an indirect call was incorrectly generated when
`llvm::CallBase::getCalledFunction` returned null due to a type mismatch
between the call and the function. This patch updates the code to use
`llvm::CallBase::getCalledOperand` instead.
2025-01-24 20:28:36 +01:00
junfengd-nv
83df39c649
[mlir][inline] Fix Issue#82401: Infinite loop in MLIR inliner for indirect recursive call. (#124026)
2025-01-24 11:06:37 -08:00
Andrzej Warzyński
d88293d8a2
[mlir][vector] Disable BreakDownVectorBitCast for scalable vectors (#122725)
`BreakDownVectorBitCast` leverages
  * `vector.extract_strided_slice` + `vector.insert_strided_slice`

As these Ops do not support extracting scalable sub-vectors (i.e.
extracting/inserting a fraction of a scalable dim), it's best to bail
out.
2025-01-24 17:15:06 +00:00
Adam Siemieniuk
ba6774f997
[mlir][xegpu] Fix verifier diagnostic recursion (#124148)
Uses global diagnostic message in operation verifier to avoid infinite
recursion on a warning.

Emitting diagnostics through the operation under verification creates a
loop where verifier runs again before printing the message.
2025-01-24 18:09:48 +01:00
Peter Hawkins
acde3f722f
[mlir:python] Compute get_op_result_or_value in PyOpView's constructor. (#123953)
This logic is in the critical path for constructing an operation from
Python. It is faster to compute this in C++ than it is in Python, and it
is a minor change to do this.

This change also alters the API contract of
_ods_common.get_op_results_or_values to avoid calling
get_op_result_or_value on each element of a sequence, since the C++ code
will now do this.

Most of the diff here is simply reordering the code in IRCore.cpp.
2025-01-24 06:26:28 -08:00
Andrea Faulds
eb206e9ea8
[mlir] Rename mlir-cpu-runner to mlir-runner (#123776)
With the removal of mlir-vulkan-runner (as part of #73457) in
e7e3c45bc70904e24e2b3221ac8521e67eb84668, mlir-cpu-runner is now the
only runner for all CPU and GPU targets, and the "cpu" name has been
misleading for some time already. This commit renames it to mlir-runner.
2025-01-24 14:08:38 +01:00
Ivan Butygin
88136f9645
[mlir][vector] Canonicalize gathers/scatters with trivial offsets (#117939)
Canonicalize gathers/scatters with contiguous (i.e. [0, 1, 2, ...])
offsets into vector masked load/store ops.
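
The equivalence behind this canonicalization can be sketched on plain Python lists. The functions below are simplified 1-D models with hypothetical names and signatures; they are not the vector dialect ops themselves.

```python
def gather(base, start, offsets, mask, passthru):
    """Read base[start + offset] per lane, or passthru where masked off."""
    return [
        base[start + off] if m else p
        for off, m, p in zip(offsets, mask, passthru)
    ]

def masked_load(base, start, mask, passthru):
    """Read consecutive elements from start, or passthru where masked off."""
    return [
        base[start + lane] if m else p
        for lane, (m, p) in enumerate(zip(mask, passthru))
    ]

def has_contiguous_offsets(offsets):
    """True when offsets are exactly [0, 1, 2, ...]."""
    return list(offsets) == list(range(len(offsets)))

base = [10, 20, 30, 40, 50]
offsets = [0, 1, 2, 3]
mask = [True, False, True, True]
passthru = [0, 0, 0, 0]

if has_contiguous_offsets(offsets):
    print(gather(base, 1, offsets, mask, passthru))  # [20, 0, 40, 50]
    print(masked_load(base, 1, mask, passthru))      # [20, 0, 40, 50]
```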
2025-01-24 14:14:53 +03:00
Jianjian Guan
990837f91d
[mlir][arith][tensor] Disable index type for bitcast (#121455)
Fixes #121397.
2025-01-24 16:53:04 +08:00
donald chen
45d83ae7df
[mlir] [math] Fix the precision issue of expand math (#120865)
The convertFloorOp pattern incurs precision loss when floating-point
numbers exceed the representable range of int64. This pattern should be
removed.
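
A minimal sketch of the problem, assuming the expansion computes floor by a round trip through a 64-bit integer (the exact shape of the pattern is an assumption here). Python integers are unbounded, so the sketch raises explicitly where a fixed-width conversion could not represent the value.

```python
import math

INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1

def floor_via_int64(x):
    """Floor of a finite float via a (modelled) int64 round trip."""
    truncated = int(x)  # rounds toward zero
    if not INT64_MIN <= truncated <= INT64_MAX:
        # A real 64-bit conversion has no valid result for this input.
        raise OverflowError(f"{x!r} does not fit in int64")
    result = float(truncated)
    # Truncation rounds negative non-integers up, so step down by one.
    return result - 1.0 if result > x else result

print(floor_via_int64(-1.5))  # -2.0
print(math.floor(1e19))       # 10000000000000000000
try:
    floor_via_int64(1e19)
except OverflowError as err:
    print("out of range:", err)
```

Floats at this magnitude are already integral, so a direct floor needs no integer detour at all.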

Fixes https://github.com/llvm/llvm-project/issues/119836
2025-01-24 14:46:41 +08:00
Jordan Rupprecht
e10d551aa4
[mlir][PDLL] Allow (and ignore) -D tablegen macros. (#124166)
Similar to #91329, `mlir-pdll` is a tool used in tablegen macros that
unregisters from common flags, including `-D` macros. Because a macro
may be used globally, e.g. configured via `LLVM_TABLEGEN_FLAGS`, we want
this tool to just ignore the macro instead of a fatal failure due to the
unrecognized flag.
2025-01-23 13:45:46 -06:00
Yi Qian
c118864223
[MLIR][ROCDL]Add MFMA_*_F8F6F4 instructions to the ROCDL dialect (#123830)
This PR adds mfma.scale.f32.32x32x64.f8f6f4 and
mfma.scale.f32.16x16x128.f8f6f4 to the ROCDL dialect. They are converted
to the corresponding intrinsics in the mlir-to-llvmir pass.
2025-01-23 19:27:56 +00:00
Fangrui Song
e062224596 [test] Remove misleading ''
2025-01-23 09:45:51 -08:00
Scott Todd
1a8f49fdda
[mlir][python][cmake] Allow skipping nanobind compile options changes. (#123997)
Context:
https://github.com/llvm/llvm-project/pull/107103#discussion_r1925834532

This code is brittle, especially when called from a superproject that
adds the `nanobind-*` target in a different source directory:
```cmake
get_property(all_targets DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} PROPERTY BUILDSYSTEM_TARGETS)
```

The changes here do help with my downstream build, but I'm not sure if
using the `MLIR_DISABLE_CONFIGURE_PYTHON_DEV_PACKAGES` option introduced
in https://github.com/llvm/llvm-project/pull/117934 is the right fix
given that the option is currently scoped directly to one location with
a matching name:
7ad8a3da47/mlir/cmake/modules/MLIRDetectPythonEnv.cmake (L4-L5)

Some other solutions to consider:

1. Search through an explicit list of target names using `if (TARGET)`
2. Iterate over _all_ targets in the project, not just the targets in
the current directory, using code like
https://stackoverflow.com/a/62311397
3. Iterate over targets in the directory known to MLIR
(`llvm-project/mlir/python`)
4. Move this `target_compile_options` setup into
`mlir_configure_python_dev_packages` (I started on this, but that runs
into similar issues where the target is defined in a different
directory)
2025-01-23 09:18:12 -08:00
Kazu Hirata
df299958e6 [mlir] Fix warnings
This patch fixes:

  mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:403:5: error:
  'ClampRange' may not intend to support class template argument
  deduction [-Werror,-Wctad-maybe-unsupported]

  mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:404:5: error:
  'ClampRange' may not intend to support class template argument
  deduction [-Werror,-Wctad-maybe-unsupported]
2025-01-23 08:31:18 -08:00
Tuomas Kärnä
0e944a3095
[SCFToGPU] Convert scf.parallel+scf.reduce to gpu.all_reduce (#122782)
Support reductions in SCFToGPU: `scf.parallel` and `scf.reduce` op
combination is now converted to a `gpu.all_reduce` op.
2025-01-23 13:47:36 +01:00
Durgadoss R
2e6cc79f81
[MLIR][NVVM] Migrate CpAsyncOp to intrinsics (#123789)
Intrinsics are available for the 'cpSize'
variants also. So, this patch migrates the Op
to lower to the intrinsics for all cases.

* Update the existing tests to check the lowering to intrinsics.
* Add newer cp_async_zfill tests to verify the lowering for the 'cpSize'
   variants.
* Tidy-up CHECK lines in cp_async() function in nvvmir.mlir (NFC)

PTX spec link:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-01-23 16:15:52 +05:30
Jack Frankland
8388040fc9
[mlir][tosa] Add NaN Propagation Mode Support (#121951)
The TOSA-V1.0 specification adds "nan propagation" modes as attributes
for several operators. Adjust the ODS definitions of the relevant
operations to include this attribute.

The defined modes are "PROPAGATE" and "IGNORE" and the PROPAGATE mode is
set by default.

MAXIMUM, MINIMUM, REDUCE_MAX, REDUCE_MIN, MAX_POOL, CLAMP, and ARGMAX
support this attribute.
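
As an informal illustration of the two modes, the sketch below applies them to a scalar MAXIMUM. The function name, the string-valued mode argument, and the both-NaN behaviour under IGNORE are assumptions for this example; the TOSA specification remains the authority on exact semantics.

```python
import math

def maximum(a, b, nan_mode="PROPAGATE"):
    """Scalar maximum with a selectable NaN handling mode."""
    if nan_mode == "PROPAGATE":
        # Any NaN operand makes the result NaN.
        if math.isnan(a) or math.isnan(b):
            return math.nan
        return max(a, b)
    if nan_mode == "IGNORE":
        # A NaN operand is skipped in favour of the other operand.
        if math.isnan(a):
            return b
        if math.isnan(b):
            return a
        return max(a, b)
    raise ValueError(f"unknown nan_mode: {nan_mode!r}")

print(maximum(math.nan, 1.0))            # nan
print(maximum(math.nan, 1.0, "IGNORE"))  # 1.0
```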

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
Co-authored-by: TatWai Chong <tatwai.chong@arm.com>
2025-01-23 10:14:00 +00:00