21867 Commits

Author SHA1 Message Date
Andrzej Warzynski
b1a267e1b9 [mlir][vector] Remove references to non-existing patterns (nfc)
Delete references to:
  * `VectorLoadToMemrefLoadLowering`,
  * `VectorStoreToMemrefStoreLowering`.

These patters were removed in #121454.
2025-02-09 13:54:11 +00:00
Durgadoss R
2feced1df0
[MLIR][NVVM] Add tcgen05 wait/fence Ops (#126265)
PR #126091 adds intrinsics for tcgen05
wait/fence/commit operations. This patch
adds NVVM Dialect Ops for them.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-08 21:34:40 +05:30
Thomas Preud'homme
027aa70ea4
[TOSA] Fix negate maxValue computation (#126295)
getInput1Zp() returns an unsigned value which means in case of negative
zero point value the max intermediate value computation currently goes
wrong. Use getInput1ZpAttr() instead which returns an APInt and allows
easy sign extension to int64_t.
2025-02-08 09:23:32 +00:00
Uday Bondhugula
b850ce41db
[MLIR][Affine] Fix private memref creation bug in affine fusion (#126028)
Fix private memref creation bug in affine fusion exposed in the case of
the same memref being loaded from/stored to in producer nest. Make the
private memref replacement sound.

Change affine fusion debug string to affine-fusion - more compact.

Fixes: https://github.com/llvm/llvm-project/issues/48703
2025-02-08 08:35:10 +05:30
Md Abdullah Shahneous Bari
8b12acd2a4
[mlir][vector][spirv] Handle 1-element vector.{load|store} lowering. (#126294)
Add support for single element vector{load|store} lowering to SPIR-V.
Since, SPIR-V converts single element vector to scalars, it needs
special attention for vector{load|store} lowering to spirv{load|store}.
2025-02-07 15:31:47 -06:00
Adam Siemieniuk
8a03658d57
[mlir][xegpu] Tensor descriptor type verifier (#124548)
Adds XeGPU tensor descriptor type verifier.

The type verifier covers general tensor descriptor invariants w.r.t. Xe
ISA semantics.
Related operation verifiers are updated to account for the new
descriptor checks and avoid duplication.
2025-02-07 20:43:05 +01:00
Scott Todd
73f11ac17d
[mlir][tosa] Use explicit namespace for OpTrait. (#126286)
I'm seeing build errors in a downstream project using torch-mlir that
are fixed by this change. See
https://github.com/iree-org/iree/pull/19903#discussion_r1946899561 for
more context. The build error on MSVC is:
```
C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/Utils/ConversionUtils.h(148): error C2872: 'OpTrait': ambiguous symbol
C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/IR/TosaOps.h(49): note: could be 'mlir::OpTrait'
C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Dialect/Torch/IR/TorchTraits.h(23): note: or       'mlir::torch::Torch::OpTrait'
C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/Utils/ConversionUtils.h(148): note: the template instantiation context (the oldest one first) is
C:\home\runner\_work\iree\iree\third_party\torch-mlir\lib\Conversion\TorchToTosa\TosaLegalizeCommon.cpp(126): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInfer<mlir::tosa::MulOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::PatternRewriter &,mlir::Location,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled
        with
        [
            TosaOp=mlir::tosa::MulOp
        ]
C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Conversion/TorchToTosa/TosaLegalizeUtils.h(83): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInfer<TosaOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::ImplicitLocOpBuilder &,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled
        with
        [
            TosaOp=mlir::tosa::MulOp
        ]
C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Conversion/TorchToTosa/TosaLegalizeUtils.h(76): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInferShape<TosaOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::ImplicitLocOpBuilder &,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled
        with
        [
            TosaOp=mlir::tosa::MulOp
        ]
```

I think the torch-mlir code here is causing the issue, but I'm not sure
why builds only started failing now:
https://github.com/llvm/torch-mlir/blob/main/include/torch-mlir/Dialect/Torch/IR/TorchTraits.h.
Given that `mlir::OpTrait` already exists, torch-mlir should not be
creating an ambiguous symbol `mlir::torch::Torch::OpTrait`. So while a
better fix would be to the downstream project, being explicit here
doesn't seem that unreasonable to me.
2025-02-07 11:04:09 -08:00
Diego Caballero
2c4dd89902
[mlir][Vector] Introduce poison in LowerVectorBitCast/Broadcast/Transpose (#126180)
This PR continues with the introduction of poison as initialization
vector, in this particular case, in LowerVectorBitCast,
LowerVectorBroadcast and LowerVectorTranspose.
2025-02-07 10:51:24 -08:00
Diego Caballero
5a0075adbb
[mlir][Vector] Generate poison vectors in vector.shape_cast lowering (#125613)
This is the first PR that introduces `ub.poison` vectors as part of a
rewrite/conversion pattern in the Vector dialect. It replaces the
`arith.constant dense<0>` vector initialization for
`vector.insert_slice` ops with a poison vector.

This PR depends on all the previous PRs that introduced support for
poison in Vector operations such as `vector.shuffle`, `vector.extract`,
`vector.insert`, including ODS, canonicalization and lowering support.

This PR may improve end-to-end compilation time through LLVM, depending
on the workloads.
2025-02-07 10:42:55 -08:00
TatWai Chong
571a98722f
[mlir][tosa] Change 'shape' of RESHAPE from attribute to input shape … (#125789)
The shape operand is changed to input shape type since V1.0

Change-Id: I508cc1d67e9b017048b3f29fecf202cb7d707110

Co-authored-by: Won Jeon <won.jeon@arm.com>
2025-02-07 10:24:52 -08:00
Diego Caballero
68325148d3
[mlir][Vector] Fold vector.extract from poison vector (#126122)
This PR adds a folder for `vector.extract(ub.poison) -> ub.poison`. It
also replaces `create` with `createOrFold` insert/extract ops in vector
unroll and transpose lowering patterns to trigger the poison foldings
introduced recently.
2025-02-07 10:20:07 -08:00
Eric Hein
1f67070a3f
[ViewOpGraph] Improve GraphViz output (#125509)
This patch improves the GraphViz output of ViewOpGraph
(--view-op-graph).

- Switch to rectangular record-based nodes, inspired by a similar
visualization in [Glow](https://github.com/pytorch/glow). Rectangles
make more efficient use of space when printing text.
- Add input and output ports for each operand and result, and remove
edge labels.
- Switch to a muted color palette to reduce eye strain.
2025-02-07 10:45:47 -05:00
Guray Ozen
b284a849d5
[MLIR][NVVM] Add default constructor for nvvm.barrier [NFC] (#126225)
This PR adds a default constructor to `nvvm.barrier`, making it more
convenient to build the OP.
2025-02-07 15:42:57 +01:00
Igor Wodiany
1454fc9dbf
[mlir][spirv] Add definition for OpGroupNonUniformBallotBitCount (#126055)
A new constraint is also added to restrict attributes values for SPIR-V
attributes. Ideally this should use `ConfinedAttr` with a custom
constraint directly on the operand, however it seems TableGen does not
allow using that with SPIR-V attributes. I suspect it is because SPIR-V
attributes do not derive from the generic MLIR attribute class -
TableGen complains about missing enum field.
2025-02-07 14:20:02 +01:00
Matthias Springer
15e50b1736
[mlir][IR] Clean up type constraints around ValueSemanticsContainerOf (#126075)
* Remove duplicate `TypeOrContainer`. There is an identical class with
the same name: `TypeOrValueSemanticsContainer`.
* Remove `TypeOrContainerOfAnyRank` and use
`TypeOrValueSemanticsContainer` instead. `TypeOrContainerOfAnyRank` is
inconsistent with the other classes because it explicitly checks for
`VectorType` and `TensorType` instead of utilizing the value semantics
type trait.
* Remove `SignlessIntegerOrIndexLikeOfAnyRank` etc. and use
`SignlessIntegerOrIndexLike` instead. `SignlessIntegerOrIndexLike` etc.
already allow 0-d vectors, so there is no difference with
`SignlessIntegerOrIndexLikeOfAnyRank`.
2025-02-07 09:58:15 +01:00
Andrzej Warzyński
e346d4752f
[mlir][vector] Categorize Vector-to-LLVM conversion tests (nfc) (#125918)
Following https://github.com/llvm/llvm-project/issues/124308, this patch
reorganizes the `vector-to-llvm.mlir` tests by splitting them into two
categories:

- **Basic conversion tests**: Tests that only require
  `populateVectorToLLVMConversionPatterns`, focusing on the minimal
  conversion from Vector to LLVM. These have been moved to
  `vector-to-llvm-interface.mlir`.
- **Full pass tests**: Tests that require the complete
  `ConvertVectorToLLVMPass`, which includes
  `populateVectorToLLVMConversionPatterns` along with additional
  patterns. These remain in `vector-to-llvm.mlir`.

This reorganization clarifies test coverage and helps avoid unnecessary
duplication.

NOTE: This is merely moving tests around between two files and adds
some comments.
2025-02-07 08:04:59 +00:00
Andrzej Warzyński
6b902efdf6
[mlir][linalg] Add support for masked vectorization of tensor.insert_slice (2/N) (#123031)
For context, recall that `tensor.insert_slice` is vectorised using the
`vector.transfer_read` + `vector.transfer_write` pair. An unmasked
example is shown below:

```mlir
// BEFORE VECTORIZATION
%res = tensor.insert_slice
  %slice into %dest[0, %c2]
  [5, 1] [1, 1] : tensor<5x1xi32> into tensor<5x3xi32>

// AFTER VECTORIZATION
%read = vector.transfer_read %source[%c0, %c0], %pad
  : tensor<5x1xi32>, vector<8x1xi32>
%res = vector.transfer_write %read, %dest[%c0, %c2]
  : vector<8x1xi32>, tensor<5x3xi32>
```

This PR extends `vectorizeAsInsertSliceOp` to add masking support for
the `vector.transfer_write` operation. This complements the changes
in #122927, which introduced masking for the `vector.transfer_read`.
2025-02-07 07:56:24 +00:00
Karim Nosseir
7fa57cd430
[MLIR] Add move constructor to BytecodeWriterConfig (#126130)
The config is currently not movable and because there are constructors
the default move won't be generated, which prevents it from being moved.
Also, it is not copyable because of the unique_ptr. This PR adds move
constructor to allow moving it.
2025-02-06 21:30:55 -08:00
Avik Pal
a15618f18c
[mlir] feat: add mlirFuncSetResultAttr (#125972)
cc @ftynse @wsmoses
2025-02-06 17:33:12 -06:00
Prashant Kumar
97b08b8ee0
[mlir][amdgpu] Support for 8bit extf for 0d vector type (#126102)
For 0d vector type the rewrite crashes.
2025-02-07 04:15:22 +05:30
Alan Li
f0e1857c84
[MLIR] Support non-atomic RMW option for emulated vector stores (#124887)
This patch is a followup of the previous one: #115922, It adds an option
to turn on emitting non-atomic rmw code sequence instead of atomic rmw.
2025-02-06 13:22:42 -08:00
Md Asghar Ahmad Shahid
f2bca9e385
[MLIR][Linalg] Introduce broadcast/transpose semantic to batch_matmul (#122275)
Goals:
1. To add syntax and semantic to 'batch_matmul' without changing any of
the existing syntax expectations for current usage. batch_matmul is
still just batch_matmul.

2. Move the definition of batch_matmul from linalg OpDsl to tablegen ODS
infra.

Scope of this patch:
To expose broadcast and transpose semantics on the 'batch_matmul'.

The broadcast and transpose semantic are as follows:

By default, 'linalg.batch_matmul' behavior will remain as is. Broadcast
and Transpose semantics can be applied by specifying the explicit
attribute 'indexing_maps' as shown below. This is a list attribute, so
the list must include all the maps if specified.

    Example Transpose:
    ```
    linalg.batch_matmul indexing_maps = [
affine_map< (d0, d1, d2, d3) -> (d0, d3, d1)>, //transpose
                   affine_map< (d0, d1, d2, d3) -> (d0, d3, d2)>,
                   affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)>
                   ]
ins (%arg0, %arg1: memref<2x5x3xf32>,memref<2x5x7xf32>)
                   outs (%arg2: memref<2x3x7xf32>)
    ```

    Example Broadcast:
    ```
    linalg.batch_matmul indexing_maps = [
affine_map< (d0, d1, d2, d3) -> (d3)>, //broadcast
                       affine_map< (d0, d1, d2, d3) -> (d0, d3, d2)>,
                       affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)>
                     ]
                     ins (%arg0, %arg1: memref<5xf32>,memref<2x5x7xf32>)
                     outs (%arg2: memref<2x3x7xf32>)
    ```

    Example Broadcast and transpose:
    ```
    linalg.batch_matmul indexing_maps = [
affine_map< (d0, d1, d2, d3) -> (d1, d3)>, //broadcast
affine_map< (d0, d1, d2, d3) -> (d0, d2, d3)>, //transpose
                       affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)>
                     ]
ins (%arg0, %arg1: memref<3x5xf32>, memref<2x7x5xf32>)
                     outs (%arg2: memref<2x3x7xf32>)
    ```

RFCs and related PR:

https://discourse.llvm.org/t/rfc-linalg-opdsl-constant-list-attribute-definition/80149
https://discourse.llvm.org/t/rfc-op-explosion-in-linalg/82863
https://discourse.llvm.org/t/rfc-mlir-linalg-operation-tree/83586
https://github.com/llvm/llvm-project/pull/115319
2025-02-06 19:08:50 +00:00
Krzysztof Drewniak
f4e3b8783c
[mlir][LLVM] Switch undef for poison for uninitialized values (#125629)
LLVM itself is generally moving away from using `undef` and towards
using `poison`, to the point of having a lint that caches new uses of
`undef` in tests.

In order to not trip the lint on new patterns and to conform to the
evolution of LLVM
- Rename valious ::undef() methods on StructBuilder subclasses to
::poison()
- Audit the uses of UndefOp in the MLIR libraries and replace almost all
of them with PoisonOp

The remaining uses of `undef` are initializing `uninitialized` memrefs,
explicit conversions to undef from SPIR-V, and a few cases in
AMDGPUToROCDL where usage like

    %v = insertelement <M x iN> undef, iN %v, i32 0
    %arg = bitcast <M x iN> %v to i(M * N)

is used to handle "i32" arguments that are are really packed vectors of
smaller types that won't always be fully initialized.
2025-02-06 12:49:30 -06:00
Kareem Ergawy
dcb124e820
[flang][OpenMP] Enable delayed privatization by default omp.wsloop (#125732)
Reapplies #122471

This is based on https://github.com/llvm/llvm-project/pull/125699, only
the latest commit is relevant.

With changes in this PR and the parent one, the previously reported
failures in the Fujitsu(*) test suite should hopefully be resolved (I
verified all the 14 reported failures and they pass now).

(*) https://linaro.atlassian.net/browse/LLVM-1521
2025-02-06 19:11:04 +01:00
Andrea Faulds
f497fe464b
[mlir][spirv] Fix some issues related to converting ub.poison to SPIR-V (#125905)
This is a follow-up to 5df62bdc9be9c258c5ac45c8093b71e23777fa0e. That
commit should not have needed to make the vector.insert and
vector.extract conversions to SPIR-V directly handle the static poison
index case, as there is a fold from those to ub.poison, and a conversion
pattern from ub.poison to spirv.Undef, however:

- The ub.poison fold result could not be materialized by the vector
dialect (fixed as of d13940ee263ff50b7a71e21424913cc0266bf9d4).
- The conversion pattern wasn't being populated in VectorToSPIRVPass,
which is used by the tests. This commit changes this.
- The ub.poison to spirv.Undef pattern rejected non-scalar types, which
prevented its use for vector results. It is unclear why this restriction
existed; a remark in D156163 said this was to avoid converting "user
types", but it is not obvious why these shouldn't be permitted (the
SPIR-V specification allows OpUndef for all types except OpTypeVoid).
This commit removes this restriction.

With these fixed, this commit removes the redundant static poison index
handling, and updates the tests.
2025-02-06 19:07:34 +01:00
Krzysztof Drewniak
efd0a7f446
[mlir][ROCDL][~NFC] Migrate to LLVM dialect default builders (#125609)
There were a bunch of spots in ROCDL.td where we were defining our own
llvmBuilder call which could have been generated using the default
built-in one on LLVM_IntrOpBase.

This commit cleans up such usages in the interests of potentinally
enabling ROCDL import in the future and of making best practices more
obvious.

The one breaking change is renaming WaitcntOp to SWaitcntOp, which
should have minimal impact.
2025-02-06 11:38:43 -06:00
Igor Wodiany
8609e27a58
[mlir][spirv] Add definition for ImageWriteOp (#124124)
This Pull Request adds OpImageWrite as defined in section 3.52.10.
(Image Instructions). The tests in
`mlir/test/Target/SPIRV/image-ops.mlir` are also updated (and extended
with the new op), so they now pass validation with `spirv-val` after
serialization into SPIR-V. The test was missing `ImageQuery` capability
and entry points. For entry points dummy `main` functions were added.
2025-02-06 09:25:08 -05:00
Joseph Huber
f1e917d07b
[Offload] Unify offloading entries into a single section (#125731)
Summary:
This patch unifies the existing offloading entires into a single section
called `llvm_offload_entires`. This lets us use a more unified
offloading infrastructure so that all targets share the same handling.
The effect is that people in the runtimes now need to check if the kind
is what they expect, but the expectation is that you can combine
multiple potential providers into a compile job. Doesn't fully work
yet because of other runtime issues, but some day. Mostly this helps the
future of liboffload where we want to handle different languages than
OpenMP.
2025-02-06 08:24:01 -06:00
Han-Chung Wang
c9d0a464c9
Revert "[mlir][math]Update convertPowfOp ExpandPatterns.cpp" (#126063)
Reverts llvm/llvm-project#124402

It breaks an integration test in downstream project (i.e., IREE), which
produces NANs. Talked to the author @ita9naiwa, and we agree to reland
the PR after we find the issue.
2025-02-06 05:19:45 -08:00
Kareem Ergawy
84c3b05e5e
[OpenMP][flang][MLIR] Decouple alloc, init, and copy regions for omp.private|declare_reduction ops (#125699)
This PR changes the emitted block structure of alloc, init, and copy
regions for `omp.private` and `omp.declare_reduction` ops a little bit.
In particular, this decouples init and copy regions from the alloca
insertion-point. The main motivation is fix "Instruction does not
dominate all uses!" errors that happen specially when an init region
uses a value from the OpenMP region it is being inlined into. The issue
happens because, previous to this PR, we inline the init region right
after the latest alloc block (since we used the alloca IP); which in
some cases (see exmaple below), is too early and causes the use
dominance issue.

Example that would break without this PR (when delayed privatization is
enabled for `omp.wsloop`s):
```fortran
subroutine test2 (xyz)
  integer :: i
  integer :: xyz(:)

  !$omp target map(from:xyz)
    !$omp do private(xyz)
      do i = 1, 10
        xyz(i) = i
      end do
  !$omp end target
end subroutine
```
2025-02-06 11:45:40 +01:00
Matthias Springer
8c2b4aa5a0
[mlir][LLVM][NFC] Fix description of LLVMFixedVectorType (#126031) 2025-02-06 10:37:32 +01:00
Andrzej Warzyński
78f690bba7
[mlir][Vector] Update VectorEmulateNarrowType.cpp (2/N) (#123527)
This is PR 2 in a series of N patches aimed at improving
"VectorEmulateNarrowType.cpp". This is mainly minor refactoring, no
major functional changes are made/added.

**CHANGE 1** 

Renames the variable "scale". Note, "scale" could mean either:

  * "container-elements-per-emulated-type", or
  * "emulated-elements-per-container-type".

While from the context it is clear that it's always the former (original
type is always a sub-byte type and the emulated type is usually `i8`),
this PR reduces the cognitive load by making this clear.

**CHANGE 2** 

Replaces `isUnalignedEmulation` with `isFullyAligned`

Note, `isUnalignedEmulation` is always computed following a
"per-element-alignment" condition:
```cpp
// Check per-element alignment.
if (containerBits % emulatedBits != 0) {
  return rewriter.notifyMatchFailure(
    op, "impossible to pack emulated elements into container elements "
    "(bit-wise misalignment)");
}

// (...)

bool isUnalignedEmulation = origElements % emulatedPerContainerElem != 0;
```

Given that `isUnalignedEmulation` captures only one of two conditions
required for "full alignment", it should be re-named as
`isPartiallyUnalignedEmulation`. Instead, I've flipped the condition and
renamed it as `isFullyAligned`:

```cpp
bool isFullyAligned = origElements % emulatedPerContainerElem == 0;
```

**CHANGE 3**
  * Unifies various comments throughout the file (for consistency).
* Adds new comments throughout the file and adds TODOs where high-level
    comments are missing.
    
    
**GitHub issue to track this work**:
https://github.com/llvm/llvm-project/issues/123630
2025-02-06 09:19:18 +00:00
Andrzej Warzyński
80fd902573
[mlir][tensor] Introduce TensorRelayoutOpInterface (#125823)
The newly introduced `TensorRelayoutOpInterface` is created specifically
for `tensor.pack` + `tensor.unpack`. Although the interface is
currently empty, it enables us to refactor the logic in
`FoldTensorCastProducerOp` within the Tensor dialect as follows:

```cpp
// OLD
// Reject tensor::PackOp - there's dedicated pattern for that instead.
if (!foldTensorCastPrecondition(op) ||
    isa<tensor::PackOp, tensor::UnPackOp>(*op))
  return failure();
```

is replaced with:

```cpp
// NEW
// Reject tensor::PackOp - there's dedicated pattern for that instead.
if (!foldTensorCastPrecondition(op) ||
    isa<tensor::RelayoutOpInterface>(*op))
  return failure();
```

This will be crucial once `tensor.pack` + `tensor.pack` are replaced
with `linalg.pack` + `linalg.unpack` (i.e. moved to Linalg):
  * https://github.com/llvm/llvm-project/pull/123902,
  * https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg/.

Note that the interface itself will later be moved to the Linalg
dialect. This decoupling ensures that the Tensor dialect does not
require an understanding of Linalg ops, thus keeping the dependency
lightweight.

This PR is effectively a preparatory step for moving PackOp and UnpackOp
to Linalg. Once that's completed, most CMake changes from this PR will
be effectively reverted.
2025-02-06 09:18:13 +00:00
Michael Buch
eb8901bda1
[llvm][DebugInfo] Add new DW_AT_APPLE_enum_kind to encode enum_extensibility (#124752)
When creating `EnumDecl`s from DWARF for Objective-C `NS_ENUM`s, the
Swift compiler tries to figure out if it should perform "swiftification"
of that enum (which involves renaming the enumerator cases, etc.). The
heuristics by which it determines whether we want to swiftify an enum is
by checking the `enum_extensibility` attribute (because that's what
`NS_ENUM` pretty much are). Currently LLDB fails to attach the
`EnumExtensibilityAttr` to `EnumDecl`s it creates (because there's not
enough info in DWARF to derive it), which means we have to fall back to
re-building Swift modules on-the-fly, slowing down expression evaluation
substantially. This happens around
4b3931c8ce/lib/ClangImporter/ImportEnumInfo.cpp (L37-L59)

To speed up Swift exression evaluation, this patch proposes encoding the
C/C++/Objective-C `enum_extensibility` attribute in DWARF via a new
`DW_AT_APPLE_ENUM_KIND`. This would currently be only used from the LLDB
Swift plugin. But may be of interest to other language plugins as well
(though I haven't come up with a concrete use-case for it outside of
Swift).

I'm open to naming suggestions of the various new attributes/attribute
constants proposed here. I tried to be as generic as possible if we
wanted to extend it to other kinds of enum properties (e.g., flag
enums).

The new attribute would look as follows:
```
DW_TAG_enumeration_type
  DW_AT_type      (0x0000003a "unsigned int")
  DW_AT_APPLE_enum_kind   (DW_APPLE_ENUM_KIND_Closed)
  DW_AT_name      ("ClosedEnum")
  DW_AT_byte_size (0x04)
  DW_AT_decl_file ("enum.c")
  DW_AT_decl_line (23)

DW_TAG_enumeration_type
  DW_AT_type      (0x0000003a "unsigned int")
  DW_AT_APPLE_enum_kind   (DW_APPLE_ENUM_KIND_Open)
  DW_AT_name      ("OpenEnum")
  DW_AT_byte_size (0x04)
  DW_AT_decl_file ("enum.c")
  DW_AT_decl_line (27)
```
Absence of the attribute means the extensibility of the enum is unknown
and abides by whatever the language rules of that CU dictate.

This does feel like a big hammer for quite a specific use-case, so I'm
happy to discuss alternatives.

Alternatives considered:
* Re-using an existing DWARF attribute to express extensibility. E.g., a
`DW_TAG_enumeration_type` could have a `DW_AT_count` or
`DW_AT_upper_bound` indicating the number of enumerators, which could
imply closed-ness. I felt like a dedicated attribute (which could be
generalized further) seemed more applicable. But I'm open to re-using
existing attributes.
* Encoding the entire attribute string (i.e., `DW_TAG_LLVM_annotation
("enum_extensibility((open))")`) on the `DW_TAG_enumeration_type`. Then
in LLDB somehow parse that out into a `EnumExtensibilityAttr`. I haven't
found a great API in Clang to parse arbitrary strings into AST nodes
(the ones I've found required fully formed C++ constructs). Though if
someone knows of a good way to do this, happy to consider that too.
2025-02-06 08:58:35 +00:00
William Moses
6d81a14873
[MLIR][LLVM] Fold extract of extract (#125980)
Co-authored-by: Ivan R. Ivanov <ivanov.i.aa@m.titech.ac.jp>
2025-02-06 01:05:34 -06:00
Bruno Cardoso Lopes
4fb96f203e
[MLIR][LLVM] Implement LLVM dialect support for global aliases (#125295)
This includes support for module translation, module import and add tests for both.

Fix https://github.com/llvm/llvm-project/issues/115390
ClangIR cannot currently lower global aliases to LLVM because of missing support for this.
2025-02-05 18:19:36 -08:00
Paul Carabas
df1bee03de
[mlir] Add math to LLVM lowering support for missing trigonometric & hyperbolic ops (#125753)
The patch adds support for math -> LLVM dialect lowering for TanOp,
Sinh, Cosh, Tanh
2025-02-05 16:02:29 -08:00
Abid Qadeer
5f7acf7259
[flang][OMPIRbuilder] Set debug loc on terminator created by splitBB. (#125897)
Fixes #125088.

When splitBB is called with createBranch=true, it creates a branch
instruction in the old block. But no debug loc is set on that branch
instruction. If that is used as InsertPoint in the restoreIP, it has the
potential to set the current debug location to null and subsequent
instruction will come out without a debug location. This caused the
verification check to fail as shown in the bug report.

This PR changes splitBB and spliceBB function to also take a debugLoc
parameter which can be used to set the debug location of the branch
instruction.
2025-02-05 22:35:43 +00:00
Ivan Butygin
6e52a12811
[mlir][vector] Create VectorToLLVMDialectInterface (#121440)
Create `VectorToLLVMDialectInterface` which allows automatic conversion
discovery by generic `--convert-to-llvm` pass. This only covers final
dialect conversion step and not any previous preparation steps. Also,
currently there is no way to pass any additional parameters through this
conversion interface, but most users using default parameters anyway.
2025-02-05 23:21:25 +03:00
Nikhil Kalra
113534d12e
[MLIR] Fix nanobind linker args on macOS (#125733)
Set the "-U,_PyClassMethod_New" linker flag on the Python library itself
isntead of on the CMAKE_MODULE_LINKER_FLAGS property.
2025-02-05 11:49:11 -08:00
Nikhil Kalra
4e14b8afb4
[mlir] Python: Parse ModuleOp from file path (#125736)
For extremely large models, it may be inefficient to load the model into
memory in Python prior to passing it to the MLIR C APIs for
deserialization. This change adds an API to parse a ModuleOp directly
from a file path.
2025-02-05 11:48:37 -08:00
Nikhil Kalra
0ad1f8369c
[mlir] Python: Extend print large elements limit to resources (#125738)
If the large element limit is specified, large elements are hidden from
the asm but large resources are not. This change extends the large
elements limit to apply to printed resources as well.
2025-02-05 11:48:11 -08:00
David Spickett
ebd23f25c8
[mlir][CMake] Fix dependency on MLIRTestDialect in Transforms tests (#125894)
Another follow up fix to
https://github.com/llvm/llvm-project/pull/123910 to fix a build failure
that sometimes happens in shared library builds:
https://lab.llvm.org/buildbot/#/builders/50/builds/9724

In file included from
/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/mlir/test/lib/Transforms/TestInlining.cpp:16:
/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/mlir/test/lib/Transforms/../Dialect/Test/TestOps.h:148:10:
fatal error: 'TestOps.h.inc' file not found
  148 | #include "TestOps.h.inc"
      |          ^~~~~~~~~~~~~~~
1 error generated.
2025-02-05 17:49:52 +00:00
Guray Ozen
dd099e9cc2
[MLIR][NVVM] Fix links in OP definition (#125865) 2025-02-05 16:18:04 +01:00
Abid Qadeer
e151b1d1f6
[MLIR][OpenMP] Use correct DebugLoc in target construct callbacks. (#125856)
This is same as PR #125106 which somehow is stuck in a "Processing
Update" loop for many hours now. I am going to close that one and push
this one instead.

While working on https://github.com/llvm/llvm-project/issues/125088, I
noticed a problem with the TargetBodyGenCallbackTy and
TargetGenArgAccessorsCallbackTy. The OMPIRBuilder and MLIR side Both
maintain their own IRBuilder and when control goes from one to other, we
have to take care to not use a stale debug location. The code currently
rely on restoreIP to set the insertion point and the debug location. But
if the passes InsertPointTy has an empty block, then the debug location
will not be updated (see SetInsertPoint). This can cause invalid debug
location to be attached to instruction and the verifier will complain.

Similarly when we exit the callback, the debug location of the Builder
is not set to what it was before the callback. This again can cause
verification failures.

This PR resets the debug location at the start and also uses an
InsertPointGuard to restore the debug location at exit.

Both of these problems would have been caught by the unit tests but they
were not setting the debug location of the builder before calling the
createTarget so the problem was hidden. I have updated the tests
accordingly.
2025-02-05 14:59:37 +00:00
Anchu Rajendran S
ccd92ec4c6
[flang][openmp] Changes for invoking scan Op (#123254) 2025-02-05 06:55:32 -08:00
Andrea Faulds
5df62bdc9b
[mlir][spirv] Support poison index when converting vector.insert/extract (#125560)
This modifies the conversion patterns so that, in the case where the
index is known statically to be poison, the insertion/extraction is
replaced by an arbitrary junk constant value, and in the dynamic case,
the index is sanitized at runtime. This avoids triggering a UB in both
cases. The dynamic case is definitely a pessimisation of the generated
code, but the use of dynamic indexes is expected to be very rare and
already slow on real-world GPU compilers ingesting SPIR-V, so the impact
should be negligible.

Resolves #124162.
2025-02-05 14:27:35 +01:00
Zichen Lu
a61ca99de2
[mlir] fix overflow warning when generating embedded libdevice (#125801)
When building mlir with `-DMLIR_NVVM_EMBED_LIBDEVICE=ON`, there will be
a warning
```
build/tools/mlir/lib/Target/LLVM/libdevice_embedded.c:1: warning: overflow in conversion from ‘int’ to ‘char’ changes value from ‘143’ to ‘-113’ [-Woverflow]
```
which is followed by a large number of characters in stdout.

Fix this to avoid stdout outputting a large number of characters (3e5).
2025-02-05 12:41:11 +01:00
Guray Ozen
baf27862dd
[MLIR][NVGPU] Move max threads/blocks size to dialect (NFC) (#124454)
This PR moves maximum number of threads in a block and block in a grid
to nvgpu dialect to avoid replicated code.

The limits are defined here:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/#features-and-technical-specifications-technical-specifications-per-compute-capability
2025-02-05 12:38:37 +01:00
Jack Frankland
f0b8ff1251
[mlir][tosa] Remove Quantization Attribute (#125479)
Removed the TOSA quantization attribute used in various MLIR TOSA
dialect operations in favour of using builtin attributes.

Update any lit tests, conversions and transformations appropriately.

Signed-off-by: Tai Ly <tai.ly@arm.com>
Co-authored-by: Tai Ly <tai.ly@arm.com>
2025-02-05 11:27:17 +00:00