20291 Commits

Author SHA1 Message Date
Johannes Reifferscheid
528a662d3a
Fix sign of largest known divisor of div. (#100081)
There's a missing abs, so it returns a negative value if the divisor is
negative. Later this is then cast to uint.
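
A minimal sketch of why the missing `abs` matters. It emulates casting a signed 64-bit value to an unsigned one; the helper name `to_uint64` and the 64-bit width are assumptions for illustration, not taken from the patch.

```python
def to_uint64(value: int) -> int:
    """Emulate a C++ cast of a signed 64-bit integer to uint64_t (hypothetical helper)."""
    return value & 0xFFFFFFFFFFFFFFFF


divisor = -4
# Without abs: the negative divisor wraps around to a huge unsigned value.
wrong = to_uint64(divisor)
# With abs: the magnitude of the divisor is preserved.
right = to_uint64(abs(divisor))
```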
2024-07-23 10:55:32 +02:00
Tobias Gysi
5da4310082
[MLIR][LLVM] Always print variadic callee type (#99293)
This commit updates the LLVM dialect CallOp and InvokeOp to always print
the variadic callee type (previously callee type) if present. An
additional verifier checks that only variadic calls have a non-null
variadic callee type, and the builders are adapted accordingly to set
the variadic callee type for variadic calls only. Finally, the CallOp
and InvokeOp verifiers are strengthened to check that the variadic
callee type matches the call argument and result types.

The motivation of this change is that CallOp and InvokeOp don't have
hidden state that is not pretty printed, but used during the export to
LLVM IR. Previously, it could happen that a call looked correct in MLIR,
but the return type changed after exporting to LLVM IR (since it has
been taken from the hidden callee type attribute). After landing this
change, this is not possible anymore since the variadic callee type is
always printed if present.
2024-07-23 08:21:46 +02:00
Alexander Pivovarov
019136e30f
[MLIR] Add f8E4M3 IEEE 754 type (#97118)
This PR adds `f8E4M3` type to mlir.

The `f8E4M3` type follows the IEEE 754 convention:

```c
f8E4M3 (IEEE 754)
- Exponent bias: 7
- Maximum stored exponent value: 14 (binary 1110)
- Maximum unbiased exponent value: 14 - 7 = 7
- Minimum stored exponent value: 1 (binary 0001)
- Minimum unbiased exponent value: 1 − 7 = −6
- Precision specifies the total number of bits used for the significand (mantissa),
    including implicit leading integer bit = 3 + 1 = 4
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs

Additional details:
- Max exp (unbiased): 7
- Min exp (unbiased): -6
- Infinities (+/-): S.1111.000
- Zeros (+/-): S.0000.000
- NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111}
- Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240
- Min normal number: S.0001.000 = +/-2^(-6)
- Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7
- Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9)
```
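
The layout above can be checked with a small decoder. This is a sketch that follows the listed rules (bias 7, `S.EEEE.MMM` bit order); the function name `decode_f8e4m3` is hypothetical and is not part of MLIR or APFloat.

```python
import math


def decode_f8e4m3(bits: int) -> float:
    """Decode an 8-bit pattern laid out as S.EEEE.MMM with exponent bias 7."""
    sign = -1.0 if (bits >> 7) & 1 else 1.0
    exponent = (bits >> 3) & 0xF
    mantissa = bits & 0x7
    if exponent == 0xF:
        # All-ones exponent: infinity if the mantissa is zero, NaN otherwise.
        return sign * math.inf if mantissa == 0 else math.nan
    if exponent == 0:
        # Zero and subnormals: no implicit leading bit, exponent fixed at -6.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    # Normal numbers: implicit leading 1 and a biased exponent.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)
```

For example, `decode_f8e4m3(0b01110111)` evaluates the "max normal" pattern `S.1110.111` from the list above.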

Related PRs:
- [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat]
Add support for f8E4M3 IEEE 754 type
2024-07-22 23:20:28 -07:00
donald chen
9cc11b98a7
[mlir] [linalg] Add pattern to swap transpose with broadcast (#97063)
Add a pattern that implements:

  transpose(broadcast(input)) -> broadcast(transpose(input))
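
A minimal sketch of why the two forms agree, using plain nested lists for the simplest case (a 1-D input broadcast to 2-D). The helper names are hypothetical and this does not model the `linalg` ops' attributes.

```python
def broadcast_as_rows(vec, count):
    """Broadcast a 1-D list to shape (count, len(vec)) by adding a leading dimension."""
    return [list(vec) for _ in range(count)]


def broadcast_as_columns(vec, count):
    """Broadcast a 1-D list to shape (len(vec), count) by adding a trailing dimension."""
    return [[x] * count for x in vec]


def transpose(matrix):
    """Swap the two dimensions of a nested-list matrix."""
    return [list(column) for column in zip(*matrix)]


vec = [1, 2, 3]
# transpose(broadcast(input))
before = transpose(broadcast_as_rows(vec, 2))
# broadcast(transpose(input)); transposing a 1-D input is a no-op
after = broadcast_as_columns(vec, 2)
```

In this case the transpose disappears entirely, since the broadcast simply adds the new dimension in a different position.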
2024-07-23 12:52:25 +08:00
Angel Zhang
db1d881372
[mlir][spirv] Fix bug for vector.broadcast op in convert-vector-to-spirv pass (#99928)
This PR addresses
[!17976](https://github.com/iree-org/iree/issues/17976) by using
converted `resultType` instead of the original result type obtained from
`castOp.getResultVectorType`. A new LIT test is also included.
2024-07-22 18:57:11 -04:00
Jacques Pienaar
d2f42c7372
[mlir] Add unit test for RankedTensorType wrapper example. (#99789)
Add example as unit test for creating a wrapper type/view for
RankedTensorType with encoding. This view provides a more restricted &
typed API while it allows one to avoid repeated casting queries and
accessing the encoding directly.

For users with more advanced encodings, the expectation would be a
separate attribute type, but here just StringAttr is used.
2024-07-22 13:16:22 -07:00
Pranav Bhandarkar
d7e185cca9
[OMPIRBuilder] - Handle dependencies in createTarget (#93977)
This patch handles dependencies specified by the `depend` clause on an
OpenMP target construct. It does this much the same way clang does it by
materializing an OpenMP `task` that is tagged with the dependencies.

The following functions are relevant to this patch -
1) `createTarget` - This function itself is largely unchanged except
that it now accepts a vector of `DependData` objects that it simply
forwards to `emitTargetCall`
2) `emitTargetCall` - This function has changed now to check if an outer
target-task needs to be materialized (i.e. if the `target` construct has
`nowait` or has `depend` clause). If yes, it calls `emitTargetTask` to
do all the heavy lifting for creating and dispatching the task.
3) `emitTargetTask` - Bulk of the change is here. See the large comment
explaining what it does at the beginning of this function
2024-07-22 10:56:45 -05:00
Andrzej Warzyński
85e7428562
[mlir][vector] Refactor vector-transfer-flatten.mlir (nfc) (3/n) (#95745)
The main goal of this and subsequent PRs is to unify and categorize
tests in:
  * vector-transfer-flatten.mlir

This should make it easier to identify the edge cases being tested (and
how they differ), remove duplicates and to add tests for scalable
vectors.

The main contributions of this PR:

1. For consistency with other tests,
   `@transfer_read_flattenable_with_dynamic_dims_and_indices` is renamed
   as `@transfer_read_leading_dynamic_dims`. It is also moved near other
   tests for `xfer_read`, and variable names are updated to match other
   `xfer_read` tests.

2. `@transfer_write_dims_mismatch_non_zero_indices_trailing_dynamic_dim`
   is renamed as `@negative_transfer_read_dynamic_dim_to_flatten` to
   better highlight that it's a negative test and to contrast it with
   `@transfer_read_leading_dynamic_dims` (and to emphasise the
   difference between the two).

3. Similar changes for tests for `xfer_write`.

4. Make sure that we consistently use `%idx_N` (as opposed to `%idxN`).

Follow-up for #95743 and #95744
2024-07-22 11:11:52 +01:00
Matthias Springer
36d384b4dd
[mlir][Transforms][NFC] Dialect conversion: Simplify EraseBlockRewrite constructor (#99805)
2024-07-22 09:09:02 +02:00
donald chen
324fea9baa
[mlir][docs] Update documentation for canonicalize. (#99753)
Update canonicalize docs.
2024-07-22 11:10:27 +08:00
Alexander Belyaev
568845ae50
[mlir] Add a ValueSemantics trait. (#99493)
We need to distinguish ShapedTypes with and without value semantics.
This is needed for downstream users to define their custom vector and
tensor types that can work with the arith/math dialect.

RFC https://discourse.llvm.org/t/rfc-mlir-types-with-encoding/80189
2024-07-21 23:08:04 +02:00
Andrzej Warzyński
14a543ea2a
[mlir][test] Add comments in a test (nfc) (#99810)
Documents which patterns are tested in:
  * vector-transfer-collapse-inner-most-dims.mlir.
2024-07-21 17:44:12 +01:00
Fangrui Song
b8220b986d
[MC] Remove unused bool arguments from createMCObjectStreamer callers
2024-07-20 21:27:36 -07:00
Kevin Gleason
d1578848e9
Add logging for emit functions in BytecodeWriter.cpp (#99558)
Recently there was a change to materializing unrealized conversion
casts, which inserted conversions that previously did not exist during
legalization (https://github.com/llvm/llvm-project/pull/97903). After
these casts are inserted and then washed away once the transformation
completes, the use-list ordering of an op changed in some
cases: `my.add %arg0(use1), %arg0(use2) --> my.add %arg0(use2),
%arg0(use1)`. This subtly changes the bytecode emitted, since this is
considered a custom use-list.

When investigating why the bytecode had changed, I added the following
logging, which helped track down the difference. In my case it showed
extra bytes with "use-list section". Running with
`-debug-only=mlir-bytecode-writer` emits logs like the following,
detailing the source of written bytes:

```
emitBytes(4b)	bytecode header
emitVarInt(6)	bytecode version
emitByte(13)	bytecode version
emitBytes(17b)	bytecode producer
emitByte(0)	null terminator
emitVarInt(2)	dialects count
...
emitByte(5)	dialect version
emitVarInt(4)	op names count
emitByte(9)	op names count
emitVarInt(0)	dialect number
...
emitVarInt(2)	dialect writer
emitByte(5)	dialect writer
emitVarInt(9259963783827161088)	dialect APInt
...
emitVarInt(3)	attr/type offset
emitByte(7)	attr/type offset
emitByte(3)	section code
emitVarInt(18)	section size
...
```

Note: this uses string constants and `StringLiteral`. I'm not sure if
these are washed away during compilation / OK to have these around for
debugging, or if there's a better way to do this? The alternative was adding
many braces and `LLVM_DEBUG` calls at each callsite, but this felt more
error prone / likely to miss some callsites.
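
The approach described above (passing a description into each emit function instead of logging at every callsite) can be sketched as follows. The class and method names are hypothetical and do not mirror `BytecodeWriter.cpp`. The single-byte varint form `(value << 1) | 1` matches the pairs in the log (6 followed by 13, 2 followed by 5); larger values are deliberately left out of the sketch.

```python
class LoggingWriter:
    """Toy byte writer whose emit functions log what they write and why."""

    def __init__(self):
        self.data = bytearray()
        self.log = []

    def emit_byte(self, value: int, desc: str) -> None:
        self.log.append(f"emitByte({value})\t{desc}")
        self.data.append(value)

    def emit_var_int(self, value: int, desc: str) -> None:
        self.log.append(f"emitVarInt({value})\t{desc}")
        if value >= 1 << 7:
            raise NotImplementedError("sketch only covers single-byte varints")
        # Single-byte form: the value shifted left by one with the low bit set.
        self.emit_byte((value << 1) | 1, desc)


writer = LoggingWriter()
writer.emit_var_int(6, "bytecode version")
```

Because the description travels with the call, every path that reaches `emit_byte` is logged, which is the property the note above is after.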
2024-07-20 12:57:50 -05:00
Matthias Springer
bbd4af5da2
[mlir][Transforms] Dialect conversion: Simplify handling of dropped arguments (#97213)
This commit simplifies the handling of dropped arguments and updates
some dialect conversion documentation that is outdated.

When converting a block signature, a `BlockTypeConversionRewrite` object
and potentially multiple `ReplaceBlockArgRewrite` are created. During
the "commit" phase, uses of the old block arguments are replaced with
the new block arguments, but the old implementation was written in an
inconsistent way: some block arguments were replaced in
`BlockTypeConversionRewrite::commit` and some were replaced in
`ReplaceBlockArgRewrite::commit`. The new
`BlockTypeConversionRewrite::commit` implementation is much simpler and
no longer modifies any IR; that is done only in `ReplaceBlockArgRewrite`
now. The `ConvertedArgInfo` data structure is no longer needed.

To that end, materializations of dropped arguments are now built in
`applySignatureConversion` instead of `materializeLiveConversions`; the
latter function no longer has to deal with dropped arguments.

Other minor improvements:
- Add more comments to `applySignatureConversion`.

Note: Error messages around failed materializations for dropped basic
block arguments changed slightly. That is because those materializations
are now built in `legalizeUnresolvedMaterialization` instead of
`legalizeConvertedArgumentTypes`.

This commit is in preparation of decoupling argument/source/target
materializations from the dialect conversion.

This is a re-upload of #96207.
2024-07-20 10:12:13 +02:00
Angel Zhang
2b371003d1
[mlir][docs] Update documentation for vector.multi_reduction. NFC (#99668)
2024-07-19 17:12:33 -04:00
Guillermo Callaghan
d1ca1d017f
[mlir] Makes zip_shortest an optional keyword in transform.foreach (#98492)
This PR addresses a [comment] made by @ftynse about the syntax for
`ForeachOp`. The syntax was modified by @muneebkhan85 in #82792, where
the attribute dictionary was moved to the middle.
This patch moves it back to its original place at the end. And
introduces an optional keyword for `zip_shortest`.

[comment]:
https://github.com/llvm/llvm-project/pull/82792#pullrequestreview-2132814144
2024-07-19 17:36:12 +02:00
Dhruv Chauhan
9d86722eee
[TOSA] Add lowering for tosa.sin and tosa.cos (#99651)
Lower tosa ops to `mlir::math::SinOp` and `mlir::math::CosOp` as part of
the tosa to linalg conversion.
Added lit tests for conversion.
2024-07-19 16:11:23 +01:00
Rafael Ubal
38d0b2d174
[mlir] New canonicalization patterns for shape.shape_of and tensor.reshape (#98531)
This PR includes 3 new canonicalization patterns:

- Operation `shape.shape_of`: shape of reshape

```
// Before
func.func @f(%arg0: tensor<*xf32>, %arg1: tensor<?xindex>) -> tensor<?xindex> {
  %reshape = tensor.reshape %arg0(%arg1) : (tensor<*xf32>, tensor<?xindex>) -> tensor<*xf32>
  %0 = shape.shape_of %reshape : tensor<*xf32> -> tensor<?xindex>
  return %0 : tensor<?xindex>
}

// After
func.func @f(%arg0: tensor<*xf32>, %arg1: tensor<?xindex>) -> tensor<?xindex> {
  return %arg1 : tensor<?xindex>
}
```

- Operation `tensor.reshape`: reshape of reshape

```
// Before
func.func @fold_tensor_reshape(%arg0: tensor<*xf32>, %arg1: tensor<?xindex>, %arg2: tensor<?xindex>) -> tensor<*xf32> {
  %0 = tensor.reshape %arg0(%arg1) : (tensor<*xf32>, tensor<?xindex>) -> tensor<*xf32>
  %1 = tensor.reshape %0(%arg2) : (tensor<*xf32>, tensor<?xindex>) -> tensor<*xf32>
  return %1 : tensor<*xf32>
}

// After
func.func @fold_tensor_reshape(%arg0: tensor<*xf32>, %arg1: tensor<?xindex>, %arg2: tensor<?xindex>) -> tensor<*xf32> {
  %reshape = tensor.reshape %arg0(%arg2) : (tensor<*xf32>, tensor<?xindex>) -> tensor<*xf32>
  return %reshape : tensor<*xf32>
}
```

- Operation `tensor.reshape`: reshape 1D to 1D

```
// Before
func.func @fold_reshape_1d(%input: tensor<?xf32>, %shape: tensor<1xindex>) -> tensor<?xf32> {
  %0 = tensor.reshape %input(%shape) : (tensor<?xf32>, tensor<1xindex>) -> tensor<?xf32>
  return %0 : tensor<?xf32>
}

// After
func.func @fold_reshape_1d(%arg0: tensor<?xf32>, %arg1: tensor<1xindex>) -> tensor<?xf32> {
  return %arg0 : tensor<?xf32>
}
```

These three canonicalization patterns cooperate to simplify the IR
structure emerging from the lowering of certain element-wise ops with
unranked tensor inputs. See file `unranked-tensor-lowering.mlir` in the
proposed change list for a detailed example and description.
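
As a rough illustration of how the three patterns compose, here is a toy expression rewriter. The `Value`, `Reshape` and `ShapeOf` classes and the `canonicalize` function are hypothetical stand-ins, not MLIR APIs, and ranks are tracked only as far as the 1-D case needs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Value:
    name: str
    rank: Optional[int] = None  # None models an unranked tensor


@dataclass(frozen=True)
class Reshape:
    source: object
    shape: object
    rank: Optional[int] = None  # rank of the result type


@dataclass(frozen=True)
class ShapeOf:
    source: object


def canonicalize(expr):
    if isinstance(expr, ShapeOf):
        source = canonicalize(expr.source)
        # shape_of(reshape(x, s)) -> s
        return source.shape if isinstance(source, Reshape) else ShapeOf(source)
    if isinstance(expr, Reshape):
        source = canonicalize(expr.source)
        # reshape(reshape(x, s1), s2) -> reshape(x, s2)
        if isinstance(source, Reshape):
            source = source.source
        # 1-D to 1-D reshape -> the source itself
        if expr.rank == 1 and getattr(source, "rank", None) == 1:
            return source
        return Reshape(source, expr.shape, expr.rank)
    return expr
```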

For context, this PR is meant to enable code optimizations for the code
generated while lowering ops `quant.qcast` and `quant.dcast` with
unranked tensors, as proposed in
https://discourse.llvm.org/t/rfc-improvements-in-the-quant-dialect/79942
(implementation currently in progress).
2024-07-19 10:09:31 -04:00
Sergio Afonso
7f763b162f
[MLIR][OpenMP] NFC: Sort clause definitions (#99504)
This patch moves the `filter` clause definition to keep alphabetical
sorting of OpenMPClauses.td.
2024-07-19 10:41:24 +01:00
Corentin Ferry
e6668b1be8
[mlir][tosa] Use roundeven in TOSA cast splat constant op folding (#99484)
The behavior of TOSA Cast operation for floating-point to integers is to round to the nearest even. This commit aligns the behavior of folding a TOSA Cast of a float splat to int, so it also uses roundeven.
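
A short sketch of what round-to-nearest-even means for tie values. The source does not say which rounding the folder used before this change; round-half-away-from-zero is shown here only as a contrast, and both function names are hypothetical.

```python
import math


def round_half_even(x: float) -> int:
    """Round to nearest, ties to even (the behaviour that roundeven names)."""
    return round(x)  # Python's built-in round uses ties-to-even for floats


def round_half_away(x: float) -> int:
    """Round to nearest, ties away from zero (shown only for contrast)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))
```

The two only differ on exact ties such as 2.5 or -2.5, which is why a mismatch between folding and runtime behaviour is easy to miss.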
2024-07-19 09:22:39 +02:00
Andrzej Warzyński
cfe043cf99
[mlir][linalg] Restrict scalable vectorisation (#98639)
Updates `vectorizeScalableVectorPrecondition` so that scalable
vectorisation is only applied in well understood and tested scenarios.

It's unlikely that we would ever want an arbitrary dimension to be
scalable. While the Linalg vectoriser should be flexible enough to
handle all possibilities:
  * in more "exotic" cases, we are likely to struggle with lowerings
    further down the compilation stack,
  * it would be impractical given the limitations of LLVM (which usually
    reflect the limitations of actual hardware) - e.g. no support for
    "scalable" arrays of scalable or fixed width vectors (*).

Ultimately, the goal of this patch is to better document what's
currently supported. While this PR adds some new restrictions, no
existing tests are affected.

(*) At MLIR vector level that would correspond to e.g.
`vector<[4]x8xf32>`.
2024-07-19 08:05:10 +01:00
Chenguang Wang
a41a4b8fed
Revert "[mlir][ArmSME] Suppress potential unused warning (#99573)" (#99578)
This reverts commit 05bce3f079b677edd0efd28e3923f4776ffb8b59.

The work was already done in 99faa03.
2024-07-18 14:59:26 -07:00
Chenguang Wang
05bce3f079
[mlir][ArmSME] Suppress potential unused warning (#99573)
When building in release mode, the assert will be dropped, making
`remove` unused.
2024-07-18 14:55:22 -07:00
Kazu Hirata
99faa038c6
[mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/ArmSME/Transforms/TileAllocation.cpp:621:16: error:
  unused variable 'removed' [-Werror,-Wunused-variable]
2024-07-18 14:41:15 -07:00
Benjamin Maxwell
eed72d4381
[mlir][ArmSME] Support filling liveness 'holes' in the tile allocator (#98350)
Holes in a live range are points where the corresponding value does not
need to be in a tile/register. If the tile allocator keeps track of
these holes it can reuse tiles for more values (avoiding spills).

Take this simple example:

```mlir
func.func @example(%cond: i1) {
  %tileA = arm_sme.get_tile : vector<[4]x[4]xf32>
  cf.cond_br %cond, ^bb2, ^bb1
^bb1:
  // If we end up here we never use %tileA again!
  %tileB = arm_sme.get_tile : vector<[4]x[4]xf32>
  "test.some_use"(%tileB) : (vector<[4]x[4]xf32>) -> ()
  cf.br ^bb3
^bb2:
  "test.some_use"(%tileA) : (vector<[4]x[4]xf32>) -> ()
  cf.br ^bb3
^bb3:
  return
}
```

If you were to calculate the liveness of %tileA and %tileB, you'd see
there is a hole in the liveness of %tileA in bb1:

```
      %tileA  %tileB
^bb0:  Live
^bb1:          Live
^bb2:  Live
```

The tile allocator can make use of that hole and reuse the tile ID it
assigned to %tileA for %tileB.
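
A minimal sketch of the idea, modelling each live range as the set of blocks in which the value is live. This block-level granularity and the greedy strategy are simplifying assumptions; the function names are hypothetical and the real allocator's data structures are not described here.

```python
def can_share_tile(live_a: set, live_b: set) -> bool:
    """Two values can share a tile if they are never live in the same block."""
    return live_a.isdisjoint(live_b)


def allocate_tiles(live_blocks: dict, num_tiles: int) -> dict:
    """Greedily assign each value the first tile with no overlapping live range."""
    assignment = {}
    for value, blocks in live_blocks.items():
        for tile in range(num_tiles):
            users = [v for v, t in assignment.items() if t == tile]
            if all(can_share_tile(blocks, live_blocks[u]) for u in users):
                assignment[value] = tile
                break
        else:
            raise RuntimeError(f"no free tile for {value}; a spill would be needed")
    return assignment


# Liveness from the table above: %tileA has a hole in bb1.
liveness = {"%tileA": {"bb0", "bb2"}, "%tileB": {"bb1"}}
tiles = allocate_tiles(liveness, num_tiles=1)
```

If %tileA were instead treated as live across bb0 to bb2 without the hole, the same call with a single tile would fail to place %tileB.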
2024-07-18 20:13:45 +01:00
Angel Zhang
9527d77aef
[mlir][spirv] Restructure code in SPIRVConversion.cpp. NFC. (#99393)
2024-07-18 14:31:15 -04:00
Uday Bondhugula
fe04aafe6c
[MLIR][Affine] NFC. Expose affine loop tiling validity utility (#99459)
Move the utility to check for the validity of tiling affine loop nests
to affine loop utils and expose for users outside the loop tiling pass
or downstream users.
2024-07-18 22:06:44 +05:30
Johannes Reifferscheid
15495b8cd4
[mlir] Fix unused-variable warning w/o assertions. (#99489)
2024-07-18 15:20:57 +02:00
Hideto Ueno
c0084c36ed
[mlir][BytecodeReader] Const qualify *SectionReader, NFC (#99376)
`StringSectionReader`, `ResourceSectionReader` and
`PropertiesSectionReader` are immutable after `initialize` so this PR
adds const to their parsing functions and references in `AttrTypeReader`
and `DialectReader`.
2024-07-18 19:45:36 +09:00
Uday Bondhugula
ba8e4920ca
[MLIR] NFC. Remove anti-patterns given the default null init for Value (#99457)
Remove anti-patterns given the default null init for Value. Drop some
extra includes while on this file. NFC.

Co-authored-by: GitHub runner <github-runner@polymagelabs.com>
2024-07-18 14:51:09 +05:30
Adrian Kuegel
7b08c2774c
[mlir][Linalg] Remove unused header include.
There seems to be no direct usage of any tosa utils.
2024-07-18 06:35:42 +00:00
Hsiangkai Wang
27ee33d136
[mlir][linalg] Decompose winograd operators (#96183)
Convert Linalg winograd_filter_transform, winograd_input_transform, and
winograd_output_transform into nested loops with matrix multiplication
with constant transform matrices.

Support several configurations of Winograd Conv2D, including F(2, 3),
F(4, 3) and F(2, 5). These configurations show that the implementation
can support different kernel sizes (3 and 5) and different output sizes
(2 and 4). Besides symmetric kernel sizes 3x3 and 5x5, this patch also
supports 1x3, 3x1, 1x5, and 5x1 kernels.

The implementation is based on the paper, Fast Algorithm for
Convolutional Neural Networks. (https://arxiv.org/abs/1509.09308)
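
To make "matrix multiplication with constant transform matrices" concrete, here is a sketch of the 1-D F(2, 3) case using the transform matrices from the cited paper. The patch's own constants, layouts and the 2-D nesting may differ; the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Constant transform matrices for 1-D F(2, 3) as given in the cited paper.
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]


def winograd_f2_3(d, g):
    """Two outputs of a 3-tap filter g over a 4-element input tile d."""
    u = matvec(G, g)    # filter transform
    v = matvec(BT, d)   # input transform
    m = [a * b for a, b in zip(u, v)]  # element-wise product (4 multiplies)
    return matvec(AT, m)  # output transform


def direct(d, g):
    """Reference: direct sliding-window computation (6 multiplies)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

Both functions compute the same two outputs; the Winograd form trades multiplications for additions in the transforms.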

Reviewers: ftynse, Max191, GeorgeARM, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin

Reviewed By: ftynse, Max191

Pull Request: https://github.com/llvm/llvm-project/pull/96183
2024-07-18 06:04:53 +01:00
Jordan Rupprecht
21e6777957
[mlir][NFC] Add rewrite header to fix standalone header compile (#99370)
This uses `MlirRewriterBase` from `mlir-c/Rewrite.h` without
including it.
2024-07-17 13:50:35 -05:00
Angel Zhang
6867e49fc8
[mlir][spirv] Implement vector type legalization for function signatures (#98337)
### Description
This PR implements a minimal version of function signature conversion to
unroll vectors into 1D and with a size supported by SPIR-V (2, 3 or 4
depending on the original dimension). This PR also includes new unit
tests that only check for function signature conversion.

### Future Plans
- Check for capabilities that support vectors of size 8 or 16.
- Set up `OneToNTypeConversion` and `DialectConversion` to replace the
current implementation that uses `GreedyPatternRewriteDriver`.
- Introduce other vector unrolling patterns to cancel out the
`vector.insert_strided_slice` and `vector.extract_strided_slice` ops and
fully legalize the vector types in the function body.
- Handle `func::CallOp` and declarations.
- Restructure the code in `SPIRVConversion.cpp`.
- Create test passes for testing sets of patterns in isolation.
- Optimize the way the original shape is split into target shapes, e.g.
`vector<5xi32>` can be split into `vector<4xi32>` and
`vector<1xi32>`.

---------

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2024-07-17 13:09:15 -04:00
MaheshRavishankar
c077a4f305
[mlir][Tensor] Add pattern to fold concats of empty. (#98994)
A concatenation of empty tensors can be replaced by a single empty
tensor of the concatenated shape. Add this pattern to
`populateFoldTensorEmptyPatterns`.
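
A sketch of the shape computation behind this fold, for static shapes only. The function name `concat_shape` is hypothetical, and dynamic dimensions are not modelled.

```python
def concat_shape(shapes, dim):
    """Shape of concatenating tensors with the given static shapes along `dim`."""
    result = list(shapes[0])
    for shape in shapes[1:]:
        if len(shape) != len(result):
            raise ValueError("all operands must have the same rank")
        for i, (a, b) in enumerate(zip(result, shape)):
            if i != dim and a != b:
                raise ValueError("non-concatenated dimensions must match")
        result[dim] += shape[dim]
    return result
```

Since empty tensors carry no data, a single empty tensor of this shape stands in for the whole concatenation.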
2024-07-17 09:51:00 -07:00
Giuseppe Rossini
c63125d453
[mlir] Fix block merging (#97697)
With this PR I am trying to address:
https://github.com/llvm/llvm-project/issues/63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operation operands in the block will
point to the argument we already inserted. This needs to happen to all
the arguments we pass to the different successors of the parent block
- After merging the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid. The solution I found is to do a more prudent
simplification when running that pass.
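
The second step in the list above (dropping arguments that every predecessor passes identically) can be sketched as follows. The function name and the list-of-lists representation are assumptions for illustration, and dominance of the forwarded value is not checked here.

```python
def prune_redundant_args(pred_operands):
    """Split block arguments into those to keep and those to forward directly.

    pred_operands[p][i] is the value predecessor p passes for block argument i.
    Returns (kept_indices, forwarded), where forwarded maps an argument index
    to the single value that all predecessors pass for it.
    """
    num_args = len(pred_operands[0])
    kept, forwarded = [], {}
    for i in range(num_args):
        values = {operands[i] for operands in pred_operands}
        if len(values) == 1:
            # Every predecessor passes the same value: use it directly.
            forwarded[i] = values.pop()
        else:
            kept.append(i)
    return kept, forwarded
```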

**Note**: this is a rework of #96871. I ran all the integration tests
(`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.
2024-07-17 17:05:40 +01:00
Jan Leyonberg
3fae5551de
[MLIR][ROCDL] Refactor conversion of math operations to ROCDL calls to a separate pass (#98653)
This patch refactors the conversion of math operations to ROCDL library
calls. This pass will also be used in flang to lower Fortran
intrinsics/math functions for OpenMP target offloading codegen.
2024-07-17 09:33:04 -04:00
Finlay
aa21ee7926
[MLIR] Add attributes no_unwind and will_return to the LLVMIR dialect (#98921)
And testing.
These are being added to be used in the GPU to LLVM SPV pass.

---------

Co-authored-by: Victor Perez <victor.perez@codeplay.com>
2024-07-17 11:26:25 +02:00
Hugo Trachino
de61875e9d
[MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leading / trailing dimensions. (#98455)
Generalizes DropUnitDimFromElementwiseOps to support inner unit
dimensions.
This change stems from improving lowering of contractionOps for Arm SME,
where we end up with inner unit dimensions on MulOp, BroadcastOp and
TransposeOp, preventing the generation of outerproducts. This was
discussed
[here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).

Fix-up: https://github.com/llvm/llvm-project/pull/97652 showed an
unhandled edge case when all dimensions are one. The generated target
VectorType would be `vector<f32>`, which is apparently not supported by
the mulf.
In case all dimensions are dropped, the target VectorType is now
`vector<1xf32>`.
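
A sketch of the shape rule described above, for fixed-size dimensions only. The function name `drop_unit_dims` is hypothetical, and scalable dimensions are not modelled.

```python
def drop_unit_dims(shape):
    """Drop all unit dimensions, keeping a single 1 if nothing else remains."""
    kept = [dim for dim in shape if dim != 1]
    # Edge case from the text: avoid a rank-0 vector when every dim is 1.
    return kept if kept else [1]
```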

---------

Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
2024-07-17 10:22:25 +01:00
Guray Ozen
f2251f93ab
[mlir][gpu] Add mlir_c_runner_utils to fix #99035
This fixes the unit test that is broken in #99035.
2024-07-17 09:23:32 +02:00
Guray Ozen
20861f1f2f
[mlir][gpu] Use alloc OP's host_shared in cuda runtime (#99035)
2024-07-17 07:25:11 +02:00
Jie Fu
ffbda47159
[mlir] Fix build error (NFC)
/llvm-project/mlir/include/mlir/CAPI/Rewrite.h:21:63:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
DEFINE_C_API_PTR_METHODS(MlirRewriterBase, mlir::RewriterBase);
                                                              ^
1 error generated.
2024-07-17 07:13:16 +08:00
Fehr Mathieu
5f8c46b887
[mlir] Add RewriterBase to the C API (#98962)
This exposes most of the `RewriterBase` methods to the C API.
This allows manipulating both the `IRRewriter` and the
`PatternRewriter`. The `IRRewriter` can be created from the C API,
while the `PatternRewriter` cannot.

The missing operations are the ones taking `Block::iterator` and
`Region::iterator` as parameters, as they are not exposed by the
C API yet AFAIK.

The Python bindings for these methods and classes are not implemented.
2024-07-16 20:37:11 +01:00
Andrzej Warzyński
2ee5586ac7
[mlir][vector] Make the in_bounds attribute mandatory (#97049)
At the moment, the in_bounds attribute has two confusing/contradicting
properties:
  1. It is both optional _and_ has an effective default-value.
  2. The default value is "out-of-bounds" for non-broadcast dims, and
     "in-bounds" for broadcast dims.

(see the `isDimInBounds` vector interface method for an example of this
"default" behaviour [1]).

This PR aims to clarify the logic surrounding the `in_bounds` attribute
by:
  * making the attribute mandatory (i.e. it is always present),
  * always setting the default value to "out of bounds" (that's
    consistent with the current behaviour for the most common cases).
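
The old and new lookup rules can be contrasted with a small sketch. The function names and the list-based representation (one flag per vector dimension, plus a per-dimension "is broadcast" flag) are assumptions for illustration and do not mirror `isDimInBounds`.

```python
def old_is_dim_in_bounds(in_bounds, is_broadcast, dim):
    """Old rule: the attribute is optional and the default depends on the dim."""
    if in_bounds is not None:
        return in_bounds[dim]
    # Attribute absent: broadcast dims count as in-bounds, others do not.
    return is_broadcast[dim]


def new_is_dim_in_bounds(in_bounds, dim):
    """New rule: the attribute is always present, so just read it."""
    return in_bounds[dim]


def broadcast_dims_are_in_bounds(in_bounds, is_broadcast):
    """Check that every broadcast dim is explicitly marked in-bounds."""
    return all(flag for flag, bcast in zip(in_bounds, is_broadcast) if bcast)
```

With the attribute mandatory, a map whose first result is a broadcast needs `in_bounds` to say so explicitly for that dimension.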

#### Broadcast dimensions in tests

As per [2], the broadcast dimensions requires the corresponding
`in_bounds` attribute to be `true`:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):
```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

Instead, the broadcast dimension has to explicitly be marked as "in
bounds":

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

All tests with broadcast dims are updated accordingly.

#### Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The following patterns in "Vectorization.cpp" are updated to explicitly
set the `in_bounds` attribute to `false`:
* `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and
`vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain.

Note that there is no mechanism to verify whether the corresponding
memory accesses are indeed in bounds. Still, this is consistent with the
current behaviour where the broadcast dim would be implicitly assumed
to be "in bounds".

[1]
4145ad2bac/mlir/include/mlir/Interfaces/VectorInterfaces.td (L243-L246)
[2]
https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
2024-07-16 16:49:52 +01:00
Alexis Perry-Holby
f1d3fe7aae
Add basic -mtune support (#98517)
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043
2024-07-16 16:48:24 +01:00
Andrzej Warzyński
4ecb5383e1
[mlir][vector] Update tests for collapse 6/n (nfc) (#98902)
The main goal of this PR (and subsequent PRs), is to add more tests with
scalable vectors to:
  * vector-transfer-collapse-inner-most-dims.mlir

There's quite a few cases to consider, hence this is split into multiple
PRs.

In this PR, I am making the following changes:
* All input memrefs for `xfer_read` are renamed as `%src`.
* All input memrefs for `xfer_write` are renamed as `%dest`.
* All variables representing pad values for `xfer_read` are renamed as
  `%pad`.
* All vector variables (for `xfer_read` and `xfer_write`) are renamed as
  `%v`.
* Add `@contiguous_inner_most_non_zero_idx_in_bounds_scalable` for
  `xfer_read` (similar test already exists for `xfer_write`)
* All index variables are renamed as `%i` (1st index) and `%ii` (2nd
  index).

The above were marked as TODOs in the test file - these are now
resolved. In addition (to avoid sending another PR):
* `@drop_inner_most_dim` is deleted - it duplicates
  `@contiguous_inner_most` for xfer_write
* For consistency with other negative tests, `@non_unit_strides` is
  renamed as `@negative_non_unit_strides` and a similar test is added
  for `xfer_read`.

This is a follow-up for: #94490, #94604, #94906, #96214, #96227
2024-07-16 16:41:57 +01:00
Fabian Mora
16dd75b41c
[mlir][ROCDL] Construct AMDGCN ISA control variable explicitly (#98912)
This patch constructs the AMDGCN ISA control variable explicitly instead
of linking against the library shipped with ROCm. This change prevents
issues arising from the order in which the AMDGCN libraries are linked.
2024-07-16 09:16:00 -05:00
Alex Bradbury
fdf94e1632
Reapply "[Intrinsics][PreISelInstrinsicLowering] llvm.memcpy.inline length no longer needs to be constant (#98281)"
This reverts commit ac4b6b662630cd4d3bf6929f2b39ea203c0054a1.

A test change was missing for
mlir/test/Target/LLVMIR/llvmir-intrinsics.mlir in the initial commit.
2024-07-16 14:48:59 +01:00
Tom Natan
f091848504
Add support for enum doc gen (#98885)
2024-07-16 10:18:49 +02:00