Existing implementation of the LLVM type converter for LLVM structs
containing incompatible types was attempting to change identifiers of
the struct in case of name clash post-conversion (all identified structs
have different names post-conversion since one cannot change the body of
the struct once initialized). Beyond a trivial error of not updating the
counter in renaming, this approach was broken for recursive structs that
can't be made aware of the renaming and would use the pre-existing
struct with clashing name instead.
For example, given
`!llvm.struct<"_Converted.foo", (struct<"_Converted.foo">, f32)>`
the following type
`!llvm.struct<"foo", (struct<"foo", index>)>`
would incorrectly convert to
```
!llvm.struct<"_Converted_1.foo",
(struct<"_Converted.foo",
(struct<"_Converted.foo">, f32)>)>
```
Remove this incorrect renaming and simply refuse to convert types if it
would lead to identifier clashes for structs with different bodies.
Document the expectation that such generated names are reserved and must
not be present in the input IR of the converter. If we ever actually
need to use handle such cases, this can be achieved by temporarily
renaming structs with reserved identifiers to an unreserved name and
back in a pre/post-processing pass that does _not_ use the type
conversion infra.
Data layout queries may be issued for types whose size exceeds the range
of 32-bit integer as well as for types that don't have a size known at
compile time, such as scalable vectors. Use best practices from LLVM IR
and adopt `llvm::TypeSize` for size-related queries and `uint64_t` for
alignment-related queries.
See #72678.
MLIR can't really be const-correct (it would need a `ConstValue` class
alongside the `Value` class really, like `ArrayRef` and
`MutableArrayRef`). This is however making is more consistent: method
that are directly modifying the Value shouldn't be marked const.
This renames the `emitc.call` op to `emitc.call_opaque` as the existing
call op does not refer to the callee by symbol. The rename allows to
introduce a new call op alongside with a future `emitc.func` op to model
and facilitate functions and function calls.
`bufferization.materialize_in_destination` should be used instead. Both
ops bufferize to a memcpy. This change also conceptually cleans up the
memref dialect a bit: the memref dialect no longer contains ops that
operate on tensor values.
The auto-generated summaries were hard to read (and pretty unhelpful), a
SME tile was:
```
vector<[16]x[16]xi8> of 8-bit signless integer values or vector<[8]x[8]xi16> of 16-bit signless integer values or vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values or vector<[1]x[1]xi128> of 128-bit signless integer values or vector<[8]x[8]xf16> of 16-bit float values or vector<[8]x[8]xbf16> of bfloat16 type values or vector<[4]x[4]xf32> of 32-bit float values or vector<[2]x[2]xf64> of 64-bit float values
```
...and a SVE vector was:
```
of ranks 1scalable vector of 8-bit signless integer or 16-bit signless integer or 32-bit signless integer or 64-bit signless integer or 128-bit signless integer or 16-bit float or bfloat16 type or 32-bit float or 64-bit float values of length 16/8/4/2/1
```
Note: The descriptions added here won't yet be shown on the MLIR docs
(only the short summaries), but this should be easy to enable like it
was for attribute descriptions in #67009.
A table of contents (TOC) is also added to the ArmSME docs page to make
it easier to navigate.
Add an emitc.for op to the EmitC dialect as a lowering target for
scf.for, replacing its current direct translation to C; The translator
now handles emitc.for instead.
I was reading through the very well written Destination-Passing Style
docs. Even though I know not much about compilers, I managed to
understand it pretty well. A few things tripped me up though, which this
PR suggests to rewrite:
1. Write `buffer(%0)` instead of buffer(`%0`). While reading, I first
interpreted the text as "the buffer (%0)", whereas it should be
interpreted as pseudocode for "a function that determines the buffer
applied to %0". Quickly introducing it and moving the backticks around
as this PR does should make this more clear. Also, I verified that MLIR
does not contain any other occurences of `"buffer(<BACKTICK>"`. It does
contain many occurences of `"buffer("` (without the backtick after the
opening bracket), so this PR makes notation a bit more consistent.
2. Quotation marks slowed me down during reading, so I removed them. I
think it's also clear without.
3. The `outs` from `linalg` was suddenly introduced. I've tried to
clarify in as few words as possible that `outs` stands for `outputs` but
suggestions are welcome.
This PR replaces the mixin `OpView` extension mechanism with the
standard inheritance mechanism.
Why? Firstly, mixins are not very pythonic (inheritance is usually used
for this), a little convoluted, and too "tight" (can only be used in the
immediately adjacent `_ext.py`). Secondly, it (mixins) are now blocking
are correct implementation of "value builders" (see
[here](https://github.com/llvm/llvm-project/pull/68764)) where the
problem becomes how to choose the correct base class that the value
builder should call.
This PR looks big/complicated but appearances are deceiving; 4 things
were needed to make this work:
1. Drop `skipDefaultBuilders` in
`OpPythonBindingGen::emitDefaultOpBuilders`
2. Former mixin extension classes are converted to inherit from the
generated `OpView` instead of being "mixins"
a. extension classes that simply were calling into an already generated
`super().__init__` continue to do so
b. (almost all) extension classes that were calling `self.build_generic`
because of a lack of default builder being generated can now also just
call `super().__init__`
3. To handle the [lone single
use-case](https://sourcegraph.com/search?q=context%3Aglobal+select_opview_mixin&patternType=standard&sm=1&groupBy=repo)
of `select_opview_mixin`, namely
[linalg](https://github.com/llvm/llvm-project/blob/main/mlir/python/mlir/dialects/_linalg_ops_ext.py#L38),
only a small change was necessary in `opdsl/lang/emitter.py` (thanks to
the emission/generation of default builders/`__init__`s)
4. since the `extend_opview_class` decorator is removed, we need a way
to register extension classes as the desired `OpView` that `op.opview`
conjures into existence; so we do the standard thing and just enable
replacing the existing registered `OpView` i.e.,
`register_operation(_Dialect, replace=True)`.
Note, the upgrade path for the common case is to change an extension to
inherit from the generated builder and decorate it with
`register_operation(_Dialect, replace=True)`. In the slightly more
complicated case where `super().__init(self.build_generic(...))` is
called in the extension's `__init__`, this needs to be updated to call
`__init__` in `OpView`, i.e., the grandparent (see updated docs).
Note, also `<DIALECT>_ext.py` files/modules will no longer be automatically loaded.
Note, the PR has 3 base commits that look funny but this was done for
the purpose of tracking the line history of moving the
`<DIALECT>_ops_ext.py` class into `<DIALECT>.py` and updating (commit
labeled "fix").
Buffer deallocation pipeline previously was incorrect when applied to
functions. It has since been fixed. Make sure it is exercised in the
tutorial to avoid leaking allocations.
[MLIR] Add stage and effectOnFullRegion to side effect
This patch add stage and effectOnFullRegion to side effect for optimization pass
to obtain more accurate information.
Stage uses numbering to track the side effects's stage of occurrence.
EffectOnFullRegion indicates if effect act on every single value of resource.
RFC disscussion: https://discourse.llvm.org/t/rfc-add-effect-index-in-memroy-effect/72235
Differential Revision: https://reviews.llvm.org/D156087
Reviewed By: mehdi_amini, Mogball
Differential Revision: https://reviews.llvm.org/D156087
The vector.extract assembly format currently only contains the source
type, for example:
%1 = vector.extract %0[1] : vector<3x7x8xf32>
it's not immediately obvious if this is the source or result type. This
patch improves the assembly format to make this clearer, so the above
becomes:
%1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
This patch recommits 126f0374cbc2110aa97e2141ac898014a8b9531a, reverted by
3ada774d0f65b44f21b360d222f446e533df1a34, along with the missing dependence.
Add an emitc.if op to the EmitC dialect. A new convert-scf-to-emitc
pass replaces the existing direct translation of scf.if to C; The
translator now handles emitc.if instead.
The emitc.if op doesn't return any value and its then/else regions are
terminated with a new scf.yield op. Values returned by scf.if are
lowered using emitc.variable ops, assigned to in the then/else regions
using a new emitc.assign op.
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:
- `transform.structured.tile_using_for`;
- `transform.structured.tile_using_forall`;
- `transform.structured.tile_reduction_using_for`;
- `transform.structured.tile_reduction_using_forall`.
This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.
The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
This revision replaces the LLVM dialect NullOp by the recently
introduced ZeroOp. The ZeroOp is more generic in the sense that it
represents zero values of any LLVM type rather than null pointers only.
This is a follow to https://github.com/llvm/llvm-project/pull/65508
This chapter demonstrates how one can replicate Halide DSL
transformations using transform dialect operations transforming payload
expressed using Linalg. This was a part of the live tutorial presented
at EuroLLVM 2023.
This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means the
`allow-return-allocs` pass option will default to true now,
`create-deallocs` defaults to false and they, as well as the escape
attribute indicating whether a memref escapes the current region, will
be removed. A new `allow-return-allocs-from-loops` option is added as a
temporary workaround for some bufferization limitations.
Add a new Buffer Deallocation pass with the intend to replace the old
one. For now it is added as a separate pass alongside in order to allow
downstream users to migrate over gradually. This new pass has the goal
of inserting fewer clone operations and supporting additional use-cases.
Please refer to the Buffer Deallocation section in the updated
Bufferization.md file for more information on how this new pass works.
This reverts commit 6a91dfedeb956dfa092a6a3f411e8b02f0d5d289.
This caused problems in downstream projects. We are reverting to give
them more time for integration.
This reverts commit 1bebb60a7565e5197d23120528f544b886b4d905.
This caused problems in downstream projects. We are reverting to give
them more time for integration.
Add a new Buffer Deallocation pass replacing the old one with the goal of
inserting fewer clone operations and supporting additional use-cases.
Please refer to the Buffer Deallocation section in the updated
Bufferization.md file for more information on how this new pass works.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158421
This is the first commit in a series with the goal to rework the
BufferDeallocation pass. Currently, this pass heavily relies on copies
to perform correct deallocations, which leads to very slow code and
potentially high memory usage. Additionally, there are unsupported cases
such as returning memrefs which this series of commits aims to add
support for as well.
This first commit removes the deallocation capabilities of
one-shot-bufferization.One-shot-bufferization should never deallocate any
memrefs as this should be entirely handled by the buffer-deallocation pass
going forward. This means the allow-return-allocs pass option will
default to true now, create-deallocs defaults to false and they, as well
as the escape attribute indicating whether a memref escapes the current region,
will be removed.
The documentation should w.r.t. these pass option changes should also be
updated in this commit.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D156662
Functions are always callable operations and thus every operation
implementing the `FunctionOpInterface` also implements the
`CallableOpInterface`. The only exception was the FuncOp in the toy
example. To make implementation of the `FunctionOpInterface` easier,
this commit lets `FunctionOpInterface` inherit from
`CallableOpInterface` and merges some of their methods. More precisely,
the `CallableOpInterface` has methods to get the argument and result
attributes and a method to get the result types of the callable region.
These methods are always implemented the same way as their analogues in
`FunctionOpInterface` and thus this commit moves all the argument and
result attribute handling methods to the callable interface as well as
the methods to get the argument and result types. The
`FuntionOpInterface` then does not have to declare them as well, but
just inherits them from the `CallableOpInterface`.
Adding the inheritance relation also required to move the
`FunctionOpInterface` from the IR directory to the Interfaces directory
since IR should not depend on Interfaces.
Reviewed By: jpienaar, springerm
Differential Revision: https://reviews.llvm.org/D157988
This reverts commit 02596693fac55f550e85620f5184547c80c8f930.
This reverts commit 3c5b4dabdc06dd380391ac965b16961610c0db77.
The build is broken:
mlir/test/lib/Dialect/Test/TestOps.td:988:7: error: Value specified for template argument 'Pat:supplemental_results' is of type dag; expected type list<dag>: (addBenefit 10)
def : Pat<(OpD $input), (OpF $input), [], (addBenefit 10)>;
^
I've been struggling with generating the C++ class declarations and definitions for custom attributes from TableGen, as described on this documentation page: https://mlir.llvm.org/docs/DefiningDialects/AttributesAndTypes/#adding-a-new-attribute-or-type-definition
The code for custom types is automatically generated when the MLIR Dialect is added with `add_mlir_dialect()` in the CMake file.
The same is not the case for custom attributes. I think people could benefit from learning how to adjsut their CMakeLists.txt to automatically generate the classes as described on that documentation page. This change adds the necessary information for this.
makslevental on Discord was so kind to help me figure this out myself
Reviewed By: makslevental
Differential Revision: https://reviews.llvm.org/D155249
This adds a parameter SupplementalPatterns in tablegen class Pattern for
postprocessing code. For example, this can be used to ensure ops are
placed in the correct device by copying the atttributes that decide
devicement placement in Tensorflow dialect to prevent performance
regression.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D157032