The ``dataclasses`` package makes sense for Python 3.6, becauses
``dataclasses`` is only included in the standard library with 3.7
version. Now, 3.6 has reached EOL, so all current supported versions of
Python (3.8, 3.9, 3.10, 3.11, 3.12) have this feature in their standard
libraries.
Therefore there's no need to install the ``dataclasses`` package now.
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
Reverts llvm/llvm-project#84103
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
_SubClassValueT is only useful when it is has >1 usage in a signature.
This was not true for the signatures produced by tblgen.
For example
def call(result, callee, operands_, *, loc=None, ip=None) ->
_SubClassValueT:
...
here a type checker does not have enough information to infer a type
argument for _SubClassValueT, and thus effectively treats it as Any.
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.
Currently, a method exists to get the count of the operation objects
which are still alive. This helps for sanity checking, but isn't
terribly useful for debugging. This new method returns the actual
operation objects which are still alive.
This allows Python code like the following:
```
gc.collect()
live_ops = ir.Context.current._get_live_operation_objects()
for op in live_ops:
print(f"Warning: {op} is still live. Referrers:")
for referrer in gc.get_referrers(op)[0]:
print(f" {referrer}")
```
Similar to `transform.get_result`, except it returns a handle to the
operand indicated by a positional specification, same as is defined for
the linalg match ops.
Additionally updates `get_result` to take the same positional specification.
This makes the use case of wanting to get all of the results of an
operation easier by no longer requiring the user to reconstruct the list
of results one-by-one.
In addition to the existing `OpHandle` which provides an abstraction to
emit transform ops targeting operations this introduces a similar
concept for _values_ and _parameters_ in form of `ValueHandle` and
`ParamHandle`.
New core transform abstractions:
- `constant_param`
- `OpHandle.get_result`
- `OpHandle.print`
- `ValueHandle.get_defining_op`
This adds Python abstractions for the different handle types of the
transform dialect
The abstractions allow for straightforward chaining of transforms by
calling their member functions.
As an initial PR for this infrastructure, only a single transform is
included: `transform.structured.match`.
With a future `tile` transform abstraction an example of the usage is:
```Python
def script(module: OpHandle):
module.match_ops(MatchInterfaceEnum.TilingInterface).tile(tile_sizes=[32,32])
```
to generate the following IR:
```mlir
%0 = transform.structured.match interface{TilingInterface} in %arg0
%tiled_op, %loops = transform.structured.tile_using_for %0 [32, 32]
```
These abstractions are intended to enhance the usability and flexibility
of the transform dialect by providing an accessible interface that
allows for easy assembly of complex transformation chains.
The `conv_2d_ngchw_fgchw` Op implements 2d grouped convolution with
dimensions ordered as given in the name. However, the current
implementation orders weights as `gfchw` instead of `fgchw`. This was
already pointed out in an old phabricator revision which never landed:
https://reviews.llvm.org/D150064
This patch
1) Adds a new op `conv_2d_ngchw_gfchw`
2) Fixes the dimension ordering of the old op `conv_2d_ngchw_fgchw`
3) Adds tests with non-dynamic dimensions so that it's easier to
understand.
The existing initialization sequence always enables multi-threading at
MLIRContext construction time, making it impractical to provide a
customized thread pool.
Here, this is changed to always create the context with threading
disabled, process all site-specific init hooks (which can set thread
pools) and ultimately enable multi-threading unless if site-configured
to not do so.
This should preserve the existing user-visible initialization behavior
while also letting downstreams ensure that contexts are always created
with a shared thread pool. This was tested with IREE, which has such a
concept. Using site-specific thread tuning produced up to 2x single
compilation job behavior and customization of batch compilation (i.e. as
part of a build system) to utilize half the memory and run the entire
test suite ~2x faster. Given this, I believe that the additional
configurability can well pay for itself for implementations that use it.
We may also want to present user-level Python APIs for controlling
threading configuration in the future.
This PR adds "value casting", i.e., a mechanism to wrap `ir.Value` in a
proxy class that overloads dunders such as `__add__`, `__sub__`, and
`__mul__` for fun and great profit.
This is thematically similar to
bfb1ba7526
and
9566ee2806.
The example in the test demonstrates the value of the feature (no pun
intended):
```python
@register_value_caster(F16Type.static_typeid)
@register_value_caster(F32Type.static_typeid)
@register_value_caster(F64Type.static_typeid)
@register_value_caster(IntegerType.static_typeid)
class ArithValue(Value):
__add__ = partialmethod(_binary_op, op="add")
__sub__ = partialmethod(_binary_op, op="sub")
__mul__ = partialmethod(_binary_op, op="mul")
a = arith.constant(value=FloatAttr.get(f16_t, 42.42))
b = a + a
# CHECK: ArithValue(%0 = arith.addf %cst, %cst : f16)
print(b)
a = arith.constant(value=FloatAttr.get(f32_t, 42.42))
b = a - a
# CHECK: ArithValue(%1 = arith.subf %cst_0, %cst_0 : f32)
print(b)
a = arith.constant(value=FloatAttr.get(f64_t, 42.42))
b = a * a
# CHECK: ArithValue(%2 = arith.mulf %cst_1, %cst_1 : f64)
print(b)
```
**EDIT**: this now goes through the bindings and thus supports automatic
casting of `OpResult` (including as an element of `OpResultList`),
`BlockArgument` (including as an element of `BlockArgumentList`), as
well as `Value`.
The _mlirRegisterEverything symbol may not be built by some customers. The
code here was intended to support this, but didn't properly initialize the
init_module variable.
This would break JAX with:
NameError: free variable 'init_module' referenced before assignment in enclosing scope
Added missing register_translations in python to replicate the same in
the C-API
Cleaned up the current calls to register passes where the other calls
are already embedded in the mlirRegisterAllPasses.
found here,
https://discourse.llvm.org/t/opencl-example/74187
Linalg currently has these named ops:
* `matmul`
* `matvec`
* `vecmat`
* `batch_matmul`
* `batch_matvec`
But it does not have:
* `batch_vecmat`
This PRs adds that for consistency, and I have a short-term need for it
( https://github.com/openxla/iree/issues/15158 ), so not having this
would cause some contortion on my end.
Currently, `linalg.transpose` and `linalg.broadcast` can't be emitted
through either the C API or the python bindings (which of course go
through the C API). See
https://discourse.llvm.org/t/how-to-build-linalg-transposeop-in-mlir-pybind/73989/10.
The reason is even though they're named ops, there is no opdsl
`@linalg_structured_op` for them and thus while they can be instantiated
they cannot be passed to
[`mlirLinalgFillBuiltinNamedOpRegion`](a7cccb9cbb/mlir/lib/CAPI/Dialect/Linalg.cpp (L18)).
I believe the issue is they both take a `IndexAttrDef` but
`IndexAttrDef` cannot represent dynamic rank. Note, if I'm mistaken and
there is a way to write the `@linalg_structured_op` let me know.
The solution here simply implements the `regionBuilder` interface which
is then picked up by
[`LinalgDialect::addNamedOpBuilders`](7557530f42/mlir/lib/Dialect/Linalg/IR/LinalgDialect.cpp (L116)).
Extension classes are added "by hand" that mirror the API of the
`@linalg_structured_op`s. Note, the extension classes are added to to
`dialects/linalg/__init__.py` instead of
`dialects/linalg/opdsl/ops/core_named_ops.py` in order that they're not
confused for opdsl generators/emitters.
This PR replaces the mixin `OpView` extension mechanism with the
standard inheritance mechanism.
Why? Firstly, mixins are not very pythonic (inheritance is usually used
for this), a little convoluted, and too "tight" (can only be used in the
immediately adjacent `_ext.py`). Secondly, it (mixins) are now blocking
are correct implementation of "value builders" (see
[here](https://github.com/llvm/llvm-project/pull/68764)) where the
problem becomes how to choose the correct base class that the value
builder should call.
This PR looks big/complicated but appearances are deceiving; 4 things
were needed to make this work:
1. Drop `skipDefaultBuilders` in
`OpPythonBindingGen::emitDefaultOpBuilders`
2. Former mixin extension classes are converted to inherit from the
generated `OpView` instead of being "mixins"
a. extension classes that simply were calling into an already generated
`super().__init__` continue to do so
b. (almost all) extension classes that were calling `self.build_generic`
because of a lack of default builder being generated can now also just
call `super().__init__`
3. To handle the [lone single
use-case](https://sourcegraph.com/search?q=context%3Aglobal+select_opview_mixin&patternType=standard&sm=1&groupBy=repo)
of `select_opview_mixin`, namely
[linalg](https://github.com/llvm/llvm-project/blob/main/mlir/python/mlir/dialects/_linalg_ops_ext.py#L38),
only a small change was necessary in `opdsl/lang/emitter.py` (thanks to
the emission/generation of default builders/`__init__`s)
4. since the `extend_opview_class` decorator is removed, we need a way
to register extension classes as the desired `OpView` that `op.opview`
conjures into existence; so we do the standard thing and just enable
replacing the existing registered `OpView` i.e.,
`register_operation(_Dialect, replace=True)`.
Note, the upgrade path for the common case is to change an extension to
inherit from the generated builder and decorate it with
`register_operation(_Dialect, replace=True)`. In the slightly more
complicated case where `super().__init(self.build_generic(...))` is
called in the extension's `__init__`, this needs to be updated to call
`__init__` in `OpView`, i.e., the grandparent (see updated docs).
Note, also `<DIALECT>_ext.py` files/modules will no longer be automatically loaded.
Note, the PR has 3 base commits that look funny but this was done for
the purpose of tracking the line history of moving the
`<DIALECT>_ops_ext.py` class into `<DIALECT>.py` and updating (commit
labeled "fix").
The reason I want this is that I am writing my own Python bindings and
would like to use the insertion point from
`PyThreadContextEntry::getDefaultInsertionPoint()` to call C++ functions
that take an `OpBuilder` (I don't need to expose it in Python but it
also seems appropriate). AFAICT, there is currently no way to translate
a `PyInsertionPoint` into an `OpBuilder` because the operation is
inaccessible.