This PR adds `f6E3M2FN` type to mlir.
`f6E3M2FN` type is proposed in [OpenCompute MX
Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).
It defines a 6-bit floating point number with bit layout S1E3M2. Unlike
IEEE-754 types, there are no infinity or NaN values.
```c
f6E3M2FN
- Exponent bias: 3
- Maximum stored exponent value: 7 (binary 111)
- Maximum unbiased exponent value: 7 - 3 = 4
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 − 3 = −2
- Has Positive and Negative zero
- Doesn't have infinity
- Doesn't have NaNs
Additional details:
- Zeros (+/-): S.000.00
- Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28
- Min normal number: S.001.00 = ±2^(-2) = ±0.25
- Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875
- Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625
```
Related PRs:
- [PR-94735](https://github.com/llvm/llvm-project/pull/94735) [APFloat]
Add APFloat support for FP6 data types
- [PR-97118](https://github.com/llvm/llvm-project/pull/97118) [MLIR] Add
f8E4M3 type - was used as a template for this PR
Without this fix, `scf.if` operations would be created without a parent.
Since `scf.if` operations often have no results, this caused silent bugs
where the generated code was straight-up missing the operation.
As reported in https://github.com/llvm/llvm-project/issues/101132, this
fixes two bugs:
1. When accessing variadic operands inside an operation, it must be
accessed as `self.operation.operands` instead of `operation.operands`
2. The implementation of the `equally_sized_accessor` function is doing
wrong arithmetics when calculating the resulting index and group sizes.
I have added a test for the `equally_sized_accessor` function, which did
not have a test previously.
Since we have extended `EmptyOp`, maybe we should also provide a
corresponding `tensor.empty` method. In the downstream usage, I tend to
use APIs with all lowercase letters to create ops, so having a
`tensor.empty` to replace the extended `tensor.EmptyOp` would keep my
code style consistent.
PyYAML 5.3.1 has a security vulnerability as described here:
https://nvd.nist.gov/vuln/detail/CVE-2020-14343. Update the minimum
PyYAML version to 5.4. Also limit ml_dtypes version to 0.4.0.
This PR adds `f8E3M4` type to mlir.
`f8E3M4` type follows IEEE 754 convention
```c
f8E3M4 (IEEE 754)
- Exponent bias: 3
- Maximum stored exponent value: 6 (binary 110)
- Maximum unbiased exponent value: 6 - 3 = 3
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 − 3 = −2
- Precision specifies the total number of bits used for the significand (mantissa),
including implicit leading integer bit = 4 + 1 = 5
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs
Additional details:
- Max exp (unbiased): 3
- Min exp (unbiased): -2
- Infinities (+/-): S.111.0000
- Zeros (+/-): S.000.0000
- NaNs: S.111.{0,1}⁴ except S.111.0000
- Max normal number: S.110.1111 = +/-2^(6-3) x (1 + 15/16) = +/-2^3 x 31 x 2^(-4) = +/-15.5
- Min normal number: S.001.0000 = +/-2^(1-3) x (1 + 0) = +/-2^(-2)
- Max subnormal number: S.000.1111 = +/-2^(-2) x 15/16 = +/-2^(-2) x 15 x 2^(-4) = +/-15 x 2^(-6)
- Min subnormal number: S.000.0001 = +/-2^(-2) x 1/16 = +/-2^(-2) x 2^(-4) = +/-2^(-6)
```
Related PRs:
- [PR-99698](https://github.com/llvm/llvm-project/pull/99698) [APFloat]
Add support for f8E3M4 IEEE 754 type
- [PR-97118](https://github.com/llvm/llvm-project/pull/97118) [MLIR] Add
f8E4M3 IEEE 754 type
This PR adds `f8E4M3` type to mlir.
`f8E4M3` type follows IEEE 754 convention
```c
f8E4M3 (IEEE 754)
- Exponent bias: 7
- Maximum stored exponent value: 14 (binary 1110)
- Maximum unbiased exponent value: 14 - 7 = 7
- Minimum stored exponent value: 1 (binary 0001)
- Minimum unbiased exponent value: 1 − 7 = −6
- Precision specifies the total number of bits used for the significand (mantisa),
including implicit leading integer bit = 3 + 1 = 4
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs
Additional details:
- Max exp (unbiased): 7
- Min exp (unbiased): -6
- Infinities (+/-): S.1111.000
- Zeros (+/-): S.0000.000
- NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111}
- Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240
- Min normal number: S.0001.000 = +/-2^(-6)
- Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7
- Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9)
```
Related PRs:
- [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat]
Add support for f8E4M3 IEEE 754 type
`linalg.matmul` already has an attribute for casts, defaults to signed
but allowed unsigned, so the operation `linalg.matmul_unsigned` is
redundant. The generalization test has an example on how to lower to
unsigned matmul in linalg.
This is the first PR in a list of many that will simplify the linalg
operations by using similar attributes.
Ref:
https://discourse.llvm.org/t/rfc-transpose-attribute-for-linalg-matmul-operations/80092
The following logic can lead to a class name mismatch when using
`linalg.powf` in Python. This PR fixed the issue and also renamed
`NegfOp` to `NegFOp` in linalg to adhere to the naming convention, as
exemplified by `arith::NegFOp`.
173514d58e/mlir/python/mlir/dialects/linalg/opdsl/lang/dsl.py (L140-L143)
```
# linalg.powf(arg0, arg1, outs=[init_result.result])
NotImplementedError: Unknown named op_name / op_class_name: powf / PowfOp
```
Currently `f8E4M3` is mapped to `Float8E4M3FNType`.
This PR renames `f8E4M3` to `f8E4M3FN` to accurately reflect the actual
type.
This PR is needed to avoid names conflict in upcoming PR which will add
IEEE 754 `Float8E4M3Type`.
https://github.com/llvm/llvm-project/pull/97118 Add f8E4M3 IEEE 754 type
Maksim, can you review this PR? @makslevental ?
This patch enables continuous tiling of a target structured op using
diminishing tile sizes. In cases where the tensor dimensions are not
exactly divisible by the tile size, we are left with leftover tensor
chunks that are irregularly tiled. This approach enables tiling of the
leftover chunk with a smaller tile size and repeats this process
recursively using exponentially diminishing tile sizes. This eventually
generates a chain of loops that apply tiling using diminishing tile
sizes.
Adds `continuous_tile_sizes` op to the transform dialect. This op, when
given a tile size and a dimension, computes a series of diminishing tile
sizes that can be used to tile the target along the given dimension.
Additionally, this op also generates a series of chunk sizes that the
corresponding tile sizes should be applied to along the given dimension.
Adds `multiway` attribute to `transform.structured.split` that enables
multiway splitting of a single target op along the given dimension, as
specified in a list enumerating the chunk sizes.
The MLIR C and Python Bindings expose various methods from
`mlir::OpPrintingFlags` . This PR adds a binding for the `skipRegions`
method, which allows to skip the printing of Regions when printing Ops.
It also exposes this option as parameter in the python `get_asm` and
`print` methods
Following a rather direct approach to expose PDL usage from C and then
Python. This doesn't yes plumb through adding support for custom
matchers through this interface, so constrained to basics initially.
This also exposes greedy rewrite driver. Only way currently to define
patterns is via PDL (just to keep small). The creation of the PDL
pattern module could be improved to avoid folks potentially accessing
the module used to construct it post construction. No ergonomic work
done yet.
---------
Signed-off-by: Jacques Pienaar <jpienaar@google.com>
Adds a named op: linalg.conv_2d_ngchw_gfchw_q. This op is similar to
linalg.conv_2d_ngchw_gfchw, but additionally incorporates zero point
offset corrections.
Using `for_` is very hand with python bindings. Currently, it doesn't
support results, we had to fallback to two lines scf.for.
This PR yields results of scf.for in `for_`
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
When this code was written, we didn't have proper isinstance support for
operation classes in Python. Now we do, so there is no reason to keep
the expensive exception-based flow.
Following #90236, adding `select` to linalg as `arith.select`. No
implicit type casting.
OpDSL doesn't expose a type restriction for bool, but I saw no reason in
adding it (put a separate symbolic type and check the semantics in the
builder).
---------
Co-authored-by: Renato Golin <rengolin@systemcall.eu>
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This patch modifies the definition of `PadOp` to take transform params
and handles for the `pad_to_multiple_of` operand.
---------
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
Adding `erf` as unary and `powf` as binary.
Same as `max(arg, 0.0)` for `ReLU`, `powf(arg, const)` can be either a
generic (with broadcast) or a pair (`linalg.broadcast + linalg.powf`)
and then lowered "correctly". Either way, the lower dialects need to
know what kind of broadcast anyway, so no materialization of the
constant tensors should remain.
I want to flush the easy ones before we start working on type cast &
softmax.
Adding `min` that was already implemented but not exposed.
Adding a few additional unary ops:
* Reciprocal as `arith.div(1,arg)`
* Round as `math.round(arg)`
* Sqrt as `math.sqrt(arg)`
* Rsqrt as `math.rsqrt(arg)`
* Square as `math.powf(arg, 2)`
* TanH as `math.tanh(arg)`
All with the agreed semantics at the round table: no implicit
broadcast/type cast.
The Python bindings generated for "async" dialect didn't include any of
the "async" dialect ops. This PR fixes issues with generation of Python
bindings for "async" dialect and adds a test case to use them.
The ``dataclasses`` package makes sense for Python 3.6, becauses
``dataclasses`` is only included in the standard library with 3.7
version. Now, 3.6 has reached EOL, so all current supported versions of
Python (3.8, 3.9, 3.10, 3.11, 3.12) have this feature in their standard
libraries.
Therefore there's no need to install the ``dataclasses`` package now.
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
Reverts llvm/llvm-project#84103
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
_SubClassValueT is only useful when it is has >1 usage in a signature.
This was not true for the signatures produced by tblgen.
For example
def call(result, callee, operands_, *, loc=None, ip=None) ->
_SubClassValueT:
...
here a type checker does not have enough information to infer a type
argument for _SubClassValueT, and thus effectively treats it as Any.
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.