mirror of
https://github.com/llvm/llvm-project.git
synced 2025-04-16 15:36:40 +00:00

This PR adds `f8E8M0FNU` type to MLIR. `f8E8M0FNU` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 8-bit floating point number with bit layout S0E8M0. Unlike IEEE-754 types, there are no infinity, denormals, zeros or negative values. ```c f8E8M0FNU - Exponent bias: 127 - Maximum stored exponent value: 254 (binary 1111'1110) - Maximum unbiased exponent value: 254 - 127 = 127 - Minimum stored exponent value: 0 (binary 0000'0000) - Minimum unbiased exponent value: 0 − 127 = -127 - Doesn't have zero - Doesn't have infinity - NaN is encoded as binary 1111'1111 Additional details: - Zeros cannot be represented - Negative values cannot be represented - Mantissa is always 1 ``` Related PRs: - [PR-107127](https://github.com/llvm/llvm-project/pull/107127) [APFloat] Add APFloat support for E8M0 type - [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR - [PR-107999](https://github.com/llvm/llvm-project/pull/107999) [MLIR] Add f6E2M3FN type - [PR-108877](https://github.com/llvm/llvm-project/pull/108877) [MLIR] Add f4E2M1FN type