llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 10:56:06 +00:00

Author	SHA1	Message	Date
Sergey Kozub	3f9cabae00	[MLIR] Add f8E8M0FNU type (#111028 ) This PR adds `f8E8M0FNU` type to MLIR. `f8E8M0FNU` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines an 8-bit floating point number with bit layout S0E8M0. Unlike IEEE-754 types, there are no infinities, denormals, zeros, or negative values. ```c f8E8M0FNU - Exponent bias: 127 - Maximum stored exponent value: 254 (binary 1111'1110) - Maximum unbiased exponent value: 254 - 127 = 127 - Minimum stored exponent value: 0 (binary 0000'0000) - Minimum unbiased exponent value: 0 − 127 = -127 - Doesn't have zero - Doesn't have infinity - NaN is encoded as binary 1111'1111 Additional details: - Zeros cannot be represented - Negative values cannot be represented - Mantissa is always 1 ``` Related PRs: - [PR-107127](https://github.com/llvm/llvm-project/pull/107127) [APFloat] Add APFloat support for E8M0 type - [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR - [PR-107999](https://github.com/llvm/llvm-project/pull/107999) [MLIR] Add f6E2M3FN type - [PR-108877](https://github.com/llvm/llvm-project/pull/108877) [MLIR] Add f4E2M1FN type	2024-10-04 09:23:12 +02:00
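The `f8E8M0FNU` layout described in the commit above can be sketched as a small decoder. This is an illustration only, not the APFloat or MLIR implementation; the function name `decode_f8e8m0fnu` is hypothetical, and the only facts it relies on are the ones listed in the message (bias 127, no sign or mantissa bits, NaN at binary 1111'1111).

```python
import math


def decode_f8e8m0fnu(byte: int) -> float:
    """Decode one f8E8M0FNU byte (hypothetical helper, for illustration)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if byte == 0xFF:
        # Binary 1111'1111 is the single NaN encoding.
        return math.nan
    # All eight bits are the stored exponent; the mantissa is always 1.
    return math.ldexp(1.0, byte - 127)


if __name__ == "__main__":
    for stored in (0, 127, 254, 255):
        print(f"{stored:#04x} -> {decode_f8e8m0fnu(stored)!r}")
```

Every non-NaN encoding decodes to a positive power of two, which is why zeros and negative values cannot be represented.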
Aman LaChapelle	759a7b5933	[mlir] Add the ability to define dialect-specific location attrs. (#105584 ) This patch adds the capability to define dialect-specific location attrs. This is useful in particular for defining location structure that doesn't necessarily fit within the core MLIR location hierarchy, but doesn't make sense to push upstream (i.e. a custom use case). This patch adds an AttributeTrait, `IsLocation`, which is tagged onto all the builtin location attrs, as well as the test location attribute. This is necessary because previously LocationAttr::classof only returned true if the attribute was one of the builtin location attributes, and well, the point of this patch is to allow dialects to define their own location attributes. There was an alternate implementation I considered wherein LocationAttr becomes an AttrInterface, but that was discarded because there are likely to be many locations in a single program, and I was concerned that forcing every MLIR user to pay the cost of the additional lookup/dispatch was unacceptable. It also would have been a much more invasive change. It would have allowed for more flexibility in terms of pretty printing, but it's unclear how useful/necessary that flexibility would be given how much customizability there already is for attribute definitions.	2024-10-03 10:25:44 -07:00
Shoaib Meenai	8773bd0e6e	[mlir] Print aliases for recursive types (#110346 ) We're already keeping track of the alias depth to ensure that aliases are printed before they're referenced. For recursive types, we can additionally track whether an alias has been printed and only reference it if so, to lift the restrictions on not printing aliases inside mutable types.	2024-09-28 18:15:14 -07:00
Andrzej Warzyński	6d11494414	[mlir][Linalg] Refine how broadcast dims are treated (#99015 ) This PR fixes how broadcast dims (identified as "zero" results in permutation maps) corresponding to a reduction iterator are vectorised in the case of generic Ops. Here's an example: ```mlir #map = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)> #map1 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, 0)> func.func @generic_with_reduction_and_broadcast(%arg0: tensor<1x12x197x197xf32>) -> (tensor<1x12x197x1xf32>) { %0 = tensor.empty() : tensor<1x12x197x1xf32> %1 = linalg.generic {indexing_maps = [#map, #map1], iterator_types = ["parallel", "parallel", "parallel", "reduction"]} ins(%arg0 : tensor<1x12x197x197xf32>) outs(%0 : tensor<1x12x197x1xf32>) { ^bb0(%in: f32, %out: f32): %818 = arith.addf %in, %out : f32 linalg.yield %818 : f32 } -> tensor<1x12x197x1xf32> return %1 : tensor<1x12x197x1xf32> } ``` This is a perfectly valid Generic Op, but currently triggers two issues in the vectoriser. The root cause is this map: ```mlir #map1 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, 0)> ``` This map triggers an assert in `reindexIndexingMap` - this hook incorrectly assumes that every result in the input map is a `dim` expression and that there are no constants. That's not the case in this example. `reindexIndexingMap` is extended to allow maps like the one above. For now, only constant "zero" results are allowed. This can be extended in the future once a good motivating example is available. Separately, the permutation map highlighted above "breaks" mask calculation (ATM masks are always computed, even in the presence of static shapes). 
When applying the following permutation: ```mlir (d0, d1, d2, d3) -> (d0, d1, d2, 0) ``` to these canonical shapes (corresponding to the example above): ``` (1, 12, 197, 197) ``` we end up with the following error: ```bash error: vector types must have positive constant sizes but got 1, 12, 197, 0 ``` The error makes sense and indicates that we should update the permutation map above to: ``` (d0, d1, d2, d3) -> (d0, d1, d2) ``` This would correctly give the following vector type: ``` vector<1x12x197xi1> ``` Fixes #97247	2024-09-26 16:17:15 +01:00
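The shape computation described in the commit above can be reproduced with a short sketch. It is not the vectoriser's code: maps are modelled as plain Python lists (a string such as `"d2"` for a dim result, the integer `0` for a constant zero result), and both helper names are hypothetical.

```python
def apply_permutation(results, shape):
    """Apply a map given as a list of results to a canonical shape."""
    out = []
    for result in results:
        if isinstance(result, str):
            # A dim expression such as "d2" selects shape[2].
            out.append(shape[int(result[1:])])
        else:
            # A constant result (0 for a broadcast dim) is used as-is.
            out.append(result)
    return tuple(out)


def drop_zero_results(results):
    """Remove constant-zero results, keeping the dim expressions."""
    return [result for result in results if result != 0]


if __name__ == "__main__":
    shape = (1, 12, 197, 197)
    broadcast_map = ["d0", "d1", "d2", 0]
    print(apply_permutation(broadcast_map, shape))
    print(apply_permutation(drop_zero_results(broadcast_map), shape))
```

With the zero result kept, the last extent is 0, which matches the quoted verifier error; with it dropped, the extents are those of `vector<1x12x197xi1>`.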
Sergey Kozub	2c58063435	[MLIR] Add f4E2M1FN type (#108877 ) This PR adds `f4E2M1FN` type to mlir. `f4E2M1FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 4-bit floating point number with bit layout S1E2M1. Unlike IEEE-754 types, there are no infinity or NaN values. ```c f4E2M1FN - Exponent bias: 1 - Maximum stored exponent value: 3 (binary 11) - Maximum unbiased exponent value: 3 - 1 = 2 - Minimum stored exponent value: 1 (binary 01) - Minimum unbiased exponent value: 1 − 1 = 0 - Has Positive and Negative zero - Doesn't have infinity - Doesn't have NaNs Additional details: - Zeros (+/-): S.00.0 - Max normal number: S.11.1 = ±2^(2) x (1 + 0.5) = ±6.0 - Min normal number: S.01.0 = ±2^(0) = ±1.0 - Min subnormal number: S.00.1 = ±2^(0) x 0.5 = ±0.5 ``` Related PRs: - [PR-95392](https://github.com/llvm/llvm-project/pull/95392) [APFloat] Add APFloat support for FP4 data type - [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR - [PR-107999](https://github.com/llvm/llvm-project/pull/107999) [MLIR] Add f6E2M3FN type	2024-09-24 08:22:48 +02:00
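Because `f4E2M1FN` has only 16 encodings, the table in the commit above can be checked by enumerating them. The decoder below is a sketch built solely from the listed parameters (layout S1E2M1, bias 1, no infinity or NaN); `decode_f4e2m1fn` is a hypothetical name, not an MLIR or APFloat API.

```python
def decode_f4e2m1fn(bits: int) -> float:
    """Decode a 4-bit f4E2M1FN value (hypothetical helper, for illustration)."""
    if not 0 <= bits <= 0xF:
        raise ValueError("expected a 4-bit value")
    sign = -1.0 if bits & 0b1000 else 1.0
    exponent = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exponent == 0:
        # Zero or subnormal: 2^(1 - bias) x (mantissa / 2), with bias = 1.
        return sign * mantissa * 0.5
    # Normal: 2^(exponent - bias) x (1 + mantissa / 2).
    return sign * 2.0 ** (exponent - 1) * (1.0 + mantissa * 0.5)


if __name__ == "__main__":
    for bits in range(16):
        print(f"{bits:04b} -> {decode_f4e2m1fn(bits)}")
```

The non-negative half of the encoding space is 0, 0.5, 1, 1.5, 2, 3, 4, 6, so every exponent/mantissa pattern maps to a finite number.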
wang-y-z	041b0a81b0	[MLIR][Operation] Fix `isBeforeInBlock` crash bug mentioned in https://github.com/llvm/llvm-project/issues/60909 (#101172 ) # Summary This MR fixes the `isBeforeInBlock` crash bug mentioned in https://github.com/llvm/llvm-project/issues/60909. Fixes #60909. # Trigger condition 1. A block has only one operation. 2. `block->isOpOrderValid()` is true, but `op->hasValidOrder()` is false. 3. The caller invokes `op->isBeforeInBlock(op)`, comparing the op with itself. This crashes on `assert(blockFront != blockBack && "expected more than one operation");` # Case study A simplified repro case is in `mlir/test/Pass/scf2cf-print-liveness-crash.mlir`. When `-convert-scf-to-cf -test-print-liveness` are put together on one command line, the first pass works normally and the second pass crashes. For details, please refer to https://github.com/llvm/llvm-project/issues/60909 # Solutions Option 1: in `isBeforeInBlock`, check whether the block has only one operation before stepping into `updateOrderIfNecessary`; if it has only one, return false. Option 2: in `isBeforeInBlock`, check whether `this == other`; if so, return false. Option 3: fix the `addNodeToList` logic. I prefer option 3: When a block contains only one operation and the user calls `op->isBeforeInBlock(op)`, if `block->isOpOrderValid()` returns true, `updateOrderIfNecessary` is called. If `op->hasValidOrder()` is false, it crashes at the assertion `assert(blockFront != blockBack && "expected more than one operation");`. This behavior is abnormal and needs fixing. I discovered that after the first pass, `-convert-scf-to-cf`, there is a block with only one operation where the block order is valid but the operation order is invalid, leading to a crash when the `-test-print-liveness` pass runs. --------- Co-authored-by: isaacw <isaacw@nvidia.com>	2024-09-21 23:40:52 +08:00
Billy Zhu	a1d64626ba	[MLIR][IR] Fix InProgressAliasInfo init for non-alias (#109013 ) When visiting an attr/type that is NoAlias, the created `InProgressAliasInfo` was not getting its `canBeDeferred` and `isType` fields set. Not setting `canBeDeferred` when it should be true breaks the assumption that all nested elements are also false. This will cause problems when at a later point the attr/type needs to be converted by `markAliasNonDeferrable`, as recursion will stop when a `canBeDeferred=false` attr/type is reached, leaving its nested elements not flipped. As a result, the nested elements are printed later in the textual IR, which then cannot be parsed back in.	2024-09-17 14:27:31 -07:00
JOE1994	884221eddb	[mlir] Tidy uses of llvm::raw_string_ostream (NFC) As specified in the docs, 1) raw_string_ostream is always unbuffered and 2) the underlying buffer may be used directly ( 65b13610a5226b84889b923bae884ba395ad084d for further reference ) * Don't call raw_string_ostream::flush(), which is essentially a no-op. * Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.	2024-09-16 23:23:25 -04:00
Sergey Kozub	73d83f20c9	[MLIR] Add f6E2M3FN type (#107999 ) This PR adds `f6E2M3FN` type to mlir. `f6E2M3FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E2M3. Unlike IEEE-754 types, there are no infinity or NaN values. ```c f6E2M3FN - Exponent bias: 1 - Maximum stored exponent value: 3 (binary 11) - Maximum unbiased exponent value: 3 - 1 = 2 - Minimum stored exponent value: 1 (binary 01) - Minimum unbiased exponent value: 1 − 1 = 0 - Has Positive and Negative zero - Doesn't have infinity - Doesn't have NaNs Additional details: - Zeros (+/-): S.00.000 - Max normal number: S.11.111 = ±2^(2) x (1 + 0.875) = ±7.5 - Min normal number: S.01.000 = ±2^(0) = ±1.0 - Max subnormal number: S.00.111 = ±2^(0) x 0.875 = ±0.875 - Min subnormal number: S.00.001 = ±2^(0) x 0.125 = ±0.125 ``` Related PRs: - [PR-94735](https://github.com/llvm/llvm-project/pull/94735) [APFloat] Add APFloat support for FP6 data types - [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR	2024-09-16 21:09:27 +02:00
Sergey Kozub	083e25c1d4	[MLIR] [NFC] Use APFloat semantics to get floating type width (#107372 ) As suggested in the comments of https://github.com/llvm/llvm-project/pull/105573	2024-09-10 10:50:34 +02:00
Sergey Kozub	918222ba43	[MLIR] Add f6E3M2FN type (#105573 ) This PR adds `f6E3M2FN` type to mlir. `f6E3M2FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E3M2. Unlike IEEE-754 types, there are no infinity or NaN values. ```c f6E3M2FN - Exponent bias: 3 - Maximum stored exponent value: 7 (binary 111) - Maximum unbiased exponent value: 7 - 3 = 4 - Minimum stored exponent value: 1 (binary 001) - Minimum unbiased exponent value: 1 − 3 = −2 - Has Positive and Negative zero - Doesn't have infinity - Doesn't have NaNs Additional details: - Zeros (+/-): S.000.00 - Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28 - Min normal number: S.001.00 = ±2^(-2) = ±0.25 - Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875 - Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625 ``` Related PRs: - [PR-94735](https://github.com/llvm/llvm-project/pull/94735) [APFloat] Add APFloat support for FP6 data types - [PR-97118](https://github.com/llvm/llvm-project/pull/97118) [MLIR] Add f8E4M3 type - was used as a template for this PR	2024-09-10 10:41:05 +02:00
Kazu Hirata	01eb071de0	[mlir] Avoid repeated hash lookups (NFC) (#107519 )	2024-09-06 07:48:39 -07:00
Johannes Reifferscheid	8af0860529	AffineExpr: Fix result of d0 + (d0 // -c) * c. (#107530 ) Currently, this is rewritten to d0 mod -c. However, we do not support modulo with a negative RHS in our lowering passes, so this triggers undefined behavior. It would be better to not have these ad hoc simplifications at all, but I guess that ship has sailed.	2024-09-06 12:53:33 +02:00
Benjamin Maxwell	84aa02d3fa	[memref] Handle edge case in subview of full static size fold (#105635 ) It is possible to have a subview with a fully static size and a type that matches the source type, but a dynamic offset that may be different. However, currently the memref dialect folds: ```mlir func.func @subview_of_static_full_size( %arg0: memref<16x4xf32, strided<[4, 1], offset: ?>>, %idx: index) -> memref<16x4xf32, strided<[4, 1], offset: ?>> { %0 = memref.subview %arg0[%idx, 0][16, 4][1, 1] : memref<16x4xf32, strided<[4, 1], offset: ?>> to memref<16x4xf32, strided<[4, 1], offset: ?>> return %0 : memref<16x4xf32, strided<[4, 1], offset: ?>> } ``` To: ```mlir func.func @subview_of_static_full_size( %arg0: memref<16x4xf32, strided<[4, 1], offset: ?>>, %arg1: index) -> memref<16x4xf32, strided<[4, 1], offset: ?>> { return %arg0 : memref<16x4xf32, strided<[4, 1], offset: ?>> } ``` Which drops the dynamic offset from the `subview` op.	2024-08-23 06:52:09 +01:00
Matthias Springer	a3d41879ec	[mlir][ODS] Optionally generate public C++ functions for type constraints (#104577 ) Add `gen-type-constraint-decls` and `gen-type-constraint-defs`, which generate public C++ functions for type constraints. The name of the C++ function is specified in the `cppFunctionName` field. Type constraints are typically used for op/type/attribute verification. They are also sometimes called from builders and transformations. Until now, this required duplicating the check in C++. Note: This commit just adds the option for type constraints, but attribute constraints could be supported in the same way. Alternatives considered: 1. The C++ functions could also be generated as part of `gen-typedef-decls/defs`, but that can be confusing because type constraints may rely on type definitions from multiple `.td` files. `#include`s could cause duplicate definitions of the same type constraint. 2. The C++ functions could also be generated as static member functions of dialects, but they don't really belong to a dialect. (Because they may rely on type definitions from multiple dialects.)	2024-08-21 08:44:54 +02:00
Luke Boyer	4c77cc634d	[mlir][IR] Fix `checkFoldResult` error message (#104559 ) checkFoldResult error message has expected and actual backwards.	2024-08-16 09:51:01 +02:00
Will Dietz	7a98071da2	[mlir] Verifier: steal bit to track seen instead of set. (#102626 ) Tracking a set containing every block and operation visited can become very expensive and is unnecessary. Co-authored-by: Will Dietz <w@wdtz.org>	2024-08-09 10:50:25 -05:00
Nikhil Kalra	84cc1865ef	[mlir] Support DialectRegistry extension comparison (#101119 ) `PassManager::run` loads the dependent dialects for each pass into the current context prior to invoking the individual passes. If the dependent dialect is already loaded into the context, this should be a no-op. However, if there are extensions registered in the `DialectRegistry`, the dependent dialects are unconditionally registered into the context. This poses a problem for dynamic pass pipelines, however, because they will likely be executing while the context is in an immutable state (because of the parent pass pipeline being run). To solve this, we'll update the extension registration API on `DialectRegistry` to require a type ID for each extension that is registered. Then, instead of unconditionally registering dialects into a context if extensions are present, we'll check against the extension type IDs already present in the context's internal `DialectRegistry`. The context will only be marked as dirty if there are net-new extension types present in the `DialectRegistry` populated by `PassManager::getDependentDialects`. Note: this PR removes the `addExtension` overload that utilizes `std::function` as the parameter. This is because `std::function` is copyable and potentially allocates memory for the contained function so we can't use the function pointer as the unique type ID for the extension. Downstream changes required: - Existing `DialectExtension` subclasses will need a type ID to be registered for each subclass. More details on how to register a type ID can be found here: `8b68e06731/mlir/include/mlir/Support/TypeID.h (L30)` - Existing uses of the `std::function` overload of `addExtension` will need to be refactored into dedicated `DialectExtension` classes with associated type IDs. The attached `std::function` can either be inlined into or called directly from `DialectExtension::apply`. 
--------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>	2024-08-06 01:32:36 +02:00
Kazu Hirata	5262865aac	[mlir] Construct SmallVector with ArrayRef (NFC) (#101896 )	2024-08-04 11:43:05 -07:00
Alexander Pivovarov	eef1d7e377	[MLIR] Add f8E3M4 IEEE 754 type (#101230 ) This PR adds `f8E3M4` type to mlir. `f8E3M4` type follows IEEE 754 convention ```c f8E3M4 (IEEE 754) - Exponent bias: 3 - Maximum stored exponent value: 6 (binary 110) - Maximum unbiased exponent value: 6 - 3 = 3 - Minimum stored exponent value: 1 (binary 001) - Minimum unbiased exponent value: 1 − 3 = −2 - Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 4 + 1 = 5 - Follows IEEE 754 conventions for representation of special values - Has Positive and Negative zero - Has Positive and Negative infinity - Has NaNs Additional details: - Max exp (unbiased): 3 - Min exp (unbiased): -2 - Infinities (+/-): S.111.0000 - Zeros (+/-): S.000.0000 - NaNs: S.111.{0,1}⁴ except S.111.0000 - Max normal number: S.110.1111 = +/-2^(6-3) x (1 + 15/16) = +/-2^3 x 31 x 2^(-4) = +/-15.5 - Min normal number: S.001.0000 = +/-2^(1-3) x (1 + 0) = +/-2^(-2) - Max subnormal number: S.000.1111 = +/-2^(-2) x 15/16 = +/-2^(-2) x 15 x 2^(-4) = +/-15 x 2^(-6) - Min subnormal number: S.000.0001 = +/-2^(-2) x 1/16 = +/-2^(-2) x 2^(-4) = +/-2^(-6) ``` Related PRs: - [PR-99698](https://github.com/llvm/llvm-project/pull/99698) [APFloat] Add support for f8E3M4 IEEE 754 type - [PR-97118](https://github.com/llvm/llvm-project/pull/97118) [MLIR] Add f8E4M3 IEEE 754 type	2024-08-02 00:22:11 -07:00
Benjamin Kramer	ae4f2495a4	[IR] Verifier: Use a SmallPtrSet for a small set of pointers. NFC	2024-07-27 13:55:04 +02:00
Krzysztof Drewniak	8955e285e1	[mlir] Add property combinators, initial ODS support (#94732 ) While we have had a Properties.td that allowed for defining non-attribute-backed properties, such properties were not plumbed through the basic autogeneration facilities available to attributes, forcing those who want to migrate to the new system to write such code by hand. ## Potentially breaking changes - The `setFoo()` methods on `Properties` struct no longer take their inputs by const reference. Those wishing to pass non-owned values of a property by reference to constructors and setters should set the interface type to `const [storageType]&` - Adapters and operations now define getters and setters for properties listed in ODS, which may conflict with custom getters. - Builders now include properties listed in ODS specifications, potentially conflicting with custom builders with the same type signature. ## Extensions to the `Property` class This commit adds several fields to the `Property` class, including: - `parser`, `optionalParser`, and `printer` (for parsing/printing properties of a given type in ODS syntax) - `storageTypeValueOverride`, an extension of `defaultValue` to allow the storage and interface type defaults to differ - `baseProperty` (allowing for classes like `DefaultValuedProperty`) Existing fields have also had their documentation comments updated. This commit does not add a `PropertyConstraint` analogous to `AttrConstraint`, but this is a natural evolution of the work here. This commit also adds the concrete property kinds `I32Property`, `I64Property`, `UnitProperty` (and special handling for it like for UnitAttr), and `BoolProperty`. ## Property combinators `Properties.td` also now includes several ways to combine properties. One is `ArrayProperty<Property elem>`, which now stores a variable-length array of some property as `SmallVector<elem.storageType>` and uses `ArrayRef<elem.storageType>` as its interface type. 
It has `IntArrayProperty` subclasses that change its conversion to attributes to use `DenseI[N]Attr`s instead of an `ArrayAttr`. Similarly, `OptionalProperty<Property p>` wraps a property's storage in `std::optional<>` and adds a `std::nullopt` default value. In the case where the underlying property can be parsed optionally but doesn't have its own default value, `OptionalProperty` can piggyback off the optional parser to produce a cleaner syntax, as opposed to its general form, which is either `none` or `some<[value]>`. (Note that `OptionalProperty` can be nested if desired). ## Autogeneration changes Operations and adaptors now support getters and setters for properties like those for attributes. Unlike for attributes, there aren't separate value and attribute forms, since there is no `FooAttr()` available for a `getFooAttr()` to return. The largest change is to operation formats. Previously, properties could only be used in custom directives. Now, they can be used anywhere an attribute could be used, and have parsers and printers defined in their tablegen records. These updates include special `UnitProperty` logic like that used for `UnitAttr`. ## Misc. Some attempt has been made to test the new functionality. This commit takes tentative steps towards updating the documentation to account for properties. A full update will be in order once any followup work has been completed and the interfaces have stabilized. --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com> Co-authored-by: Christian Ulmann <christianulmann@gmail.com>	2024-07-26 09:35:06 -05:00
Johannes Reifferscheid	528a662d3a	Fix sign of largest known divisor of div. (#100081 ) There's a missing abs, so it returns a negative value if the divisor is negative. Later this is then cast to uint.	2024-07-23 10:55:32 +02:00
Alexander Pivovarov	019136e30f	[MLIR] Add f8E4M3 IEEE 754 type (#97118 ) This PR adds `f8E4M3` type to mlir. `f8E4M3` type follows IEEE 754 convention ```c f8E4M3 (IEEE 754) - Exponent bias: 7 - Maximum stored exponent value: 14 (binary 1110) - Maximum unbiased exponent value: 14 - 7 = 7 - Minimum stored exponent value: 1 (binary 0001) - Minimum unbiased exponent value: 1 − 7 = −6 - Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 3 + 1 = 4 - Follows IEEE 754 conventions for representation of special values - Has Positive and Negative zero - Has Positive and Negative infinity - Has NaNs Additional details: - Max exp (unbiased): 7 - Min exp (unbiased): -6 - Infinities (+/-): S.1111.000 - Zeros (+/-): S.0000.000 - NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111} - Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240 - Min normal number: S.0001.000 = +/-2^(-6) - Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7 - Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9) ``` Related PRs: - [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat] Add support for f8E4M3 IEEE 754 type	2024-07-22 23:20:28 -07:00
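The special-value rules for `f8E4M3` in the commit above (an all-ones exponent means infinity or NaN, an all-zeros exponent means zero or subnormal) can be sketched as follows. The decoder is an illustration derived only from the listed bit patterns; `decode_f8e4m3` is a hypothetical name and not part of APFloat or MLIR.

```python
import math


def decode_f8e4m3(byte: int) -> float:
    """Decode one IEEE-style f8E4M3 byte (hypothetical helper, for illustration)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0xF:
        # S.1111.000 is infinity; any other mantissa is a NaN.
        return sign * math.inf if mantissa == 0 else math.nan
    if exponent == 0:
        # Zero or subnormal: 2^(-6) x (mantissa / 8).
        return sign * (mantissa / 8) * 2.0 ** -6
    # Normal: 2^(exponent - 7) x (1 + mantissa / 8).
    return sign * (1 + mantissa / 8) * 2.0 ** (exponent - 7)


if __name__ == "__main__":
    for label, byte in [("max normal", 0x77), ("min normal", 0x08),
                        ("max subnormal", 0x07), ("min subnormal", 0x01),
                        ("+inf", 0x78), ("NaN", 0x79)]:
        print(f"{label}: {decode_f8e4m3(byte)!r}")
```

Unlike the finite-only `FN` variants elsewhere in this log, the top exponent value here is reserved for special values, so the largest finite magnitude uses stored exponent 14.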
Andrzej Warzyński	2ee5586ac7	[mlir][vector] Make the in_bounds attribute mandatory (#97049 ) At the moment, the in_bounds attribute has two confusing/contradicting properties: 1. It is both optional _and_ has an effective default-value. 2. The default value is "out-of-bounds" for non-broadcast dims, and "in-bounds" for broadcast dims. (see the `isDimInBounds` vector interface method for an example of this "default" behaviour [1]). This PR aims to clarify the logic surrounding the `in_bounds` attribute by: * making the attribute mandatory (i.e. it is always present), * always setting the default value to "out of bounds" (that's consistent with the current behaviour for the most common cases). #### Broadcast dimensions in tests As per [2], a broadcast dimension requires the corresponding `in_bounds` attribute to be `true`: ``` vector.transfer_read op requires broadcast dimensions to be in-bounds ``` The changes in this PR mean that we can no longer rely on the default value in cases like the following (dim 0 is a broadcast dim): ```mlir %read = vector.transfer_read %A[%base1, %base2], %f, %mask {permutation_map = affine_map<(d0, d1) -> (0, d1)>} : memref<?x?xf32>, vector<4x9xf32> ``` Instead, the broadcast dimension has to explicitly be marked as "in bounds": ```mlir %read = vector.transfer_read %A[%base1, %base2], %f, %mask {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} : memref<?x?xf32>, vector<4x9xf32> ``` All tests with broadcast dims are updated accordingly. 
#### Changes in "SuperVectorize.cpp" and "Vectorization.cpp" The following patterns in "Vectorization.cpp" are updated to explicitly set the `in_bounds` attribute to `false`: * `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern` Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and `vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to make sure that xfer Ops created by these hooks set the dimension corresponding to broadcast dims as "in bounds". Otherwise, the Op verifier would complain. Note that there is no mechanism to verify whether the corresponding memory accesses are indeed in bounds. Still, this is consistent with the current behaviour where the broadcast dim would be implicitly assumed to be "in bounds". [1] `4145ad2bac/mlir/include/mlir/Interfaces/VectorInterfaces.td (L243-L246)` [2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop	2024-07-16 16:49:52 +01:00
Johannes Reifferscheid	dd7d81ea49	Fix simplification of x + x//c*-c to x mod c. (#98909 ) There was no check that rhs is actually a multiplication.	2024-07-15 16:59:47 +02:00
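The identity behind the simplification named in the commit above can be checked numerically. The sketch below uses Python's `//` and `%`, which for a positive divisor round toward negative infinity and return a non-negative remainder, matching the affine `floordiv` and `mod` semantics; the function names are hypothetical, and the second helper is only there to show an expression where the right-hand side is not a multiplication.

```python
def floordiv_times_neg_c(x: int, c: int) -> int:
    """x + (x floordiv c) * -c, the shape the simplifier is meant to match."""
    return x + (x // c) * -c


def floordiv_plus_neg_c(x: int, c: int) -> int:
    """x + ((x floordiv c) + -c): same operands, but no multiplication."""
    return x + ((x // c) + -c)


if __name__ == "__main__":
    for x in (-7, -1, 0, 5, 13):
        c = 4
        print(x, floordiv_times_neg_c(x, c), x % c, floordiv_plus_neg_c(x, c))
```

Only the multiplication form equals `x mod c`, so a matcher that skips the "is this a multiplication" check could rewrite the second form incorrectly.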
Billy Zhu	06bbbf1ee0	[MLIR] Cyclic AttrType Replacer (#98206 ) The current `AttrTypeReplacer` does not allow for custom handling of replacer functions that may cause self-recursion. For example, the replacement of one attr/type may depend on the replacement of another attr/type (by calling into the replacer manually again), which in turn may depend on the replacement of the original attr/type. To enable this functionality, this PR broke out the original AttrTypeReplacer into two parts: - An uncached base version (`detail::AttrTypeReplacerBase`) that allows registering replacer functions and has logic for invoking it on attr/types & their sub-elements - A cached version (`AttrTypeReplacer`) that provides the same caching as the original one. This is still the one used everywhere and behavior is unchanged. On top of the uncached base version, a `CyclicAttrTypeReplacer` is introduced that provides caching & cycle-handling for replacer logic that is cyclic. Cycle-breaking & caching is provided by the `CyclicReplacerCache` from https://github.com/llvm/llvm-project/pull/98202. Both concrete implementations of the uncached base version use CRTP to avoid dynamic dispatch. The base class merely provides replacer registration & invocation, and is not meant to be used, or otherwise extended elsewhere.	2024-07-12 09:24:01 -07:00
Ramkumar Ramachandra	f1eed011b4	MathExtras: add overflow query for signed-div (#97901 ) 5221634 (Do not trigger UB during AffineExpr parsing) noticed that divideCeilSigned and divideFloorSigned would overflow when Numerator = INT_MIN, and Denominator = -1. This observation has already been made by DynamicAPInt, and it has code to check this. To avoid checks in multiple callers, centralize this query in MathExtras, and change divideCeilSigned/divideFloorSigned to assert on overflow.	2024-07-09 09:33:46 +01:00
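The overflow case described in the commit above is easy to state outside C++. Python integers are unbounded, so the sketch below models 64-bit limits explicitly; the function names are Python stand-ins chosen for illustration and are not claimed to match the MathExtras signatures.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def divide_signed_would_overflow(numerator: int, denominator: int) -> bool:
    """The only overflowing signed division: INT64_MIN / -1."""
    return numerator == INT64_MIN and denominator == -1


def divide_floor_signed(numerator: int, denominator: int) -> int:
    """Floor division that asserts on the overflow case instead of wrapping."""
    assert denominator != 0, "division by zero"
    assert not divide_signed_would_overflow(numerator, denominator), "overflow"
    return numerator // denominator


if __name__ == "__main__":
    print(divide_floor_signed(-7, 2))
    print(divide_signed_would_overflow(INT64_MIN, -1))
    # The mathematical quotient does not fit in 64 bits:
    print(INT64_MIN // -1 > INT64_MAX)
```

Centralising the query lets callers test for the bad input once and choose their own fallback, while the division helper itself can simply assert.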
Ramkumar Ramachandra	db791b278a	mlir/LogicalResult: move into llvm (#97309 ) This patch is part of a project to move the Presburger library into LLVM.	2024-07-02 10:42:33 +01:00
Johannes Reifferscheid	22dfa1aa2c	[mlir] Fold ceil/floordiv with negative RHS. (#97031 ) Currently, we only fold if the RHS is a positive constant. There doesn't seem to be a good reason to do that. The comment claims that division by negative values is undefined, but I suspect that was just copied over from the `mod` simplifier.	2024-06-30 11:53:04 +02:00
Johannes Reifferscheid	52216349b6	Do not trigger UB during AffineExpr parsing. (#96896 ) Currently, parsing expressions that are undefined will trigger UB during compilation (e.g. `9223372036854775807 * 2`). This change instead leaves the expressions as they were. This change is an NFC for compilations that did not previously involve UB.	2024-06-28 07:31:33 +02:00
Kazu Hirata	b7b337fb91	[mlir] Use llvm::unique (NFC) (#96415 )	2024-06-24 11:54:02 -07:00
Nikita Popov	0aea1f2f21	[mlir] Add missing ManagedStatic.h includes (NFC)	2024-06-21 16:13:41 +02:00
Niranjan Hasabnis	abd95342f0	Reimplementing target description concept using DLTI attribute (#92138 ) and Interfaces. This is a newer implementation of PR https://github.com/llvm/llvm-project/pull/85141 and [RFC](https://discourse.llvm.org/t/rfc-target-description-and-cost-model-in-mlir/76990) by considering reviews and comments on the original PR. As an example of attributes supported by this commit: ``` module attributes { dlti.target_system_spec = #dlti.target_device_spec< #dlti.dl_entry<"dlti.device_id", 0: ui32>, #dlti.dl_entry<"dlti.device_type", "CPU">, #dlti.dl_entry<"dlti.L1_cache_size_in_bytes", 8192 : ui32>>, #dlti.target_device_spec < #dlti.dl_entry<"dlti.device_id", 1: ui32>, #dlti.dl_entry<"dlti.device_type", "GPU">, #dlti.dl_entry<"dlti.max_vector_op_width", 64 : ui32>>, #dlti.target_device_spec < #dlti.dl_entry<"dlti.device_id", 2: ui32>, #dlti.dl_entry<"dlti.device_type", "XPU">>> } ```	2024-06-19 19:40:08 +01:00
Ramkumar Ramachandra	0fb216fb2f	mlir/MathExtras: consolidate with llvm/MathExtras (#95087 ) This patch is part of a project to move the Presburger library into LLVM.	2024-06-11 23:00:02 +01:00
Will Dietz	46e41c8631	[mlir] Sanitize identifiers with leading symbol. (#94795 ) Presently, if name starts with a symbol it's converted to hex which may cause the result to be invalid by starting with a digit. Address this and add a small test. Co-authored-by: Will Dietz <w@wdtz.org>	2024-06-10 19:12:34 -05:00
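The failure mode in the commit above (a leading symbol is hex-encoded, and the hex may begin with a digit) can be illustrated with a simplified sanitizer. This is a sketch under stated assumptions, not MLIR's implementation: the function name, the `allowed` parameter, the space-to-underscore rule, and the underscore prefix are all choices made for this example.

```python
def sanitize_identifier(name: str, allowed: str = "_") -> str:
    """Hex-encode disallowed characters, then fix a leading digit (sketch)."""
    pieces = []
    for ch in name:
        if (ch.isascii() and ch.isalnum()) or ch in allowed:
            pieces.append(ch)
        elif ch == " ":
            pieces.append("_")
        else:
            # Disallowed symbol: replace it with its uppercase hex code.
            pieces.append(format(ord(ch), "X"))
    result = "".join(pieces)
    # Check the leading character after encoding, since the hex code of a
    # symbol such as '+' (2B) starts with a digit.
    if result and result[0].isdigit():
        result = "_" + result
    return result


if __name__ == "__main__":
    for name in ("abc", "foo bar", "9x", "+foo"):
        print(repr(name), "->", sanitize_identifier(name))
```

Doing the leading-digit check on the encoded string, rather than on the input, is what keeps a name like `+foo` from turning into an identifier that starts with `2`.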
Mehdi Amini	2df68e0503	[MLIR] Fix generic assembly syntax for ArrayAttr containing hex float (#94583 ) When a float attribute is printed with Hex, we should not elide the type because it is parsed back as i64 otherwise.	2024-06-06 07:51:47 -07:00
Benjamin Maxwell	29a925abb6	[mlir][affine][Analysis] Add conservative bounds for semi-affine mods (#93576 ) This patch adds support for computing bounds for semi-affine mod expression to FlatLinearConstraints. This is then enabled within the ScalableValueBoundsConstraintSet to allow computing the bounds of scalable remainder loops. E.g. computing the bound of something like: ``` // `1000 mod s0` is a semi-affine. #remainder_start_index = affine_map<()[s0] -> (-(1000 mod s0) + 1000)> #remaining_iterations = affine_map<(d0) -> (-d0 + 1000)> %0 = affine.apply #remainder_start_index()[%c8_vscale] scf.for %i = %0 to %c1000 step %c8_vscale { %remaining_iterations = affine.apply #remaining_iterations(%i) // The upper bound for the remainder loop iterations should be: // %c8_vscale - 1 (expressed as an affine map, // affine_map<()[s0] -> (s0 * 8 - 1)>, where s0 is vscale) %bound = "test.reify_bound"(%remaining_iterations) <{scalable, ...}> } ``` There are caveats to this implementation. To be able to add a bound for a `mod` we need to assume the rhs is positive (> 0). This may not be known when adding the bounds for the `mod` expression. So to handle this a constraint is added for `rhs > 0`, this may later be found not to hold (in which case the constraints set becomes empty/invalid). This is not a problem for computing scalable bounds where it's safe to assume `s0` is vscale (or some positive multiple of it). But this may need to be considered when enabling this feature elsewhere (to ensure correctness).	2024-06-05 11:35:13 +01:00
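The bound claimed in the commit above follows from `0 <= a mod b <= b - 1` whenever `b > 0`. The sketch below replays the two affine maps from the example with plain integers to confirm that the remainder loop never has more than `s0 - 1` iterations left; the function name and the `trip_count` parameter are hypothetical.

```python
def remainder_loop_iterations(s0: int, trip_count: int = 1000) -> list:
    """Remaining-iteration counts seen by the remainder loop (sketch)."""
    if s0 <= 0:
        # The bound for `mod` is only valid under the assumption rhs > 0.
        raise ValueError("the bound assumes a positive right-hand side")
    # #remainder_start_index: -(1000 mod s0) + 1000
    start = -(trip_count % s0) + trip_count
    # #remaining_iterations: -d0 + 1000, evaluated at each loop index
    return [-i + trip_count for i in range(start, trip_count, s0)]


if __name__ == "__main__":
    for vscale in (1, 2, 3, 4):
        s0 = 8 * vscale
        print(s0, remainder_loop_iterations(s0), "bound:", s0 - 1)
```

The explicit check on `s0` mirrors the caveat in the message: without `rhs > 0`, the inequality that justifies the bound does not hold.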
Beal Wang	4d60be0452	[mlir] Do not print empty property (#93379 ) Skip printing property as `<<<NULL ATTRIBUTE>>>` when operation has an empty property. Co-authored-by: Biao Wang <biaow@nvidia.com>	2024-05-25 09:24:12 -06:00
Krzysztof Parzyszek	33550b43f4	[mlir] Add operator<< for printing `Block` (#92550 ) Turns out it was already in Analysis/CFGLoopInfo, so just move it to IR/AsmPrinter.	2024-05-18 08:03:19 -05:00
Kazu Hirata	dec8055a1e	[mlir] Use StringRef::operator== instead of StringRef::equals (NFC) (#91560 ) I'm planning to remove StringRef::equals in favor of StringRef::operator==. - StringRef::operator==/!= outnumber StringRef::equals by a factor of 10 under mlir/ in terms of their usage. - The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals. - S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".	2024-05-08 23:52:22 -07:00
Max191	7e35a9a0e7	[mlir] Replace dynamic sizes in insert_slice of tensor.cast canonicalization (#91352 ) In some cases this pattern may ignore static information due to dynamic operands in the insert_slice sizes operands, e.g.: ``` %0 = tensor.cast %arg0 : tensor<1x?xf32> to tensor<?x?xf32> %1 = tensor.insert_slice %0 into %arg1[...] [%s0, %s1] [...] : tensor<?x?xf32> into tensor<?x?xf32> ``` Can be rewritten into: ``` %1 = tensor.insert_slice %arg0 into %arg1[...] [1, %s1] [...] : tensor<1x?xf32> into tensor<?x?xf32> ``` This PR updates the matching in the pattern to allow rewrites like this.	2024-05-08 15:05:53 -04:00
Scott Manley	57175533da	[MLIR][IR] add -mlir-print-unique-ssa-ids to AsmPrinter (#91241 ) Add an option to unique the numbers of values, block arguments and naming conflicts when requested and/or printing generic op form. This is helpful when debugging. For example, if you have: scf.for %0 = %1 = opA %0 scf.for %0 = %1 = opB %0 And you get a verifier error which says opB's "operand #0 does not dominate this use", it looks like %0 does dominate the use. This is not intuitive. If these were numbered uniquely, it would look like: scf.for %0 = %1 = opA %0 scf.for %2 = %3 = opB %0 And thus, much clearer as to why you are getting the error since %0 is out of scope. Since generic op form should aim to give you the most possible information, it seems like a good idea to use unique numbers in this situation. Adding an option also gives those an option to use it outside of generic op form. Co-authored-by: Scott Manley <scmanley@nvidia.com>	2024-05-07 08:45:28 -07:00
Christian Ulmann	4513050f52	[MLIR] Harmonize the behavior of the folding API functions (#88508 ) This commit changes `OpBuilder::tryFold` to behave more similarly to `Operation::fold`. Concretely, this ensures that even an in-place fold returns `success`. This is necessary to fix a bug in the dialect conversion that occurred when an in-place folding made an operation legal. The dialect conversion infrastructure did not check if the result of an in-place folding legalized the operation and just went ahead and tried to apply pattern anyways. The added test contains a simplified version of a breakage we observed downstream.	2024-04-23 08:05:55 +02:00
Christian Sigg	a5757c5b65	Switch member calls to `isa/dyn_cast/cast/...` to free function calls. (#89356 ) This change cleans up call sites. Next step is to mark the member functions deprecated. See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-19 15:58:27 +02:00
Beal Wang	d488b2225d	[mlir][ods] Do not print default-valued properties when the value is equal to the default (#87970 ) This diff causes the `tblgen`-erated printProperties() function to skip printing a `DefaultValuedAttr` property when the value is equal to the default. Co-authored-by: Biao Wang <biaow@nvidia.com>	2024-04-12 10:37:50 +02:00
Andrei Golubev	be006372f3	[mlir][OpPrintingFlags] Allow to disable ElementsAttr hex printing (#85766 ) At present, large ElementsAttr is unconditionally printed with a hex string. This means that in IR large constant values often look like: dense<"0x000000000004000000080000000004000000080000000..."> : tensor<10x10xi32> Hoisting hex printing control to the user level for tooling means that one can disable the feature and get human-readable values when necessary: dense<[16, 32, 48, 500...]> : tensor<10x10xi32> Note: AsmPrinterOptions::printElementsAttrWithHexIfLarger is not always possible to be used as it requires that one exposes MLIR's command-line options in user tooling (including an actual compiler). Co-authored-by: Harald Rotuna <harald.razvan.rotuna@intel.com>	2024-04-09 02:08:32 +02:00
MaheshRavishankar	5aeb604c7c	[mlir][SCF] Modernize `coalesceLoops` method to handle `scf.for` loops with iter_args (#87019 ) As part of this extension this change also does some general cleanup 1) Make all the methods take `RewriterBase` as arguments instead of creating their own builders that tend to crash when used within pattern rewrites 2) Split `coalesePerfectlyNestedLoops` into two separate methods, one for `scf.for` and other for `affine.for`. The templatization didn't seem to be buying much there. Also general clean up of tests.	2024-04-04 13:44:24 -07:00
Matthias Springer	38113a0832	[mlir][IR] Trigger `notifyOperationReplaced` on `replaceAllOpUsesWith` (#84721 ) Before this change: `notifyOperationReplaced` was triggered when calling `RewriterBase::replaceOp`. After this change: `notifyOperationReplaced` is triggered when `RewriterBase::replaceAllOpUsesWith` or `RewriterBase::replaceOp` is called. Until now, every `notifyOperationReplaced` was always sent together with a `notifyOperationErased`, which made that `notifyOperationErased` callback irrelevant. More importantly, when a user called `RewriterBase::replaceAllOpUsesWith`+`RewriterBase::eraseOp` instead of `RewriterBase::replaceOp`, no `notifyOperationReplaced` callback was sent, even though the two notations are semantically equivalent. As an example, this can be a problem when applying patterns with the transform dialect because the `TrackingListener` will only see the `notifyOperationErased` callback and the payload op is dropped from the mappings. Note: It is still possible to write semantically equivalent code that does not trigger a `notifyOperationReplaced` (e.g., when op results are replaced one-by-one), but this commit already improves the situation a lot.	2024-04-02 10:53:57 +09:00
Jakub Kuderski	971b852546	[mlir][NFC] Simplify type checks with isa predicates (#87183 ) For more context on isa predicates, see: https://github.com/llvm/llvm-project/pull/83753.	2024-04-01 11:40:09 -04:00


1975 Commits