This commit folds a `tensor.cast` op into a `tensor.collapse_shape` op
when following two conditions meet:
1. the `tensor.collapse_shape` op consumes result of the `tensor.cast` op.
2. `tensor.cast` op casts to a more dynamic version of the source tensor.
This is added as a canonicalization pattern in `tensor.collapse_shape` op.
Signed-Off-By: Gaurav Shukla <gaurav@nod-labs.com>
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D130650
InsertSliceOp and ParallelInsertSliceOp are very similar and can share some of the bufferization analysis code.
Differential Revision: https://reviews.llvm.org/D130465
Load dialects that will be generated by the extension. (Except for BufferizationDialect and MemrefDialect which are loaded already.)
Differential Revision: https://reviews.llvm.org/D130463
The `unknownTypeConversion` bufferization option (enum) is now a type converter function option. Some logic of `getMemRefType` is now handled by that function.
This change makes type conversion more controllable. Previously, there were only two options when generating memref types for non-bufferizable ops: Static identity layout or fully dynamic layout. With this change, users of One-Shot Bufferize can provide a function with custom logic.
Differential Revision: https://reviews.llvm.org/D129273
The result shape of a rank-reducing subview cannot be inferred in the general case. Just the result rank is not enough. The only thing that we can infer is the layout map.
This change also improves the bufferization patterns of tensor.extract_slice and tensor.insert_slice to fully support rank-reducing operations.
Differential Revision: https://reviews.llvm.org/D129144
This revision updates the op semantics to also allow rank-reducing behavior as well
as updates the implementation to reuse code between the sequential and the parallel
version of the op.
Depends on D128920
Differential Revision: https://reviews.llvm.org/D128985
This is moslty NFC and will allow tensor.parallel_insert_slice to gain
rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl.
Depends on D128857
Differential Revision: https://reviews.llvm.org/D128920
This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.)
Differential Revision: https://reviews.llvm.org/D128423
Add a failure return value and bufferization options argument. This is to keep a subsequent change smaller.
Differential Revision: https://reviews.llvm.org/D128278
This revision makes sure we accept sparse tensors as arguments
of the expand/collapse reshaping operations in the tensor dialect.
Note that the actual lowering to runnable IR is still TBD.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D128311
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
implementation of the operation based on the tile of the result
needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
source is defined by an operation that implements the
`TilingInterface` with a tiled implementation that produces the
extracted slice in-place (using the method added to
`TilingInterface`).
- A pattern is added that takes a sequence of operations that
implement the `TilingInterface` (for now `LinalgOp`s), tiles the
consumer, and greedily fuses its producers iteratively.
Differential Revision: https://reviews.llvm.org/D127809
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
implementation of the operation based on the tile of the result
needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
source is defined by an operation that implements the
`TilingInterface` with a tiled implementation that produces the
extracted slice in-place (using the method added to
`TilingInterface`).
- A pattern is added that takes a sequence of operations that
implement the `TilingInterface` (for now `LinalgOp`s), tiles the
consumer, and greedily fuses its producers iteratively.
Differential Revision: https://reviews.llvm.org/D127809
This aligns the SCF dialect file layout with the majority of the dialects.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D128049
Follow up from flipping dialects to both, flip accessor used to prefixed
variant ahead to flipping from _Both to _Prefixed. This just flips to
the accessors introduced in the preceding change which are just prefixed
forms of the existing accessor changed from.
Mechanical change using helper script
https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.
Marked all dialects that could be (reasonably) easily flipped to _Both
prefix. Updating the accessors to prefixed form will happen in follow
up, this was to flush out conflicts and to mark all dialects explicitly
as I plan to flip OpBase default to _Prefixed to avoid needing to
migrate new dialects.
Except for Standalone example which got flipped to _Prefixed.
Differential Revision: https://reviews.llvm.org/D128027
With the recent refactorings, this class is no longer needed. We can use BufferizationOptions in all places were BufferizationState was used.
Differential Revision: https://reviews.llvm.org/D127653
This change changes the bufferization so that it utilizes the new TensorCopyInsertion pass. One-Shot Bufferize no longer calls the One-Shot Analysis. Instead, it relies on the TensorCopyInsertion pass to make the entire IR fully inplacable. The `bufferize` implementations of all ops are simplified; they no longer have to account for out-of-place bufferization decisions. These were already materialized in the IR in the form of `bufferization.alloc_tensor` ops during the TensorCopyInsertion pass.
Differential Revision: https://reviews.llvm.org/D127652
The function creates dim ops for each dynamic dimension of the raked tensor
argument and returns these as values.
Differential Revision: https://reviews.llvm.org/D127533
Before this fix, the bufferization implementation made the incorrect assumption that the values yielded from the "before" region must match with the values yielded from the "after" region.
Differential Revision: https://reviews.llvm.org/D125835
Instead of requiring the client to compute the "isSplat" bit,
compute it internally. This makes the logic more consistent
and defines away a lot of "elements.size()==1" in the clients.
This addresses Issue #55185
Differential Revision: https://reviews.llvm.org/D125447
This patch updates calls to AnalysisState::getBuffer() so that we return
early with a failure if the call does not succeed.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D125251
This patch augments the `tensor-bufferize` pass by adding a conversion
rule to translate ReshapeOp from the `tensor` dialect to the `memref`
dialect, in addition to adding a unit test to validate the translation.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D125031
Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.
Differential Revision: https://reviews.llvm.org/D124298
It seems more natural than to have it as a static method of ExpandShapeOp.
Also fix a typo ("the the" -> "the").
Differential Revision: https://reviews.llvm.org/D124234
Insert a buffer copy unless the dims are guaranteed to be collapsible. In the verifier, accept collapses unless they are guaranteed to be non-collapsible.
Differential Revision: https://reviews.llvm.org/D123316
Infer a tighter MemRef type instead of always falling back to the most dynamic MemRef type. This is inefficient and caused op verification errors.
Differential Revision: https://reviews.llvm.org/D122649