The current one-shot bufferization infrastructure operates on top of
TensorType and BaseMemRefType. These are non-extensible base classes of
the respective builtin types, tensor and memref, so the infrastructure
is bound to work only with the builtin tensor/memref types. At the same
time, there are customization points that allow one to provide custom
logic to control the bufferization behavior.
This patch introduces two new type interfaces, tensor-like and
buffer-like, that aim to supersede TensorType/BaseMemRefType within the
bufferization dialect and allow custom tensor/memref types to be used.
Additionally, these new type interfaces are attached to the respective
builtin types so that the switch is seamless.
Note that this patch does only minimal initial work; it does NOT
refactor the bufferization infrastructure.
See https://discourse.llvm.org/t/rfc-changing-base-types-for-tensors-and-memrefs-from-c-base-classes-to-type-interfaces/85509
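As an illustration, a minimal sketch of querying the new interfaces;
the interface and header names are taken from this patch and should be
treated as assumptions:
```
#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h"
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Because the interfaces are attached to the builtins, this accepts
// RankedTensorType/UnrankedTensorType as well as any custom type that
// implements the tensor-like interface.
static bool isTensorLike(Type type) {
  return isa<bufferization::TensorLikeType>(type);
}
```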
The current inliner disables inlining when the caller is in a region
with the SingleBlock trait while the callee function contains multiple
blocks. The SingleBlock trait is used by operations such as do/while
loops, for example fir.do_loop, fir.iterate_while, and fir.if.
Typically, calls within loops are good candidates for inlining.
However, functions with multiple blocks are also common; for example,
any function with "if () then return" will result in multiple blocks in
MLIR.
This change gives a customized inliner the flexibility to handle such
cases through two callbacks, sketched below:
- doClone: clones instructions and other information from the callee
function into the caller function.
- canHandleMultipleBlocks: checks whether functions with multiple
blocks can be inlined into a region with the SingleBlock trait.
The default behavior of the inliner remains unchanged.
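A hypothetical wiring sketch of the two hooks; only the hook names
doClone/canHandleMultipleBlocks come from this change, while the setter
names and callback signatures below are assumptions for illustration:
```
#include "mlir/IR/Builders.h"
#include "mlir/Transforms/Inliner.h"

using namespace mlir;

// Hypothetical sketch: setters and signatures are assumed, not the
// actual API added by this change.
static void configureCustomInliner(InlinerConfig &config) {
  // Permit inlining multi-block callees into SingleBlock regions.
  config.setCanHandleMultipleBlocks(
      [](Region *callerRegion) { return true; });

  // Clone instructions and other information from the callee into the
  // caller, controlling how the callee's CFG is stitched in.
  config.setDoClone([](OpBuilder &builder, Region *calleeRegion,
                       Block *insertBlock, Block::iterator insertPoint) {
    // Custom cloning logic would go here, e.g. splitting the caller
    // block to make room for the callee's CFG.
  });
}
```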
---------
Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
The added utility method moves the definitions of all SSA values that
an operation depends upon to before a given insertion point. This is
useful during transformations where such movement can make a
transformation (like fusion) more powerful.
To test the utility, add a transform dialect op that calls the move
operation. To be able to capture the `notifyMatchFailure` messages from
the transformation and to report/check these in the test, modify the
`ErrorCheckingTrackingListener` to capture the last match-failure
notification.
---------
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
TOSA MLIR profile-based validation is designed to identify the
profile/extension requirements of each operation in a TOSA MLIR graph,
ensuring that TOSA operators conform to the profiles and extensions
enabled by the target implementation.
The available profiles/extensions are reflected in the availability
property attached to each TOSA operator in the dialect. The design of
the availability mechanism, the profile/extension classes, and their
interfaces is inspired by the SPIR-V implementation.
This patch includes the following changes:
- Introduces profile and extension knowledge within the dialect and
establishes an interface to query this information.
- Implements profile-based validation logic in the pass.
- Adds a TargetEnv class that represents the capabilities enabled in the
target implementation, such as profiles, extensions, and levels (a
usage sketch follows this list).
- Adds a set of tests to ensure that profile and extension requirements
are properly attached to the operations and that validation correctly
verifies the requirements of a given operation against the target
implementation.
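A hypothetical usage sketch of the TargetEnv query; the class name
comes from this patch, but the header path, enumerator, and method
names here are assumptions:
```
#include "mlir/Dialect/Tosa/IR/TargetEnv.h"

using namespace mlir;

// Hypothetical sketch: accept an op only if the target implementation
// claims every profile/extension the op requires, e.g. the integer
// profile plus the int16 extension.
static bool meetsRequirements(const tosa::TargetEnv &targetEnv) {
  return targetEnv.allows(tosa::Profile::pro_int) &&
         targetEnv.allows(tosa::Extension::int16);
}
```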
With the removal of mlir-vulkan-runner (as part of #73457) in
e7e3c45bc70904e24e2b3221ac8521e67eb84668, this pass no longer has to be
public (previously it had to be so that the runner could use it). This
commit instead makes the pass available only for use by mlir-opt.
This is a recommit of 058d183980a2f334d085a46c32abded0557aa789 (#124301)
which had been reverted in 4573c857da88b3210d497d9a88a89351a74b5964 due
to a missing linker dependency on MLIRSPIRVTransforms in
mlir/test/lib/Pass/CMakeLists.txt (fixed in this commit).
With the removal of mlir-vulkan-runner (as part of #73457) in
e7e3c45bc70904e24e2b3221ac8521e67eb84668, this pass no longer has to be
public (previously it had to be so that the runner could use it). This
commit instead makes the pass available only for use by mlir-opt.
Use `mlir_target_link_libraries()` to link dependencies of libraries
that are not included in libMLIR, to ensure that they link to the dylib
when they are used in Flang. Otherwise, they implicitly pull in all
their static dependencies, effectively causing Flang binaries to
simultaneously link to the dylib and to static libraries, which is never
a good idea.
I have only covered the libraries that are used by Flang. If you wish, I
can extend this approach to all non-libMLIR libraries in MLIR, making
MLIR itself also link to the dylib consistently.
[v3 with more `-DBUILD_SHARED_LIBS=ON` fixes]
Use `mlir_target_link_libraries()` to link dependencies of libraries
that are not included in libMLIR, to ensure that they link to the dylib
when they are used in Flang. Otherwise, they implicitly pull in all
their static dependencies, effectively causing Flang binaries to
simultaneously link to the dylib and to static libraries, which is never
a good idea.
I have only covered the libraries that are used by Flang. If you wish, I
can extend this approach to all non-libMLIR libraries in MLIR, making
MLIR itself also link to the dylib consistently.
[v2 with fixed `-DBUILD_SHARED_LIBS=ON` build]
Use `mlir_target_link_libraries()` to link dependencies of libraries
that are not included in libMLIR, to ensure that they link to the dylib
when they are used in Flang. Otherwise, they implicitly pull in all
their static dependencies, effectively causing Flang binaries to
simultaneously link to the dylib and to static libraries, which is never
a good idea.
I have only covered the libraries that are used by Flang. If you wish, I
can extend this approach to all non-libMLIR libraries in MLIR, making
MLIR itself also link to the dylib consistently.
In 10ef20f6a629797d81252de143117e2a0bc6556d I dropped $test_libs
from $LIBS to handle them separately for the mlir-opt tool.
However, they should still be included in LINK_LIBS for the
MLIRMlirOptMain library.
While MLIR currently supports building a libMLIR.so, it does not support
actually linking against it for its own tools. When building with LTO,
this means we have to relink the world for every tool, and the resulting
binaries are large.
This adds basic support for MLIR_LINK_MLIR_DYLIB, modelled after how
CLANG_LINK_CLANG_DYLIB is implemented: Libraries that are part of
libMLIR.so should be added via mlir_target_link_libraries instead of
target_link_libraries. This will replace them with libMLIR.so if
MLIR_LINK_MLIR_DYLIB is enabled.
This adds basic support; I think there are two more things that can be
done here:
* C API unit tests should link against libMLIR-C.so. Currently these
still link statically.
* Linking the test libs (not part of libMLIR.so) still pulls in
dependencies statically that should come from libMLIR.so.
Adds a new mlir-opt test-only pass, -test-vulkan-runner-pipeline, which
runs the set of passes needed for mlir-vulkan-runner, and removes those
passes from the runner. The tests are changed to invoke mlir-opt with
this flag before invoking the runner. The passes moved are ones concerned with
lowering of the device code prior to serialization to SPIR-V. This is an
incremental step towards moving the entire pipeline to mlir-opt, to
align with other runners (see #73457).
This commit adds extra checks to the MemRef argument materializations in
the LLVM type converter. These materializations construct a
`MemRefType`/`UnrankedMemRefType` from the unpacked elements of a MemRef
descriptor or from a bare pointer.
The extra checks ensure that the inputs to the materialization function
are correct. It is possible that a user added extra type conversion
rules that convert MemRef types in a different way; the extra checks
ensure that we construct a MemRef descriptor only if the inputs are
what we expect.
This commit also drops a check around bare pointer materializations:
```
// This is a bare pointer. We allow bare pointers only for function entry
// blocks.
```
This check should not be part of the materialization function. Whether a
MemRef block argument is converted into a MemRef descriptor or a bare
pointer is decided in the lowering pattern. By the time the
materialization functions are executed, that decision has already been
made, and we should just materialize regardless of the input format.
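As a sketch of the kind of guard added (not the actual upstream code),
a ranked descriptor materialization can first verify the unpacked input
count against the descriptor layout:
```
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/ValueRange.h"

using namespace mlir;

// Minimal illustrative check, not the upstream implementation: the
// inputs can only be the unpacked fields of a ranked MemRef descriptor
// if their count matches the layout (allocated ptr, aligned ptr,
// offset, then `rank` sizes and `rank` strides).
static bool looksLikeUnpackedDescriptor(MemRefType type, ValueRange inputs) {
  return inputs.size() == 3 + 2 * static_cast<size_t>(type.getRank());
}
```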
Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline,
which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner,
and removes them from the runner. The tests are changed to invoke
mlir-opt with this flag before invoking the runner. The eventual goal is
to move all host/device code generation steps out of the runner, like
with some of the other runners.
Recommit of 17e9752267ed9c81c8da87f3a6d0e01f130b0d04. It was reverted
due to a build failure, but the build failure had in fact already been
fixed in e7302319b52e3d231216d54d10622b0698928a96.
Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline,
which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner,
and removes them from the runner. The tests are changed to invoke
mlir-opt with this flag before invoking the runner. The eventual goal is
to move all host/device code generation steps out of the runner, like
with some of the other runners.
Fix the builds with LLVM_TOOL_LLVM_DRIVER_BUILD enabled.
LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES is not completely
compatible with export_executable_symbols, as the latter will be
ignored if the former is set to NO.
Fix the issue by passing whether symbols need to be exported to
llvm_add_executable so that the link flag can be determined directly,
without calling export_executable_symbols_* later.
`LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES` is not completely
compatible with `export_executable_symbols`, as the latter will be
ignored if the former is set to NO.
Fix the issue by passing whether symbols need to be exported to
`llvm_add_executable` so that the link flag can be determined directly,
without calling `export_executable_symbols_*` later.
Add metadata to Diagnostic.
Motivation: we have a use case where we want to do some filtering in
our customized diagnostic handler based on custom information that is
not the `location`, `severity`, or `diagnostic arguments` member
variables of `Diagnostic`. Specifically, we want to add a unique ID to
the `Diagnostic` for the handler to filter on, in a compiler pass that
emits errors from async tasks with multithreading, where the diagnostic
handling is associated with the task.
This patch adds a `metadata` field to `mlir::Diagnostic` as a general
solution. `metadata` is of type `SmallVector<DiagnosticArgument, 0>` to
save memory and to reuse the existing `DiagnosticArgument` as the
metadata type.
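A minimal sketch of the intended use in a handler, assuming the
accessor added by this patch is named `getMetadata()`; the task-ID
tagging convention itself is hypothetical:
```
#include "mlir/IR/Diagnostics.h"
#include "mlir/IR/MLIRContext.h"

using namespace mlir;

// Filter diagnostics by a task ID stored in the metadata; only the
// getMetadata() accessor comes from this patch, the convention is ours.
static void installTaskFilter(MLIRContext &ctx, int64_t taskId) {
  ctx.getDiagEngine().registerHandler([taskId](Diagnostic &diag) {
    for (DiagnosticArgument &arg : diag.getMetadata())
      if (arg.getKind() ==
              DiagnosticArgument::DiagnosticArgumentKind::Integer &&
          arg.getAsInteger() == taskId)
        return success(); // Handled: belongs to our task.
    return failure(); // Not ours; let other handlers see it.
  });
}
```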
### Description
This PR builds on #99872. It implements a minimal version of function
body vector unrolling that converts vector types into 1-D vectors with
a size supported by SPIR-V (2, 3, or 4, depending on the original
dimension). The ops that are currently supported include those with
elementwise traits (e.g. `arith.addi`), `vector.reduction`, and
`vector.transpose`. This PR also includes new LIT tests that only check
for vector unrolling.
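For intuition, a minimal sketch of the target-length choice described
above (not the pass's actual logic):
```
#include <cstdint>

// Pick a SPIR-V-friendly 1-D unroll length (4, 3, or 2) that evenly
// divides the original vector size, falling back to scalar pieces.
static int64_t chooseSpirvUnrollLength(int64_t size) {
  for (int64_t len : {4, 3, 2})
    if (size % len == 0)
      return len;
  return 1;
}
```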
### Future Plans
- Support more ops
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
### Description
This PR implements a minimal version of function signature conversion
that unrolls vectors into 1-D vectors with a size supported by SPIR-V
(2, 3, or 4, depending on the original dimension). This PR also
includes new unit tests that only check for function signature
conversion.
### Future Plans
- Check for capabilities that support vectors of size 8 or 16.
- Set up `OneToNTypeConversion` and `DialectConversion` to replace the
current implementation that uses `GreedyPatternRewriteDriver`.
- Introduce other vector unrolling patterns to cancel out the
`vector.insert_strided_slice` and `vector.extract_strided_slice` ops and
fully legalize the vector types in the function body.
- Handle `func::CallOp` and declarations.
- Restructure the code in `SPIRVConversion.cpp`.
- Create test passes for testing sets of patterns in isolation.
- Optimize the way the original shape is split into target shapes, e.g.
`vector<5xi32>` can be split into `vector<4xi32>` and
`vector<1xi32>`.
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
This patch introduces pattern rewrites that reduce the rank of named
linalg contraction ops with unit spatial dim(s) to other named
contraction ops. For example, `linalg.batch_matmul` with batch size 1
becomes `linalg.matmul`, and `linalg.matmul` with a unit LHS spatial
dim becomes `linalg.vecmat`, etc. These patterns don't support reducing
the rank along a reduction dimension, as that would not convert to
another named contraction op.
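A minimal driver sketch for applying these rewrites; the
populate-function name follows this patch and should be treated as an
assumption:
```
#include "mlir/Dialect/Linalg/Transforms/Transforms.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

using namespace mlir;

// Apply the rank-reducing contraction patterns greedily, e.g. rewriting
// linalg.batch_matmul with batch size 1 into linalg.matmul.
static LogicalResult reduceContractionRanks(Operation *root) {
  RewritePatternSet patterns(root->getContext());
  linalg::populateContractionOpRankReducingPatterns(patterns);
  return applyPatternsAndFoldGreedily(root, std::move(patterns));
}
```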
This patch removes the last vestiges of the old gpu serialization
pipeline. To compile GPU code, use target attributes instead.
See [Compilation overview | 'gpu' Dialect - MLIR
docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for
additional information on the target attributes compilation pipeline
that replaced the old serialization pipeline.
When the integer range analysis was first developed, a pass that did
integer range-based constant folding was developed alongside it and
used as a test pass. There was an intent to add such folding to SCCP,
but that hasn't happened.
Meanwhile, -int-range-optimizations was added to the arith dialect's
transformations. The cmpi simplification in that pass is a strict subset
of the constant folding that lived in
-test-int-range-inference.
This commit moves the former test pass into -int-range-optimizations,
subsuming its previous contents. It also adds an optimization from
rocMLIR where `rem{s,u}i` operations that are no-ops are replaced by
their left operands.
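For intuition, a minimal sketch of the `remui` no-op condition (not the
pass's exact code):
```
#include "mlir/Interfaces/InferIntRangeInterface.h"

using namespace mlir;

// If lhs is always strictly smaller than every possible rhs, then
// lhs % rhs == lhs, so remui can be replaced by its left operand.
// Both ranges are assumed to have the same bitwidth.
static bool remuiIsNoop(const ConstantIntRanges &lhs,
                        const ConstantIntRanges &rhs) {
  return lhs.umax().ult(rhs.umin());
}
```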
In today's repo, the `llvm.emit_c_interface` attribute of a func op is
handled outside of `mlir::convertFuncOpToLLVMFuncOp`, in the
`FuncOpConversion` pattern. In some cases, `FuncOpConversion` can't be
directly reused, but we still want to reuse the code that emits the C
interface for `llvm.emit_c_interface`.
Changes in this PR:
* Move the code that generates the C interface for
"llvm.emit_c_interface" into `mlir::convertFuncOpToLLVMFuncOp` so that
it can be reused.
* Add a unit test to verify that the C interface for JIT can be
generated correctly when only `convertFuncOpToLLVMFuncOp` is called.
* Remove `FuncOpConversionBase`.
---------
Co-authored-by: Fung Xie <ftse@nvidia.com>
Add uplifting from `scf.while` to `scf.for`.
This uplifting expects a very specific op pattern:
* a `before` block consisting of a single `arith.cmp` op
* an `after` block containing `arith.addi`
A rough matcher for this shape is sketched below. We also have a set of
patterns to clean up `scf.while` loops to get them close to the desired
form; they will be added in separate PRs.
This is part of upstreaming the `numba-mlir` scf uplifting pipeline:
`cf -> scf.while -> scf.for -> scf.parallel`
Original code:
https://github.com/numba/numba-mlir/blob/main/mlir/lib/Transforms/PromoteToParallel.cpp
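A rough matcher for the expected shape, for illustration only; the real
pattern must also relate the compared and incremented values to the
loop-carried variables:
```
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/SCF/IR/SCF.h"

using namespace mlir;

// Illustrative shape check: `before` is a single arith.cmpi feeding
// scf.condition, and `after` contains the arith.addi induction update.
static bool looksUpliftable(scf::WhileOp loop) {
  Block &before = loop.getBefore().front();
  auto cond = cast<scf::ConditionOp>(before.getTerminator());
  auto cmp = cond.getCondition().getDefiningOp<arith::CmpIOp>();
  if (!cmp || cmp->getBlock() != &before)
    return false;
  Block &after = loop.getAfter().front();
  return !after.getOps<arith::AddIOp>().empty();
}
```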
The composite pass allows running a sequence of passes in a loop until
a fixed point or a maximum number of iterations is reached. The usual
candidates are canonicalize+CSE, as canonicalize can open more
opportunities for CSE and vice versa.
This patch adds the `LowerVectorToArmNeonPattern` patterns to the
ArmNeon dialect.
This pattern inspects `vector.contract` ops that can be 1-1 mapped to
an `arm.neon.smmla` intrinsic. The contract ops must be separated into
tiles whose inputs must match those of a single smmla op (`2x8xi32`
inputs and `2x2xi32` output). The `vector.contract` inputs must be
sign-extended from narrow types (<=i8) to be converted. If all
conditions are met, an smmla op is inserted, with additional
`vector.shape_cast` ops to handle linearizing the input and output
dimensions.
This is another follow-up to #83004, which made the same change for
`MLIR_CUDA_CONVERSIONS_ENABLED`. Like the previous PR, this commit
exposes the mentioned CMake variable through `mlir-config.h` and uses
the macro that is introduced with the same name. This replaces the
macro `MLIR_ENABLE_DEPRECATED_GPU_SERIALIZATION`, which the CMake files
previously defined manually.
`isContiguousAccess` is an important affine analysis utility, but it is
only tested very indirectly via passes like vectorization and is not
exposed. Expose it and add a test pass for it that will make it
easier/feasible to write test cases. This is especially needed since
the utility can be significantly enhanced in power, and we need a test
pass to exercise it directly.
This pass can in the future be used to test the utility for invariant
accesses as well.
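A usage sketch, assuming the utility is exposed with roughly this shape
(induction variable, memory op, and an out-parameter for the
fastest-varying memref dimension):
```
#include "mlir/Dialect/Affine/Analysis/LoopAnalysis.h"
#include "mlir/Dialect/Affine/IR/AffineOps.h"

using namespace mlir;

// Ask whether loadOp's access is contiguous along forOp's induction
// variable; memRefDim receives the fastest-varying dimension if so.
static bool accessIsContiguous(affine::AffineForOp forOp,
                               affine::AffineLoadOp loadOp) {
  int memRefDim = -1;
  return affine::isContiguousAccess(forOp.getInductionVar(), loadOp,
                                    &memRefDim);
}
```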
The ArmSME compilation pipeline has evolved significantly and is now
sufficiently complex that it warrants a proper lowering pipeline
that encapsulates the various passes and orderings. Currently the
pipeline is loosely defined in our integration tests, but these have
diverged and are not using the same passes or ordering everywhere.
This patch introduces a test-lower-to-arm-sme pipeline, mirroring
test-lower-to-llvm, that provides some sanity when running e2e examples
and can be used as a reference for targeting ArmSME in MLIR.
All the integration tests are updated to use this pipeline. The
intention is to productize the pipeline once it becomes more mature.
This op is the inverse of all-gather. It is useful to have an explicit
concise representation instead of having a blob of slicing logic.
Add lowering for the op that slices from the tensor based on the
in-group process index.
Make resharding generate an all-slice instead of inserting the slicing
logic directly.
This removes 3 dead function decls from mlir-opt:
- registerTestLowerToNVVM - recently removed in #75775 when NVVM was
productized.
- registerTestPreparationPassWithAllowedMemrefResults - removed in
D90778 (f7bc56826616).
- registerTestGenericIRVisitorsInterruptPass - added in D116230
(8067ced144a2), but the definition never existed; the pass is
registered by registerTestGenericIRVisitorsPass.
In the process, a couple of test transform dialect ops are added. These
operations are not intended to be used as fully fleshed-out
transformation ops; they are added just for testing.
A separate operation is added to `LinalgTransformOps.td` to convert a
`TilingInterface` operation to loops using the
`generateScalarImplementation` method implemented by the
operation. Eventually this and other operations related to tiling
using the `TilingInterface` need to move to a better place (i.e., out
of the `Linalg` dialect).
* Rename mesh.process_index -> mesh.process_multi_index.
* Add mesh.process_linear_index op.
* Add lowering of mesh.process_multi_index into an expression using
mesh.process_linear_index, mesh.cluster_shape and
affine.delinearize_index.
This is useful for lowering mesh ops and preparing them for further
lowering where the runtime may have only the linear index of a
device/process. For example, in MPI we have a rank (a linear index)
within a communicator. The delinearization involved is illustrated
below.
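For intuition, the row-major delinearization that the lowering
expresses with mesh.process_linear_index, mesh.cluster_shape, and
affine.delinearize_index, written in plain C++ for illustration:
```
#include <cstdint>
#include <vector>

// Recover a process's multi-index from its linear index (rank) given
// the mesh shape, using row-major order.
static std::vector<int64_t> delinearize(int64_t linear,
                                        const std::vector<int64_t> &shape) {
  std::vector<int64_t> multi(shape.size());
  for (int64_t i = static_cast<int64_t>(shape.size()) - 1; i >= 0; --i) {
    multi[i] = linear % shape[i];
    linear /= shape[i];
  }
  return multi;
}
```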
Support WalkResult for AffineExpr walks, allowing a walk to be
interrupted when a condition is met, along the lines of
Operation::walk. Also, switch from std::function to llvm::function_ref
for the walk function.
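A minimal sketch of an interruptible walk, assuming the walk returns
the WalkResult in the same way Operation::walk does:
```
#include "mlir/IR/AffineExpr.h"
#include "mlir/IR/Visitors.h"

using namespace mlir;

// Stop walking as soon as a mod sub-expression is found.
static bool containsMod(AffineExpr expr) {
  WalkResult result = expr.walk([](AffineExpr e) {
    return e.getKind() == AffineExprKind::Mod ? WalkResult::interrupt()
                                              : WalkResult::advance();
  });
  return result.wasInterrupted();
}
```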
Make it so that PDL in pattern rewrites can be optionally disabled.
PDL is still enabled by default and is not optional in the Bazel build,
so this should be a NOP for most folks, while enabling others to
disable it.
This only works with tests disabled. With tests enabled, this still
compiles, but tests fail, as there is no lit config yet to disable
tests that depend on PDL rewrites.