This commit simplifies the handling of dropped arguments and updates
some dialect conversion documentation that is outdated.
When converting a block signature, a `BlockTypeConversionRewrite` object
and potentially multiple `ReplaceBlockArgRewrite` are created. During
the "commit" phase, uses of the old block arguments are replaced with
the new block arguments, but the old implementation was written in an
inconsistent way: some block arguments were replaced in
`BlockTypeConversionRewrite::commit` and some were replaced in
`ReplaceBlockArgRewrite::commit`. The new
`BlockTypeConversionRewrite::commit` implementation is much simpler and
no longer modifies any IR; that is done only in `ReplaceBlockArgRewrite`
now. The `ConvertedArgInfo` data structure is no longer needed.
To that end, materializations of dropped arguments are now built in
`applySignatureConversion` instead of `materializeLiveConversions`; the
latter function no longer has to deal with dropped arguments.
Other minor improvements:
- Improve variable name: `origOutputType` -> `origArgType`. Add an
assertion to check that this field is only used for argument
materializations.
- Add more comments to `applySignatureConversion`.
Note: Error messages around failed materializations for dropped basic
block arguments changed slightly. That is because those materializations
are now built in `legalizeUnresolvedMaterialization` instead of
`legalizeConvertedArgumentTypes`.
This commit is in preparation of decoupling argument/source/target
materializations from the dialect conversion.
This patch removes the last vestiges of the old gpu serialization
pipeline. To compile GPU code use target attributes instead.
See [Compilation overview | 'gpu' Dialect - MLIR
docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for
additional information on the target attributes compilation pipeline
that replaced the old serialization pipeline.
This commit adds `emitc.size_t`, `emitc.ssize_t` and `emitc.ptrdiff_t`
types to the EmitC dialect. These are used to map `index` types to C/C++
types with an explicit signedness, and are emitted in C/C++ as `size_t`,
`ssize_t` and `ptrdiff_t`.
The `noinline`, `alwaysinline`, and `optnone` function attributes are
already being used in MLIR code for the LLVM inlining interface and in
some SPIR-V lowering, despite residing in the passthrough dictionary,
which is intended as exactly that -- a pass through MLIR -- and not to
model any actual semantics being handled in MLIR itself.
Promote the `noinline`, `alwaysinline`, and `optnone` attributes out of
the passthrough dictionary on `llvm.func` into first class unit
attributes, updating the import and export accordingly.
Add a verifier to `llvm.func` that checks that these attributes are not
set in an incompatible way according to the LLVM specification.
Update the LLVM dialect inlining interface to use the first class
attributes to check whether inlining is possible.
This commit simplifies and improves documentation for the part of the
`ConversionPatternRewriter` API that deals with signature conversions.
There are now two public functions for signature conversion:
* `applySignatureConversion` converts a single block signature. This
function used to take a `Region *` (but converted only the entry block).
It now takes a `Block *`.
* `convertRegionTypes` converts all block signatures of a region.
`convertNonEntryRegionTypes` is removed because it is not widely used
and can easily be expressed with a call to `applySignatureConversion`
inside a loop. (See `Detensorize.cpp` for an example.)
Note: For consistency, `convertRegionTypes` could be renamed to
`applySignatureConversion` (overload) in the future. (Or
`applySignatureConversion` renamed to `convertBlockTypes`.)
Also clarify when a type converter and/or signature conversion object is
needed and for what purpose.
Internal code refactoring (NFC) of `ConversionPatternRewriterImpl` (the
part that deals with signature conversions). This part of the codebase
was quite convoluted and unintuitive.
From a functional perspective, this change is NFC. However, the public
API changes, thus not marking as NFC.
Note for LLVM integration: When you see
`applySignatureConversion(region, ...)`, replace with
`applySignatureConversion(region->front(), ...)`. In the unlikely case
that you see `convertNonEntryRegionTypes`, apply the same changes as
this commit did to `Detensorize.cpp`.
---------
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
This change expands the existing instrumentation that prints the IR
before/after each pass to an output stream (usually stderr). It adds
a new configuration that will print the output of each pass to a
separate file. The files will be organized into a directory tree
rooted at a specified directory. For existing tools, a CL option
`-mlir-print-ir-tree-dir` is added to specify this directory and
activate the new printing config.
The created directory tree mirrors the nesting structure of the IR. For
example,
if the IR is congruent to the pass-pipeline
"builtin.module(pass1,pass2,func.func(pass3,pass4),pass5)", and
`-mlir-print-ir-tree-dir=/tmp/pipeline_output`, then then the tree file
tree
created will look like:
```
/tmp/pass_output
├── builtin_module_the_symbol_name
│ ├── 0_pass1.mlir
│ ├── 1_pass2.mlir
│ ├── 2_pass5.mlir
│ ├── func_func_my_func_name
│ │ ├── 1_0_pass3.mlir
│ │ ├── 1_1_pass4.mlir
│ ├── func_func_my_other_func_name
│ │ ├── 1_0_pass3.mlir
│ │ ├── 1_1_pass4.mlir
```
The subdirectories are named by concatenating the relevant parent
operation names and symbol name (if present). The printer keeps a
counter associated with ops that are targeted by passes and their
isolated-from-above parents. Each filename is given a numeric prefix
using the counter value for the op that the pass is targeting and then
prepending the counter values for each parent. This gives a naming
where it is easy to distinguish which passes may have run concurrently
vs. which have a clear ordering. In the above example, for both
`1_1_pass4.mlir` files, the first `1` refers to the counter for the
parent op, and the second refers to the counter for the respective
function.
- Some sentences are incorrectly split across list items.
- Some pre-formatted syntax is left in plaintext
- Some lines end in spaces
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Some docs were emitted into the wrong location (Polynomial/ instead of
Dialect/). Furthermore, `-gen-dialect-docs` subsumes
`-gen-attr/typedef-docs` so the latter are not required.
Add a top-level entry that includes both other files in a proper order.
Move the documentation of the ownership-based buffer deallocation pass
to a separate file. Also improve the documentation a bit and insert a
figure that explains the `bufferization.dealloc` op (copied from the
tutorial at the LLVM Dev Summit 2023).
`attr-dict` currently prints any attribute (inherent or discardable)
that does not occur elsewhere in the assembly format. `prop-dict` on the
other hand, always prints and parses all properties (including inherent
attributes stored as properties) as part of the property dictionary,
regardless of whether the properties or attributes are used elsewhere.
(with the exception of default-valued attributes implemented recently in
https://github.com/llvm/llvm-project/pull/87970).
This PR changes the behavior of `prop-dict` to only print and parse
attributes and properties that do not occur elsewhere in the assembly
format. This is achieved by 1) adding used attributes and properties to
the elision list when printing and 2) using a custom version of
`setPropertiesFromAttr` called `setPropertiesFromParsedAttr` that is
sensitive to the assembly format and auto-generated by ODS.
The current and new behavior of `prop-dict` and `attr-dict` were also
documented.
Happens to also fix https://github.com/llvm/llvm-project/issues/88506
Adds two new CMake functions to query the host system:
* `check_hwcap`,
* `check_emulator`.
Together, these functions are used to check whether a given set of MLIR
integration tests require an emulator. If yes, then the corresponding
CMake var that defies the required emulator executable is also checked.
`check_hwcap` relies on ELF_HWCAP for discovering CPU features from
userspace on Linux systems. This is the recommended approach for Arm
CPUs running on Linux as outlined in this blog post:
* https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/runtime-detection-of-cpu-features-on-an-armv8-a-cpu
Other operating systems (e.g. Android) and CPU architectures will
most likely require some other approach. Right now these new hooks are
only used for SVE and SME integration tests.
This relands #86489 with the following changes:
* Replaced:
`set(hwcap_test_file ${CMAKE_BINARY_DIR}/${CMAKE_FILES_DIRECTORY}/hwcap_check.c)`
with:
`set(hwcap_test_file ${CMAKE_BINARY_DIR}/temp/hwcap_check.c)`
The former would trigger an infinite loop when running `ninja`
(after the initial CMake configuration).
* Fixed commit msg. Previous one was taken from the initial GH PR
commit rather than the final re-worked solution (missed this when
merging via GH UI).
* A couple more NFCs/tweaks.
References to headings need to be preceded with a slash. Also,
references to headings on the same page do not need to contain the name
of the document (omitting the document name means if the name changes
the links will still be valid).
I double checked the links by building [the
website](https://github.com/llvm/mlir-www):
```shell
./mlir-www-helper.sh --install-docs ../llvm-project website
cd website && hugo serve
```
Integration tests for ArmSME require an emulator (there's no hardware
available). Make sure that CMake complains if `MLIR_RUN_ARM_SME_TESTS`
is set while `ARM_EMULATOR_EXECUTABLE` is empty.
I'm also adding a note in the docs for future reference.
Revise the printing functionality of TimingManager to accommodate
various output formats. At present, TimingManager is limited to
outputting data solely in plain text format. To overcome this
limitation, I have introduced an abstract class that serves as the
foundation for printing. This approach allows users to implement
additional output formats by extending this abstract class. As part of
this update, I have integrated support for JSON as a new output format,
enhancing the ease of parsing for subsequent processing scripts.
When importing from LLVM IR the data layout of all pointer types
contains an index bitwidth that should be used for index computations.
This revision adds a getter to the DataLayout that provides access to
the already stored bitwidth. The function returns an optional since only
pointer-like types have an index bitwidth. Querying the bitwidth of a
non-pointer type returns std::nullopt.
The new function works for the built-in Index type and, using a type
interface, for the LLVMPointerType.
Transform interfaces are implemented, direction or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and transitive dependency footprint.
Split out the interfaces into a separate library. This in turn requires
flipping the dependency from the interface on the dialect that has crept
in because both co-existed in one library. The interface shouldn't
depend on the transform dialect either.
As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
I double checked the links by building [the
website](https://github.com/llvm/mlir-www):
```
$ mlir-www-helper.sh --install-docs ../llvm-project website
$ cd website && hugo serve
```
- Fixed OpenACC's spec link format
- Add missed `OpenACCPasses.md` into Passes.md
- Add missed `MyExtensionCh4.md` into Ch4.md of tutorial of transform
As part of the renaming the Standard dialect to Func dialect, *support*
for the `func.constant` operation was added to the emitter. However, the
emitter cannot emit function types. Hence the emission for a snippet
like
```
%0 = func.constant @myfn : (f32) -> f32
func.func private @myfn(%arg0: f32) -> f32 {
return %arg0 : f32
}
```
failes with `func.mlir:1:6: error: cannot emit type '(f32) -> f32'`.
This removes `func.constant` from the emitter.
It seems the `Traits.md` file was turned into `Traits/_index.md` in
https://reviews.llvm.org/D153291, causing links to `Traits.md` to no
longer work (instead, `Traits` needs to be used).
Use the "main" transform-interpreter pass instead of the test pass.
This, along with the previously introduced debug extension, now allow
tutorials to no longer depend on test passes and extensions.
Even when `private-function-dynamic-ownership` is set, ownership should
never be passed to the callee. This can lead to double deallocs (#77096)
or use-after-free in the caller because ownership is currently passed
regardless of whether there are any further uses of the buffer in the
caller or not.
Note: This is consistent with the fact that ownership is never passed to
nested regions.
This commit fixes#77096.
* Split out `MeshDialect.h` form `MeshOps.h` that defines the dialect
class. Reduces include clutter if you care only about the dialect and
not the ops.
* Expose functions `getMesh` and `collectiveProcessGroupSize`. There
functions are useful for outside users of the dialect.
* Remove unused code.
* Remove examples and tests of mesh.shard attribute in tensor encoding.
Per the decision that Spmdization would be performed on sharding
annotations and there will be no tensors with sharding specified in the
type.
For more info see this RFC comment:
https://discourse.llvm.org/t/rfc-sharding-framework-design-for-device-mesh/73533/81
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).