This is an implementation for [RFC: Supporting Sub-Channel Quantization
in
MLIR](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694).
In order to make the review process easier, the PR has been divided into
the following commit labels:
1. **Add implementation for sub-channel type:** Includes the class
design for `UniformQuantizedSubChannelType`, printer/parser and bytecode
read/write support. The existing types (per-tensor and per-axis) are
unaltered.
2. **Add implementation for sub-channel type:** Lowering of
`quant.qcast` and `quant.dcast` operations to Linalg operations.
3. **Adding C/Python Apis:** We first define the C-APIs and build the
Python-APIs on top of those.
4. **Add pass to normalize generic ....:** This pass normalizes
sub-channel quantized types to per-tensor or per-axis types, if possible.
A design note:
- **Explicitly storing the `quantized_dimensions`, even when they can be
derived for ranked tensors.**
While it's possible to infer quantized dimensions from the static shape
of the scales (or zero-points) tensor for ranked
data tensors
([ref](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694/3)
for background), there are cases where this can lead to ambiguity and
issues with round-tripping.
Consider the example:
```
tensor<2x4x!quant.uniform<i8:f32:{0:2, 1:2}, {{s00:z00, s01:z01}}>>
```
The shape of the scales tensor is [1, 2], which might suggest that only
axis 1 is quantized. While this inference is technically correct, as the
block size for axis 0 is a degenerate case (equal to the dimension
size), it can cause problems with round-tripping. Therefore, even for
ranked tensors, we are explicitly storing the quantized dimensions.
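The ambiguity can be sketched in a few lines of C. The helper below is hypothetical and not part of MLIR: it assumes that an inference scheme would report an axis as quantized only when the scales tensor has more than one entry along it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of MLIR: guesses the quantized axes of a
 * ranked tensor from the shape of its scales tensor. An axis is reported
 * only when the scales tensor has more than one entry along it. */
size_t infer_quantized_axes(const int64_t *data_shape,
                            const int64_t *scales_shape, size_t rank,
                            size_t *axes, int64_t *block_sizes) {
  size_t count = 0;
  for (size_t axis = 0; axis < rank; ++axis) {
    if (scales_shape[axis] > 1) {
      axes[count] = axis;
      /* Block size: number of data elements that share one scale. */
      block_sizes[count] = data_shape[axis] / scales_shape[axis];
      ++count;
    }
  }
  return count;
}
```

For a data shape of [2, 4] and a scales shape of [1, 2], this sketch reports only axis 1 with block size 2, so the explicit `0:2` entry would be lost when the type is printed and parsed again.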
Suggestions welcome!
PS: I understand that the upcoming holidays may impact your schedule, so
please take your time with the review. There's no rush.
In order to meaningfully generate getters and setters from IRDL, it
makes sense to embed the names of operands, results, etc. in the IR
definition. This PR introduces this feature. Names are constrained
similarly to TableGen names.
Add support for denormal modes in the Arith dialect (binary and unary
operations).
A denormal mode is attached to every operation and can be one of three
kinds:
1) ieee: denormals are preserved and processed as defined by IEEE 754
rules.
2) preserve sign: denormal numbers are flushed to zero, but the sign of
the zero (+0 or -0) is preserved.
3) positive zero: all denormal numbers are flushed to positive zero
(+0), ignoring the sign of the original number.
The denormal mode applies to both the operands and the result. Currently
only lowering for ieee is supported.
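As an illustration of the three modes (not of the Arith dialect implementation), the following C sketch applies a mode to a single `float`. The `DenormalMode` enum and `apply_denormal_mode` function are hypothetical names chosen for this example.

```c
#include <math.h>

/* Hypothetical names; they mirror the three kinds listed above. */
typedef enum {
  DENORMAL_IEEE,
  DENORMAL_PRESERVE_SIGN,
  DENORMAL_POSITIVE_ZERO
} DenormalMode;

/* Returns the value an operation would observe under the given mode. */
float apply_denormal_mode(float value, DenormalMode mode) {
  /* Normal numbers, zeros, infinities and NaNs are never changed. */
  if (fpclassify(value) != FP_SUBNORMAL)
    return value;
  switch (mode) {
  case DENORMAL_PRESERVE_SIGN:
    /* Flush to zero, keeping the sign of the original number. */
    return signbit(value) ? -0.0f : 0.0f;
  case DENORMAL_POSITIVE_ZERO:
    /* Flush to +0 regardless of the original sign. */
    return 0.0f;
  case DENORMAL_IEEE:
  default:
    /* Denormals are preserved. */
    return value;
  }
}
```

Since the mode covers operands and result, a flushing operation would conceptually apply such a function to each operand before computing and to the result afterwards.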
The SystemZ ABI requires that i32 values should be extended when passed
between functions.
This patch fixes some tests that were lacking this, either by adding
some SystemZ specific inlinings of test functions or by disabling the
verification with the CL option that controls it.
Fixes #115564
This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating the
verifier for the `LoopWrapperInterface` accordingly.
Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.
There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s, adding some noise to this patch, but actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
LLVM already supports `DW_TAG_LLVM_annotation` entries for subprograms,
but this hasn't been surfaced to the LLVM dialect.
I'm doing the minimal amount of work to support string-based
annotations, which are useful for attaching metadata to functions so
that debuggers can offer features beyond basic DWARF.
As LLVM already supports this, this patch is not controversial.
This reverts commit fa93be4, restoring
commit d884b77, with fixes that ensure the CAPI declarations are
exported properly.
This commit implements LLVM_DIRecursiveTypeAttrInterface for the
DISubprogramAttr to ensure cyclic subprograms can be imported properly.
In the process multiple shortcuts around the recently introduced
DIImportedEntityAttr can be removed.
The `DIImportedEntity` can be used to represent imported entities like
C++'s namespace with a using directive or Fortran's module with a use
statement.
This PR adds `DIImportedEntityAttr` and 2-way translation from
`DIImportedEntity` to `DIImportedEntityAttr` and vice versa.
When an entity is imported in a function, the `retainedNodes` field of
the `DISubprogram` contains all the imported nodes. See the C++ code and
the LLVM IR below.
```
void test() {
using namespace n1;
...
}
!2 = !DINamespace(name: "n1", scope: null)
!16 = distinct !DISubprogram(name: "test", ..., retainedNodes: !19)
!19 = !{!20}
!20 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !16, entity: !2 ...)
```
This PR makes sure that the translation between MLIR and the
`retainedNodes` field happens correctly both ways.
To sidestep the cyclic dependency between `DISubprogramAttr` and `DIImportedEntityAttr`,
we have decided not to have a `scope` field in `DIImportedEntityAttr`; it is inferred
from the entity which holds the list of `DIImportedEntityAttr`. A `retainedNodes` field has been
added to `DISubprogramAttr`, which contains the list of `DIImportedEntityAttr` for that
function.
This PR currently does not handle entities imported in a global scope
but that should be easy to handle in a subsequent PR.
This exposes most of the `RewriterBase` methods to the C API.
This allows manipulating both the `IRRewriter` and the
`PatternRewriter`. The `IRRewriter` can be created from the C API,
while the `PatternRewriter` cannot.
The missing operations are the ones taking `Block::iterator` and
`Region::iterator` as parameters, as they are not exposed by the C API
yet AFAIK.
The Python bindings for these methods and classes are not implemented.
The MLIR C and Python Bindings expose various methods from
`mlir::OpPrintingFlags`. This PR adds a binding for the `skipRegions`
method, which allows skipping the printing of Regions when printing Ops.
It also exposes this option as a parameter in the Python `get_asm` and
`print` methods.
This PR handles translation of DIStringType. The changes are mostly
mechanical, translating DIStringType to/from DIStringTypeAttr. The
'stringLength' field is a 'DIVariable' in DIStringType. As there was no
`DIVariableAttr` previously, it has been added to ease the translation.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Fortran arrays use the 'dataLocation', 'rank', 'allocated' and
'associated' fields of DICompositeType. These were not available in
'DICompositeTypeAttr'. This PR adds the missing fields.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders in which targets
are organized in Visual Studio and Xcode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, but they are still necessary
when `add_custom_target`, `add_executable`, `add_library`, etc. are
used. A LLVM_SUBPROJECT_TITLE definition is used for that in each
subproject's root CMakeLists.txt.
This field is present in LLVM, but was missing from the MLIR wrapper
type. This addition allows MLIR languages to add proper DWARF info for
GPU programs.
This PR fixes the warning message due to the non ISO standard usage of
`__FUNCTION__`
```
/home/lewuathe/llvm-project/mlir/test/CAPI/transform_interpreter.c: In function ‘testApplyNamedSequence’:
/home/lewuathe/llvm-project/mlir/test/CAPI/transform_interpreter.c:21:27: warning: ISO C does not support ‘__FUNCTION__’ predefined identifier [-Wpedantic]
21 | fprintf(stderr, "%s\n", __FUNCTION__);
|
```
As `__FUNCTION__` is another name for `__func__`, and `__func__`
conforms to the ISO C specification, we should be able to use `__func__`
here.
Ref:
https://stackoverflow.com/questions/52962812/how-to-silence-gcc-pedantic-wpedantic-warning-regarding-function
Compiler
```
Ubuntu clang version 18.1.3 (1)
Target: x86_64-pc-linux-gnu
```
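A minimal sketch of the replacement, using hypothetical function names rather than the ones in `transform_interpreter.c`: `__func__` is the standard C99 predefined identifier and behaves like a static character array holding the enclosing function's name.

```c
#include <stdio.h>

/* __func__ has static storage duration, so returning it is safe. */
const char *current_function_name(void) {
  return __func__;
}

/* Same pattern as the test file: print the enclosing function's name. */
void print_function_banner(FILE *stream) {
  fprintf(stream, "%s\n", __func__);
}
```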
Being able to add custom dialects is one of the big missing pieces of
the C API. This change should make it achievable via IRDL. Hopefully
this should open custom dialect definition to non-C++ users of MLIR.
1. Explicit value means the non-zero value in a sparse tensor. If
explicitVal is set, then all the non-zero values in the tensor have the
same explicit value. The default value Attribute() indicates that it is
not set.
2. Implicit value means the "zero" value in a sparse tensor. If
implicitVal is set, then the "zero" value in the tensor is equal to the
implicit value. For now, we only support `0` as the implicit value but
it could be extended in the future. The default value Attribute()
indicates that the implicit value is `0` (same type as the tensor
element type).
Example:
```
#CSR = #sparse_tensor.encoding<{
map = (d0, d1) -> (d0 : dense, d1 : compressed),
posWidth = 64,
crdWidth = 64,
explicitVal = 1 : i64,
implicitVal = 0 : i64
}>
```
Note: this PR tests that implicitVal can be set to other values as
well. A follow-up PR will add a verifier that rejects any value that's
not zero for implicitVal.
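One way to read these two fields is sketched below for a CSR layout like the one in the example. The `PatternCsr` struct and `pattern_csr_at` function are hypothetical and assume that, once explicitVal is set, a lookup needs only the positions and coordinates: a stored entry yields the explicit value and anything else yields the implicit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CSR view; not the sparse_tensor dialect's storage layout. */
typedef struct {
  const int64_t *positions;   /* row r owns entries [positions[r], positions[r+1]) */
  const int64_t *coordinates; /* column index of each stored entry */
  int64_t explicit_val;       /* value shared by all stored entries */
  int64_t implicit_val;       /* value of every entry that is not stored */
} PatternCsr;

/* Looks up element (row, col) without consulting a values array. */
int64_t pattern_csr_at(const PatternCsr *matrix, int64_t row, int64_t col) {
  for (int64_t i = matrix->positions[row]; i < matrix->positions[row + 1]; ++i) {
    if (matrix->coordinates[i] == col)
      return matrix->explicit_val;
  }
  return matrix->implicit_val;
}
```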
This patch updates the definition of `omp.wsloop` to enforce the
restrictions of a loop wrapper operation.
Related tests are updated, but this PR on its own will not pass premerge
tests. All patches in the stack are needed before it can be compiled and
pass tests.
This commit adds a `walk` method to PyOperationBase that uses a Python
object as a callback, e.g. `op.walk(callback)`. Currently the callback
must return a walk result explicitly.
We (SiFive) have had a walk method implemented in Python in our internal
Python tool for a while. However, the Python overhead is expensive and
it didn't scale well for large MLIR files. Just replacing that walk with
this version reduced the entire execution time of the tool by 30-40%.
There are a few configs for which the tool takes several hours to
finish, so this commit significantly improves tool performance.
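The callback contract can be illustrated with a toy tree in C. Everything here (`Node`, `WalkResult`, `walk`, `count_until_negative`) is hypothetical and only mimics the idea of a callback that must explicitly return advance, skip or interrupt; the sketch assumes a pre-order traversal.

```c
#include <stddef.h>

/* Hypothetical stand-ins for an operation tree and a walk result. */
typedef enum { WALK_ADVANCE, WALK_SKIP, WALK_INTERRUPT } WalkResult;

typedef struct Node {
  int id;
  const struct Node *children;
  size_t num_children;
} Node;

typedef WalkResult (*WalkCallback)(const Node *node, void *user_data);

/* Pre-order walk: visit the node, then its children unless skipped.
 * Returns WALK_INTERRUPT if the callback stopped the walk. */
WalkResult walk(const Node *node, WalkCallback callback, void *user_data) {
  WalkResult result = callback(node, user_data);
  if (result == WALK_INTERRUPT)
    return WALK_INTERRUPT;
  if (result == WALK_SKIP)
    return WALK_ADVANCE; /* skip only this node's children */
  for (size_t i = 0; i < node->num_children; ++i) {
    if (walk(&node->children[i], callback, user_data) == WALK_INTERRUPT)
      return WALK_INTERRUPT;
  }
  return WALK_ADVANCE;
}

/* Example callback: counts visited nodes and stops at a negative id. */
WalkResult count_until_negative(const Node *node, void *user_data) {
  int *count = (int *)user_data;
  ++*count;
  return node->id < 0 ? WALK_INTERRUPT : WALK_ADVANCE;
}
```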
These resulted in link failures:
```
/usr/bin/ld:
tools/mlir/test/CAPI/CMakeFiles/mlir-capi-translation-test.dir/translation.c.o:
in function `main':
translation.c:(.text.main+0x58): undefined reference to
`LLVMContextCreate'
/usr/bin/ld: translation.c:(.text.main+0x9b): undefined reference to
`LLVMDumpModule'
/usr/bin/ld: translation.c:(.text.main+0xa3): undefined reference to
`LLVMDisposeModule'
/usr/bin/ld: translation.c:(.text.main+0xb3): undefined reference to
`LLVMContextDispose'
```
Found in mlir-hs. Not sure why this hasn't been flagged elsewhere.
This commit extends the DIDerivedTypeAttr with the `extraData` field.
For now, the type of it is limited to be a `DINodeAttr`, as extending
the debug metadata handling to support arbitrary metadata nodes does not
seem to be necessary so far.
Following the discussion from [this
thread](https://discourse.llvm.org/t/handling-cyclic-dependencies-in-debug-info/67526/11),
this PR adds support for recursive DITypes.
This PR adds:
1. DIRecursiveTypeAttrInterface: An interface that DITypeAttrs can
implement to indicate that they support recursion. See full description
in code.
2. Importer & exporter support (The only DITypeAttr that implements the
interface is DICompositeTypeAttr, so the exporter is only implemented
for composites too. Each LLVM DI type that supports mutation will need
to implement two methods, since there is no general mechanism).
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
The `%ld` specifier is defined to work on values of type `long`. The parameter given to `fprintf` is of type `intptr_t`, whose actual underlying integer type is unspecified. On Unix systems it commonly happens to be `long`, but on 64-bit Windows it is defined as `long long`.
The cross-platform way to print an `intptr_t` is to use `PRIdPTR`, which expands to the correct format specifier for `intptr_t`. This avoids any undefined behaviour and compiler warnings.
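A minimal sketch of the pattern, with a hypothetical helper name: `PRIdPTR` comes from `<inttypes.h>` and is concatenated into the format string as a string literal.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats an intptr_t portably; returns what snprintf returns. */
int format_intptr(char *buffer, size_t size, intptr_t value) {
  /* "%" PRIdPTR expands to the right conversion for this platform. */
  return snprintf(buffer, size, "%" PRIdPTR, value);
}
```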
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.
llvm-project/mlir/test/CAPI/sparse_tensor.c:50:43:
error: format specifies type 'unsigned long long' but the argument has type 'MlirSparseTensorLevelType' (aka 'unsigned long') [-Werror,-Wformat]
fprintf(stderr, "level_type: %llu\n", lvlTypes[l]);
~~~~ ^~~~~~~~~~~
%lu
1 error generated.
llvm-project/mlir/test/CAPI/sparse_tensor.c:50:42:
error: format specifies type 'unsigned long' but the argument has type 'MlirSparseTensorLevelType' (aka 'unsigned long long') [-Werror,-Wformat]
50 | fprintf(stderr, "level_type: %lu\n", lvlTypes[l]);
| ~~~ ^~~~~~~~~~~
| %llu
1 error generated.
1. The C++ enum is set through `enum class LevelType : uint64_t`.
2. The C enum is set through `typedef uint64_t level_type`. This is due
to limitations in the Windows build: setting the enum width to 64 bits
is not supported in C.
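On the C side this could look like the sketch below. The `level_type` name follows the text above, while `format_level_type` is a hypothetical helper. Printing through `PRIu64` is an assumption of this example, not necessarily what the test file does; it sidesteps the `%lu` versus `%llu` mismatch because the typedef is a fixed-width type.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* A plain typedef guarantees a 64-bit width on every platform. */
typedef uint64_t level_type;
_Static_assert(sizeof(level_type) == 8, "level_type must be 64 bits wide");

/* Formats a level type without depending on long vs. long long. */
int format_level_type(char *buffer, size_t size, level_type value) {
  return snprintf(buffer, size, "level_type: %" PRIu64, value);
}
```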
The "Dim" prefix is a legacy left-over that no longer makes sense, since
we have a very strict "Dimension" vs. "Level" definition for sparse
tensor types and their storage.