This commit updates the LLVM dialect CallOp and InvokeOp to always print
the variadic callee type (previously callee type) if present. An
additional verifier checks that only variadic calls have a non-null
variadic callee type, and the builders are adapted accordingly to set
the variadic callee type for variadic calls only. Finally, the CallOp
and InvokeOp verifiers are strengthened to check that the variadic
callee type matches the call argument and result types.
The motivation of this change is to ensure that CallOp and InvokeOp have
no hidden state that is not pretty printed but is used during the export
to LLVM IR. Previously, it could happen that a call looked correct in
MLIR, but its return type changed after exporting to LLVM IR (since it
was taken from the hidden callee type attribute). After landing this
change, this is not possible anymore since the variadic callee type is
always printed if present.
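For illustration, here is a sketch of what a variadic call looks like
with this change (assuming a printf-like variadic declaration); the
variadic callee type is printed via the `vararg(...)` clause:
```mlir
llvm.func @printf(!llvm.ptr, ...) -> i32

llvm.func @caller(%fmt: !llvm.ptr, %val: i32) -> i32 {
  // The variadic callee type is always printed for variadic calls.
  %res = llvm.call @printf(%fmt, %val) vararg(!llvm.func<i32 (ptr, ...)>) : (!llvm.ptr, i32) -> i32
  llvm.return %res : i32
}
```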
This PR adds the `f8E4M3` type to MLIR.
The `f8E4M3` type follows the IEEE 754 convention:
```c
f8E4M3 (IEEE 754)
- Exponent bias: 7
- Maximum stored exponent value: 14 (binary 1110)
- Maximum unbiased exponent value: 14 - 7 = 7
- Minimum stored exponent value: 1 (binary 0001)
- Minimum unbiased exponent value: 1 − 7 = −6
- Precision specifies the total number of bits used for the significand (mantissa),
including the implicit leading integer bit = 3 + 1 = 4
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs
Additional details:
- Max exp (unbiased): 7
- Min exp (unbiased): -6
- Infinities (+/-): S.1111.000
- Zeros (+/-): S.0000.000
- NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111}
- Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240
- Min normal number: S.0001.000 = +/-2^(-6)
- Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7
- Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9)
```
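As a sketch (assuming the type lands as a regular builtin float type),
`f8E4M3` can then appear directly in IR like any other float type:
```mlir
func.func @f8_roundtrip(%arg0: f8E4M3) -> f8E4M3 {
  return %arg0 : f8E4M3
}
```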
Related PRs:
- [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat]
Add support for f8E4M3 IEEE 754 type
This PR addresses
[iree-org/iree#17976](https://github.com/iree-org/iree/issues/17976) by
using the converted `resultType` instead of the original result type
obtained from `castOp.getResultVectorType`. A new LIT test is also
included.
Add an example, as a unit test, of creating a wrapper type/view for
RankedTensorType with an encoding. This view provides a more restricted
and typed API while allowing one to avoid repeated casting queries and
direct access to the encoding.
For users with more advanced encodings, the expectation would be a
separate attribute type, but here just StringAttr is used.
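For reference, a sketch of the kind of type such a view wraps: a ranked
tensor whose encoding is a plain `StringAttr` (the encoding string here
is illustrative):
```mlir
%t = tensor.empty() : tensor<8x16xf32, "test_encoding">
```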
This patch handles dependencies specified by the `depend` clause on an
OpenMP target construct. It does this much the same way clang does it by
materializing an OpenMP `task` that is tagged with the dependencies.
The following functions are relevant to this patch:
1) `createTarget` - This function itself is largely unchanged except
that it now accepts a vector of `DependData` objects that it simply
forwards to `emitTargetCall`.
2) `emitTargetCall` - This function has changed to check whether an
outer target-task needs to be materialized (i.e., if the `target`
construct has a `nowait` or `depend` clause). If yes, it calls
`emitTargetTask` to do all the heavy lifting of creating and
dispatching the task.
3) `emitTargetTask` - The bulk of the change is here. See the large
comment explaining what it does at the beginning of this function.
The main goal of this and subsequent PRs is to unify and categorize
tests in:
* vector-transfer-flatten.mlir
This should make it easier to identify the edge cases being tested (and
how they differ), to remove duplicates, and to add tests for scalable
vectors.
The main contributions of this PR:
1. For consistency with other tests,
   `@transfer_read_flattenable_with_dynamic_dims_and_indices` is renamed
   as `@transfer_read_leading_dynamic_dims`. It is also moved near other
   tests for `xfer_read`, and its variable names are updated to match
   other `xfer_read` tests.
2. `@transfer_write_dims_mismatch_non_zero_indices_trailing_dynamic_dim`
   is renamed as `@negative_transfer_read_dynamic_dim_to_flatten` to
   better highlight that it's a negative test and to contrast it with
   `@transfer_read_leading_dynamic_dims`.
3. Similar changes for tests for `xfer_write`.
4. Make sure that we consistently use `%idx_N` (as opposed to `%idxN`).
Follow-up for #95743 and #95744
We need to distinguish ShapedTypes with and without value semantics.
This is needed for downstream users to define their custom vector and
tensor types that can work with the arith/math dialects.
RFC https://discourse.llvm.org/t/rfc-mlir-types-with-encoding/80189
Recently there was a change to materializing unrealized conversion
casts (https://github.com/llvm/llvm-project/pull/97903), which inserted
casts that previously did not exist during legalization. After these
casts are inserted and then washed away once the transformation
completes, the use-list ordering of an op changed in some cases:
`my.add %arg0(use1), %arg0(use2) --> my.add %arg0(use2), %arg0(use1)`,
which subtly changes the emitted bytecode since this is considered a
custom use-list order.
When investigating why the bytecode had changed, I added the following
logging, which helped track down the difference; in my case it showed
extra bytes in the "use-list section". With
`-debug-only=mlir-bytecode-writer`, the writer emits logs like the
following, detailing the source of written bytes:
```
emitBytes(4b) bytecode header
emitVarInt(6) bytecode version
emitByte(13) bytecode version
emitBytes(17b) bytecode producer
emitByte(0) null terminator
emitVarInt(2) dialects count
...
emitByte(5) dialect version
emitVarInt(4) op names count
emitByte(9) op names count
emitVarInt(0) dialect number
...
emitVarInt(2) dialect writer
emitByte(5) dialect writer
emitVarInt(9259963783827161088) dialect APInt
...
emitVarInt(3) attr/type offset
emitByte(7) attr/type offset
emitByte(3) section code
emitVarInt(18) section size
...
```
Note: this uses string constants and `StringLiteral`. I'm not sure if
these are washed away during compilation / OK to have around for
debugging, or if there's a better way to do this? The alternative was
adding many braces and `LLVM_DEBUG` calls at each callsite, but this
felt more error prone / likely to miss some callsites.
This commit simplifies the handling of dropped arguments and updates
some dialect conversion documentation that is outdated.
When converting a block signature, a `BlockTypeConversionRewrite` object
and potentially multiple `ReplaceBlockArgRewrite` objects are created.
During
the "commit" phase, uses of the old block arguments are replaced with
the new block arguments, but the old implementation was written in an
inconsistent way: some block arguments were replaced in
`BlockTypeConversionRewrite::commit` and some were replaced in
`ReplaceBlockArgRewrite::commit`. The new
`BlockTypeConversionRewrite::commit` implementation is much simpler and
no longer modifies any IR; that is done only in `ReplaceBlockArgRewrite`
now. The `ConvertedArgInfo` data structure is no longer needed.
To that end, materializations of dropped arguments are now built in
`applySignatureConversion` instead of `materializeLiveConversions`; the
latter function no longer has to deal with dropped arguments.
Other minor improvements:
- Add more comments to `applySignatureConversion`.
Note: Error messages around failed materializations for dropped basic
block arguments changed slightly. That is because those materializations
are now built in `legalizeUnresolvedMaterialization` instead of
`legalizeConvertedArgumentTypes`.
This commit is in preparation of decoupling argument/source/target
materializations from the dialect conversion.
This is a re-upload of #96207.
This PR addresses a [comment] made by @ftynse about the syntax for
`ForeachOp`. The syntax was modified by @muneebkhan85 in #82792, where
the attribute dictionary was moved to the middle.
This patch moves it back to its original place at the end and
introduces an optional keyword for `zip_shortest`.
[comment]:
https://github.com/llvm/llvm-project/pull/82792#pullrequestreview-2132814144
The behavior of the TOSA Cast operation when converting floating-point values to integers is to round to the nearest even. This commit aligns the behavior of folding a TOSA Cast of a float splat to int so that it also uses roundeven.
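For example (a sketch; the exact assembly may differ), folding a cast
of a float splat of 2.5 to an integer tensor now produces a splat of 2,
consistent with round-to-nearest-even:
```mlir
// With %splat_of_2_5 a constant float splat of 2.5:
%out = tosa.cast %splat_of_2_5 : (tensor<4xf32>) -> tensor<4xi32>
// The fold now produces a constant splat of 2 (round to nearest even).
```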
Updates `vectorizeScalableVectorPrecondition` so that scalable
vectorisation is only applied in well understood and tested scenarios.
It's unlikely that we would ever want an arbitrary dimension to be
scalable. While the Linalg vectoriser should be flexible enough to
handle all possibilities:
* in more "exotic" cases, we are likely to struggle with lowerings
further down the compilation stack,
* it would be impractical given the limitations of LLVM (which usually
reflect the limitations of actual hardware) - e.g. no support for
"scalable" arrays of scalable or fixed width vectors (*).
Ultimately, the goal of this patch is to better document what's
currently supported. While this PR adds some new restrictions, no
existing tests are affected.
(*) At MLIR vector level that would correspond to e.g.
`vector<[4]x8xf32>`.
Holes in a live range are points where the corresponding value does not
need to be in a tile/register. If the tile allocator keeps track of
these holes it can reuse tiles for more values (avoiding spills).
Take this simple example:
```mlir
func.func @example(%cond: i1) {
%tileA = arm_sme.get_tile : vector<[4]x[4]xf32>
cf.cond_br %cond, ^bb2, ^bb1
^bb1:
// If we end up here we never use %tileA again!
%tileB = arm_sme.get_tile : vector<[4]x[4]xf32>
"test.some_use"(%tileB) : (vector<[4]x[4]xf32>) -> ()
cf.br ^bb3
^bb2:
"test.some_use"(%tileA) : (vector<[4]x[4]xf32>) -> ()
cf.br ^bb3
^bb3:
return
}
```
If you were to calculate the liveness of %tileA and %tileB, you'd see
there is a hole in the liveness of %tileA in ^bb1:
```
        %tileA  %tileB
^bb0:   Live
^bb1:           Live
^bb2:   Live
```
The tile allocator can make use of that hole and reuse the tile ID it
assigned to %tileA for %tileB.
Move the utility to check the validity of tiling affine loop nests to
the affine loop utils and expose it to users outside the loop tiling
pass, including downstream users.
`StringSectionReader`, `ResourceSectionReader` and
`PropertiesSectionReader` are immutable after `initialize` so this PR
adds const to their parsing functions and references in `AttrTypeReader`
and `DialectReader`.
Remove anti-patterns given the default null initialization of `Value`.
Drop some extra includes while touching this file. NFC.
Co-authored-by: GitHub runner <github-runner@polymagelabs.com>
Convert the Linalg winograd_filter_transform, winograd_input_transform,
and winograd_output_transform ops into nested loops that perform matrix
multiplications with constant transform matrices.
Support several configurations of Winograd Conv2D, including F(2, 3),
F(4, 3) and F(2, 5). These configurations show that the implementation
can support different kernel sizes (3 and 5) and different output sizes
(2 and 4). Besides symmetric kernel sizes 3x3 and 5x5, this patch also
supports 1x3, 3x1, 1x5, and 5x1 kernels.
The implementation is based on the paper "Fast Algorithms for
Convolutional Neural Networks" (https://arxiv.org/abs/1509.09308).
Reviewers: ftynse, Max191, GeorgeARM, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin
Reviewed By: ftynse, Max191
Pull Request: https://github.com/llvm/llvm-project/pull/96183
### Description
This PR implements a minimal version of function signature conversion
to unroll vectors into 1-D vectors with a size supported by SPIR-V (2,
3, or 4, depending on the original dimension). This PR also includes
new unit tests that only check for function signature conversion.
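A sketch of the conversion under these assumptions (an 8-element vector
that is unrolled into size-4 pieces; the exact target sizes depend on
the original dimension):
```mlir
// Before:
func.func @f(%arg0: vector<8xi32>) -> vector<8xi32> {
  return %arg0 : vector<8xi32>
}

// After signature conversion, the argument and result are unrolled
// into 1-D vectors of a SPIR-V-supported size:
func.func @f(%arg0: vector<4xi32>, %arg1: vector<4xi32>)
    -> (vector<4xi32>, vector<4xi32>) {
  return %arg0, %arg1 : vector<4xi32>, vector<4xi32>
}
```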
### Future Plans
- Check for capabilities that support vectors of size 8 or 16.
- Set up `OneToNTypeConversion` and `DialectConversion` to replace the
current implementation that uses `GreedyPatternRewriteDriver`.
- Introduce other vector unrolling patterns to cancel out the
`vector.insert_strided_slice` and `vector.extract_strided_slice` ops and
fully legalize the vector types in the function body.
- Handle `func::CallOp` and declarations.
- Restructure the code in `SPIRVConversion.cpp`.
- Create test passes for testing sets of patterns in isolation.
- Optimize the way the original shape is split into target shapes, e.g.
`vector<5xi32>` can be split into `vector<4xi32>` and
`vector<1xi32>`.
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
A concatenation of empty tensors can be replaced by a single empty
tensor of the concatenated shape. Add this pattern to
`populateFoldTensorEmptyPatterns`.
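A sketch of the fold:
```mlir
// Before:
%a = tensor.empty() : tensor<2x8xf32>
%b = tensor.empty() : tensor<3x8xf32>
%c = tensor.concat dim(0) %a, %b
    : (tensor<2x8xf32>, tensor<3x8xf32>) -> tensor<5x8xf32>

// After applying the pattern:
%c = tensor.empty() : tensor<5x8xf32>
```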
With this PR I am trying to address:
https://github.com/llvm/llvm-project/issues/63230.
What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument, i.e., if the two block arguments
refer to the same `Value`. The operations' operands in the block will
point to the argument we already inserted. This needs to happen for all
the arguments we pass to the different successors of the parent block.
- After merging the blocks, get rid of "unnecessary" arguments, i.e., if
all the predecessors pass the same value, there is no need to pass it as
a block argument (see the sketch after this list).
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that
`BufferDeallocationSimplification` contains an analysis based on the
block structure; if we simplify the block structure (by merging and/or
dropping block arguments), the analysis becomes invalid. The solution I
found is to do a more prudent simplification when running that pass.
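As a sketch of the second simplification (the IR is illustrative; `%v`
is defined in the entry block):
```mlir
// Before: every predecessor passes the same value to ^bb3.
  cf.cond_br %cond, ^bb1, ^bb2
^bb1:
  cf.br ^bb3(%v : i32)
^bb2:
  cf.br ^bb3(%v : i32)
^bb3(%a: i32):
  "test.use"(%a) : (i32) -> ()

// After: the argument is dropped and its uses point to %v directly.
  cf.cond_br %cond, ^bb1, ^bb2
^bb1:
  cf.br ^bb3
^bb2:
  cf.br ^bb3
^bb3:
  "test.use"(%v) : (i32) -> ()
```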
**Note**: this is a rework of #96871. I ran all the integration tests
(`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.
This patch refactors the conversion of math operations to ROCDL library
calls. This pass will also be used in Flang to lower Fortran
intrinsics/math functions for OpenMP target offloading codegen.
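As a rough sketch (the `__ocml_*` callee name follows the ROCm device
library naming convention and is an assumption here, as is the exact
set of ops covered):
```mlir
// Before:
%0 = math.sin %arg0 : f32

// After conversion, the op becomes a call into the ROCm device library:
%0 = llvm.call @__ocml_sin_f32(%arg0) : (f32) -> f32
```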
Generalizes DropUnitDimFromElementwiseOps to support inner unit
dimensions.
This change stems from improving the lowering of contraction ops for
Arm SME, where we end up with inner unit dimensions on MulOp,
BroadcastOp, and TransposeOp, preventing the generation of outer
products, as discussed
[here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).
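A sketch of the generalization (the `vector.shape_cast` ops on the
operands and result are elided for brevity):
```mlir
// An elementwise op whose operands have an inner (trailing) unit dim:
%0 = arith.mulf %a, %b : vector<[4]x1xf32>
// can now be rewritten, after shape-casting the operands, to:
%1 = arith.mulf %a_cast, %b_cast : vector<[4]xf32>
```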
This is also a fix after https://github.com/llvm/llvm-project/pull/97652,
which showed an unhandled edge case when all dimensions are one: the
generated target VectorType would be `vector<f32>`, which is not
supported by `arith.mulf`. In case all dimensions would be dropped, the
target vector type is now `vector<1xf32>`.
---------
Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
```
/llvm-project/mlir/include/mlir/CAPI/Rewrite.h:21:63: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
DEFINE_C_API_PTR_METHODS(MlirRewriterBase, mlir::RewriterBase);
                                                              ^
1 error generated.
```
This exposes most of the `RewriterBase` methods to the C API. This
allows manipulating both the `IRRewriter` and the `PatternRewriter`.
The `IRRewriter` can be created from the C API, while the
`PatternRewriter` cannot.
The missing operations are the ones taking `Block::iterator` and
`Region::iterator` as parameters, as they are not exposed by the C API
yet, AFAIK.
The Python bindings for these methods and classes are not implemented.
At the moment, the in_bounds attribute has two confusing/contradicting
properties:
1. It is both optional _and_ has an effective default value.
2. The default value is "out-of-bounds" for non-broadcast dims, and
"in-bounds" for broadcast dims.
(see the `isDimInBounds` vector interface method for an example of this
"default" behaviour [1]).
This PR aims to clarify the logic surrounding the `in_bounds` attribute
by:
* making the attribute mandatory (i.e. it is always present),
* always setting the default value to "out of bounds" (that's
consistent with the current behaviour for the most common cases).
#### Broadcast dimensions in tests
As per [2], broadcast dimensions require the corresponding
`in_bounds` attribute to be `true`:
```
vector.transfer_read op requires broadcast dimensions to be in-bounds
```
The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):
```mlir
%read = vector.transfer_read %A[%base1, %base2], %f, %mask
{permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
memref<?x?xf32>, vector<4x9xf32>
```
Instead, the broadcast dimension has to be explicitly marked as
"in bounds":
```mlir
%read = vector.transfer_read %A[%base1, %base2], %f, %mask
{in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
memref<?x?xf32>, vector<4x9xf32>
```
All tests with broadcast dims are updated accordingly.
#### Changes in "SuperVectorize.cpp" and "Vectorization.cpp"
The following patterns in "Vectorization.cpp" are updated to explicitly
set the `in_bounds` attribute to `false`:
* `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`
Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and
`vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain.
Note that there is no mechanism to verify whether the corresponding
memory accesses are indeed in bounds. Still, this is consistent with
the current behaviour where the broadcast dim would be implicitly
assumed to be "in bounds".
[1]
4145ad2bac/mlir/include/mlir/Interfaces/VectorInterfaces.td (L243-L246)
[2]
https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
The main goal of this PR (and subsequent PRs), is to add more tests with
scalable vectors to:
* vector-transfer-collapse-inner-most-dims.mlir
There are quite a few cases to consider, hence this is split into
multiple PRs.
In this PR, I am making the following changes:
* All input memrefs for `xfer_read` are renamed as `%src`.
* All input memrefs for `xfer_write` are renamed as `%dest`.
* All variables representing pad values for `xfer_read` are renamed as
`%pad`.
* All vector variables (for `xfer_read` and `xfer_write`) are renamed as
`%v`.
* Add `@contiguous_inner_most_non_zero_idx_in_bounds_scalable` for
`xfer_read` (similar test already exists for `xfer_write`)
* All index variables are renamed as `%i` (1st index) and `%ii` (2nd
index).
The above were marked as TODOs in the test file - these are now
resolved. In addition (to avoid sending another PR):
* `@drop_inner_most_dim` is deleted - it duplicates
  `@contiguous_inner_most` for `xfer_write`.
* For consistency with other negative tests, `@non_unit_strides` is
  renamed as `@negative_non_unit_strides` and a similar test is added
  for `xfer_read`.
This is a follow-up for: #94490, #94604, #94906, #96214, #96227
This patch constructs the AMDGCN ISA control variable explicitly instead
of linking against the library shipped with ROCm. This change prevents
issues arising from the order in which the AMDGCN libraries are linked.
This reverts commit ac4b6b662630cd4d3bf6929f2b39ea203c0054a1.
A test change was missing for
mlir/test/Target/LLVMIR/llvmir-intrinsics.mlir in the initial commit.