We currently do not have masked vectorization support for tensor.pad with
low padding. However, we can allow it in the special case where the
result dimension after padding is a unit dim. The reason is that when we
actually have a low pad on a unit dim, the input size of that dimension
will be (or should be, for correct IR) dynamically zero, and hence we
will create an all-zero mask, which is correct. If the low pad is
dynamically zero, the lowering is correct as well.
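A minimal sketch of the case this enables (hypothetical shapes and values, not taken from the patch):
```mlir
// %lo is the dynamic low pad; for the result dim to be a unit dim with a
// non-zero low pad, the padded input dim must be dynamically zero, so the
// generated mask for that dim is all-zero.
%padded = tensor.pad %src low[%lo, 0] high[0, 0] {
^bb0(%i: index, %j: index):
  tensor.yield %pad_value : f32
} : tensor<?x8xf32> to tensor<1x8xf32>
```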
---------
Signed-off-by: Nirvedh <nirvedh@gmail.com>
Add a pattern that bubbles up tensor.extract_slice through
tensor.collapse_shape.
The pattern is registered in a pattern population function that is used
by the transform op
transform.apply_patterns.tensor.bubble_up_extract_slice and by the
transform op transform.structured.fuse as a cleanup pattern.
This pattern enables tiling and fusing op chains that contain
tensor.collapse_shape when added as a cleanup pattern of the
tile-and-fuse utility.
Without it this would not be possible, as tensor.collapse_shape does not
implement the tiling interface. This is an additional pattern to the one
added in PR #126898.
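A minimal sketch of the rewrite direction (hypothetical shapes, not from the patch): extracting a slice of a collapsed tensor is rewritten so the slice is taken from the source and the collapse happens afterwards.
```mlir
// Before: the slice is taken from the collapsed result.
%c = tensor.collapse_shape %src [[0, 1]] : tensor<2x8xf32> into tensor<16xf32>
%s = tensor.extract_slice %c[0] [8] [1] : tensor<16xf32> to tensor<8xf32>

// After (sketch): the extract_slice is bubbled up above the collapse.
%s0 = tensor.extract_slice %src[0, 0] [1, 8] [1, 1] : tensor<2x8xf32> to tensor<1x8xf32>
%c0 = tensor.collapse_shape %s0 [[0, 1]] : tensor<1x8xf32> into tensor<8xf32>
```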
This fixes the current lowering of `arith.ceildivsi` in the arith-expand
pass, which was previously incorrect. The new version is based on the
lowering of `arith.floordivsi`, and will not introduce new undefined
behavior or poison during the lowering. It also replaces one division
with a multiplication.
The previous lowering of `ceildivsi(n, m)` was the following:
```
x = (m > 0) ? -1 : 1
(n*m>0) ? ((n+x) / m) + 1 : - (-n / m)
```
This caused two problems:
* In the case where `n` is INT_MIN and `m` is positive, the result would
be poison instead of an actual value.
* In the case where `n` is INT_MAX and `m` is `-1`, this would trigger
undefined behavior, while the original code wouldn't. This is because
`n+x` would be equal to `INT_MIN` (`INT_MAX + 1`), so the `(n+x) / m`
division would overflow and trigger UB.
There are cases in SPIR-V shaders where values need to be yielded from
the selection region to produce valid MLIR. For example (part of a
SPIR-V shader decompiled to GLSL):
```
bool _115;
if (_107)
{
    // ...
    float _200 = fma(...);
    // ...
    _115 = _200 < _174;
}
else
{
    _115 = _107;
}
bool _123;
if (_115)
{
    // ...
    float _213 = fma(...);
    // ...
    _123 = _213 < _174;
}
else
{
    _123 = _115;
}
```
This patch extends `mlir.selection` so it can return values, with
`mlir.merge` used as a "yield" operation. This maintains compatibility
with code that does not yield any values, as well as the assumption that
`mlir.merge` is the only operation in the merge block of the selection
region.
This patch fixes the following bugs:
- In SparseBackwardAnalysis, the setToExitState function should
propagate changes when it modifies the lattice. Previously, this issue
was masked because multi-block scenarios were not tested and because
backward dataflow analysis traverses from the end of the program.
- The liveness analysis method for determining whether a non-forwarded
operand of a branch/region-branch operation is live is incorrect, which
may cause variables that are actually live to be marked as not live.
Canonicalizes a chain of `linalg.unpack -> tensor.extract_slice` into a
single `linalg.unpack` with reduced dest sizes. This only happens when
the unpack op's only user is a non-rank-reducing slice with zero offsets
and unit strides.
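A minimal sketch of the folded pattern (hypothetical shapes and SSA names, not from the patch):
```mlir
// Before: the unpack's only user is a zero-offset, unit-stride slice.
%u = linalg.unpack %src inner_dims_pos = [0, 1] inner_tiles = [8, 8]
    into %dest : tensor<2x2x8x8xf32> -> tensor<16x16xf32>
%s = tensor.extract_slice %u[0, 0] [12, 12] [1, 1]
    : tensor<16x16xf32> to tensor<12x12xf32>

// After: a single unpack with reduced dest sizes (%smaller_dest is the
// correspondingly smaller destination tensor).
%u2 = linalg.unpack %src inner_dims_pos = [0, 1] inner_tiles = [8, 8]
    into %smaller_dest : tensor<2x2x8x8xf32> -> tensor<12x12xf32>
```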
---------
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Signed-off-by: Max Dawkins <maxdawkins19@gmail.com>
Co-authored-by: Max Dawkins <maxdawkins19@gmail.com>
This commit extends the lowering of amdgpu.mfma to handle the new
double-rate MFMAs in gfx950 and adds tests for these operations.
It also adds support for MFMAs on small floats (f6 and f4), which are
implemented using the "scaled" MFMA intrinsic with a scale value of 0 in
order to obtain an unscaled MFMA.
This commit does not add an `amdgpu.scaled_mfma` operation, as that is
future work.
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
Add `no_inline` and `always_inline` attributes for CallOps in MLIR, so
that inlining can be forced or suppressed for an individual call site
without placing the attribute on the callee's `FuncOp`.
These attributes will be used in a future PR in Flang
(`[NO]INLINE` directive).
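A minimal sketch of the intended usage (hypothetical example; the exact call op and attribute spelling are assumptions, not taken from the patch):
```mlir
llvm.func @callee()

llvm.func @caller() {
  // Hypothetical call-site attributes: suppress or force inlining for a
  // specific call without annotating @callee itself.
  llvm.call @callee() {no_inline} : () -> ()
  llvm.call @callee() {always_inline} : () -> ()
  llvm.return
}
```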
During the transition from debug intrinsics to debug records, we used
several different command-line options to customise handling: the
printing of debug records to bitcode and textual IR could be independent
of how the debug-info was represented inside a module, and whether the
autoupgrader ran could be customised. This was all valuable during
development, but now that totally removing debug intrinsics is coming
up, this patch removes those options in favour of a single flag
(experimental-debuginfo-iterators), which enables autoupgrade, in-memory
debug records, and debug record printing to bitcode and textual IR.
We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.
There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. For example,
print-non-instruction-debug-info.ll: the test suite now checks for
debug records in all tests, and we don't want to check that we can print
as intrinsics. Or the update_test_checks tests -- these are duplicated
with write-experimental-debuginfo=false to ensure file writing for
intrinsics is correct, but that's something we're imminently going to
delete.
A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a
zero-cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it now works either way.
* salvage-cast-debug-info.ll: was testing that intrinsics-in-memory get
salvaged; that isn't necessary now.
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values; dbg-records take the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent gnochange issues that get
fixed by RemoveDIs.
Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
Targeted rewrite of a linalg.copy on memrefs to a memref.copy.
This is useful when bufferizing copies to a linalg.copy, applying some
transformations, and then rewriting the copy into a memref.copy.
If the element types of the source and destination differ, or if the
source is a scalar, the transform produces a silenceable failure.
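A minimal sketch of the rewrite (hypothetical shapes, not from the patch):
```mlir
// Before: a copy expressed as a linalg op on memrefs.
linalg.copy ins(%src : memref<16xf32>) outs(%dst : memref<16xf32>)

// After the targeted rewrite:
memref.copy %src, %dst : memref<16xf32> to memref<16xf32>
```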
- For splat dense attributes, the number of parsed elements must be 2.
- For non-splat dense attributes, the number of parsed elements must be
twice the number of elements in the type.
Fixes #132859.
This replaces #125361.
- communicator is mandatory
- new mpi.comm_world
- new mpi.comm_split
- lowering and tests
---------
Co-authored-by: Sergio Sánchez Ramírez <sergio.sanchez.ramirez+git@bsc.es>
* Improve the verifier of `memref.subview` to detect out-of-bounds
extractions.
* Improve the documentation of `memref.subview` to make clear that
out-of-bounds extractions are not allowed. Rewrite examples to use the
new `strided<>` notation instead of `affine_map` layout maps. Also
remove all unrelated operations (`memref.alloc`) from the examples.
* Fix various test cases where `memref.subview` ops ran out-of-bounds.
* Update canonicalization patterns to ensure that they do not fold IR
if doing so would generate IR that no longer verifies.
Related discussion on Discourse:
https://discourse.llvm.org/t/out-of-bounds-semantics-of-memref-subview/85293
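As a minimal sketch of what the improved verifier rejects (hypothetical shapes, not from the patch), the following subview runs out-of-bounds because offset 2 + size 8 exceeds the source size 8 in each dimension:
```mlir
%sv = memref.subview %src[2, 2] [8, 8] [1, 1]
    : memref<8x8xf32> to memref<8x8xf32, strided<[8, 1], offset: 18>>
```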
This is a re-upload of #131876, which was reverted due to failing GPU
tests. These tests were faulty and fixed in #133051.
This commit addresses a TODO in the runtime verification of
`memref.subview`. Each dimension is now verified: the offset must be
in-bounds and the slice must not run out-of-bounds.
This commit aligns runtime verification with static op verification
(which was improved in #133086).
I observed that we have boundary comments in the codebase like:
```
//===----------------------------------------------------------------------===//
// ...
//===----------------------------------------------------------------------===//
```
I also observed that there are incomplete boundary comments, like the
following, which are missing the closing line. This revision was
generated by a script that completes such boundary comments.
```
//===----------------------------------------------------------------------===//
// ...
...
```
Signed-off-by: hanhanW <hanhan0912@gmail.com>
- Moves the verification logic to the `verifyRegions` method of the
parent operation.
- Fixes a crash during verification when the last block lacks a
terminator.
Fixes #132850.
Fixes a dominance verifier error with
`FoldReshapeWithGenericOpByCollapsing` by setting the insertion point
after `producer`. The `tensor.collapse_shape` op only has a single
operand (`producer`), so it is safe to insert after the producer.
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
This commit adds an overload of `replaceOpWithMultiple` that accepts
additional container types, which has been requested by users of the
new `replaceOpWithMultiple` API.
In particular, one missing container type was
`SmallVector<SmallVector<Value>>`. The "default" `ArrayRef<ValueRange>`
container type can lead to use-after-scope errors in cases such as:
```c++
// Compute the replacement value ranges. Some replacements are single
// values, some are value ranges.
SmallVector<ValueRange> repl;
repl.push_back(someValueRange); // OK
for (...) {
  // push_back(Value) triggers an implicit conversion to ValueRange,
  // which does not own the range.
  repl.push_back(someValue); // triggers use-after-scope later
}
rewriter.replaceOpWithMultiple(op, repl);
```
In this example, users should use `SmallVector<SmallVector<Value>>
repl;`.
Add wide integer emulation support for `arith.fpto*i` operations. As
with the other emulated operations, the upper and lower `N` bits of the
`i2N` integer result are emitted separately.
For the unsigned case we use the following emulation:
```c
// Example: 64 -> 32 bit emulation; the implementation is generalized to any 2N -> N case.
const double TWO_POW_N = (double)((uint64_t)1 << N); // 2^N, N is the bitwidth of the widest int supported
// f is a floating-point value representing the input of the fptoui op.
uint32_t hi = (uint32_t)(f / TWO_POW_N);      // Truncates the division result.
uint32_t lo = (uint32_t)(f - hi * TWO_POW_N); // Subtracts to get the lower bits.
```
For the signed case, we defer the emulation of the absolute value to
`fptoui` and handle the sign:
```
fptosi(fp) = sign(fp) * fptoui(abs(fp))
```
The edge cases of `NaNs, +-inf` and overflows/underflows are undefined
behaviour and the resulting numbers are the combination of the lower
bitwidth UB values. These operations also propagate poison values.
Signed-off-by: Ege Beysel <beysel@roofline.ai>
This commit pulls apart the inherent attribute dependence of classes
like EnumAttrInfo and EnumAttrCase, factoring them out into simpler
EnumCase and EnumInfo variants. This allows specifying the cases of an
enum without needing to make the cases, or the EnumInfo itself, a
subclass of SignlessIntegerAttrBase.
The existing classes are retained as subclasses of the new ones, both
for backwards compatibility and to allow attribute-specific information.
In addition, the new BitEnum class changes its default printer/parser
behavior: cases where multiple keywords appear, like having both nuw and
nsw in overflow flags, will no longer be quoted by operator<<, and the
FieldParser instance will now expect multiple keywords. All instances of
BitEnumAttr retain the old behavior.
Currently only the ctor/dtor lists and their priorities are supported.
This PR adds support for the missing data field.
A few implementation notes:
- The assembly printer has a fixed form because the previous `attr_dict`
form sorts the dict by key name, making global_dtor and global_ctor
differ in the order of printed arguments.
- LLVM's `ptr null` is converted to `#llvm.zero`; otherwise we'd have to
create a region to use the default operation conversion from `ptr null`,
which is silly given that the field only supports null or a symbol.
Adds wide integer emulation support for the `arith.subi` op. `(i2N, i2N)
-> (i2N)` ops are emulated as `(vector<2xiN>, vector<2xiN>) ->
(vector<2xiN>)`, just like the other emulation patterns.
The emulation uses the following scheme:
```
resLow = lhsLow - rhsLow;             // carry = 1 if rhsLow > lhsLow
resHigh = lhsHigh - carry - rhsHigh;
```
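A hedged scalar sketch of that scheme in arith ops (hypothetical SSA names; the actual pattern operates on the `vector<2xiN>` halves):
```mlir
// N = 32 halves of the wide operands.
%res_low   = arith.subi %lhs_low, %rhs_low : i32
// carry = 1 if rhsLow > lhsLow, i.e. if the low subtraction wrapped.
%carry     = arith.cmpi ult, %lhs_low, %rhs_low : i32
%carry_i32 = arith.extui %carry : i1 to i32
%tmp       = arith.subi %lhs_high, %carry_i32 : i32
%res_high  = arith.subi %tmp, %rhs_high : i32
```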
Signed-off-by: Ege Beysel <beysel@roofline.ai>
This patch adds the `IsolatedFromAbove` trait as a dependent trait to
the `DataLayoutOpInterface` op interface.
The motivation behind this change comes from the implementation of the
`ptr` dialect, specifically the `ptr.type_offset` op. This op produces
an int-like value that equates to the size of a memory element. This is
useful for ptr arithmetic and indexing arrays. For example:
```mlir
%f32_off = ptr.type_offset f32 : index
%addr = ptr.ptradd %ptr, %f32_off : !ptr, index
%x = ptr.load %addr : !ptr -> f32 // Read ptr[1]
```
Without the `IsolatedFromAbove` trait in the DL interface,
`ptr.type_offset` cannot be `ConstantLike`. Why?
Take the example:
```mlir
op {DL1} {
  %f32_off0 = ptr.type_offset f32 : index
  op {DL2} {
    %f32_off1 = ptr.type_offset f32 : index
  }
}
```
If `ptr.type_offset` were `ConstantLike`, then `canonicalize` would
hoist and unique the value. However, that could be wrong, as DL2 could
have an entry specifying a size that differs from the size in DL1.
The best solution to the above problem is to make
`DataLayoutOpInterface` require the `IsolatedFromAbove` trait, as it
preserves the constness of values in the DL with respect to the
canonicalizer.
The current usage of alignstack is restricted to LLVM pointer types,
whereas when it is used on parameters it can apply to other types as
well; see examples like `{i8, i8}`, `[2 x float]`, etc. in
`llvm/test/CodeGen`.
This PR lifts the restriction and adds test cases.
When the TMA descriptor is transferred from host memory to global memory
using cudaMemcpy, each thread block must insert a fence before any
thread accesses the updated tensor map in global memory. Once the tensor
map has been accessed, no additional fences are needed by that block
unless the map is modified again.
[Example from cuda programming
guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#using-tma-to-transfer-multi-dimensional-arrays).
The `tma.fence.descriptor` basically implements
`ptx::fence_proxy_tensormap_generic`.
```
#include <cuda.h>
#include <cuda/ptx>
namespace ptx = cuda::ptx;
__device__ CUtensorMap global_tensor_map;
__global__ void kernel(CUtensorMap *tensor_map)
{
  // Fence acquire tensor map:
  ptx::n32_t<128> size_bytes;
  // Since the tensor map was modified from the host using cudaMemcpy,
  // the scope should be .sys.
  ptx::fence_proxy_tensormap_generic(
      ptx::sem_acquire, ptx::scope_sys, tensor_map, size_bytes
  );
  // Safe to use tensor_map after fence inside this thread..
}

int main() {
  CUtensorMap local_tensor_map;
  // [ ..Initialize map.. ]
  cudaMemcpy(&global_tensor_map, &local_tensor_map, sizeof(CUtensorMap), cudaMemcpyHostToDevice);
  kernel<<<1, 1>>>(global_tensor_map);
}
```
The `mbarrier.create` op can create multiple mbarrier objects, and other
mbarrier-related ops can access an mbarrier using a dynamic SSA value.
This is especially useful when using mbarriers in dynamic loops.
This PR adds the `mbarrier.get` op, which returns a pointer to a
specific mbarrier object from a group of barriers created by the
nvgpu.mbarrier.create operation. It is useful when composing the NVGPU
and NVVM dialects.
Example:
```
%mbars = nvgpu.mbarrier.create
-> !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>, num_barriers = 10>
%mbar_pointer = nvgpu.mbarrier.get %mbars[%c2]
: !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>>
-> i32
```
The `ensureTerminator` override ensures that the terminator
`InParallelOp` has a region. However, if the terminator of `scf.forall`
is not an `InParallelOp`, calling `ensureTerminator` causes a crash.
Since the `InParallelOp` builder already guarantees the existence of a
region, `ForallOp::ensureTerminator` is redundant and can be safely
removed.
Fixes #130019.