Benjamin Chetioui
d3be190efb
[Mosaic GPU] Delete unused declarations of `mosaic_gpu_memcpy_async_h2d`.
...
PiperOrigin-RevId: 716616807
2025-01-17 04:34:48 -08:00
Sergei Lebedev
d34c40f6b6
[mosaic_gpu] Added a serialization pass
...
The pass adds versioning to the Mosaic GPU IR in the lowered custom calls
and can apply forward/backward migration rules. Currently, no rules are
necessary since we are at version 1.
PiperOrigin-RevId: 716596848
2025-01-17 03:12:51 -08:00
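The versioning scheme described in the commit above can be illustrated with a minimal Python sketch. This is not the Mosaic GPU implementation: `CURRENT_VERSION`, `UPGRADE_RULES`, `DOWNGRADE_RULES`, and `migrate` are hypothetical names, and a plain dict stands in for the serialized IR. With only version 1 in existence, both rule tables are empty and migration to the current version is a no-op.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for serialized IR: a dict carrying a "version" field.
IR = Dict[str, Any]
Rule = Callable[[IR], IR]

CURRENT_VERSION = 1  # The commit states the IR is at version 1.

# Rule tables keyed by the version they migrate *from*. Both are empty
# because no migration rules are needed yet.
UPGRADE_RULES: Dict[int, Rule] = {}    # version v -> v + 1
DOWNGRADE_RULES: Dict[int, Rule] = {}  # version v -> v - 1


def migrate(ir: IR, target_version: int) -> IR:
    """Steps `ir` one version at a time until it reaches `target_version`."""
    ir = dict(ir)  # Work on a shallow copy so the caller's dict is untouched.
    version = ir["version"]
    while version != target_version:
        step = 1 if version < target_version else -1
        rules = UPGRADE_RULES if step == 1 else DOWNGRADE_RULES
        if version not in rules:
            raise ValueError(f"no migration rule from version {version}")
        ir = rules[version](ir)
        version += step
        ir["version"] = version
    return ir
```

Keying each rule by its source version keeps every migration a single-step transformation, so a future version 2 would only need one new entry per direction.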
Adam Paszke
bd22bfef71
[Mosaic TPU] Use large to compact 2nd minor retiling for conversions going both ways
...
This specific retiling is its own inverse and is faster than the alternatives.
PiperOrigin-RevId: 716360070
2025-01-16 13:35:26 -08:00
Tzu-Wei Sung
5c020ee317
[Mosaic] Fix infer/apply extensions.
...
1. For apply, llvm::StringMap::insert(MapEntryTy*) causes a dangling reference unless the result of mlir::tpu::extensions::rules() is bound by const reference. However, if we do bind it by const reference, the call fails to compile because the signature is not const-qualified. Hence, change it to llvm::StringMap::insert(std::pair<...>) and get the extension rules by const reference.
2. Pass the default tiling to the infer rule; we need it to infer a single op. See the infer rule of tpu::MatmulOp.
PiperOrigin-RevId: 716274818
2025-01-16 09:57:14 -08:00
Sergei Lebedev
4221f109d1
[mosaic] Extracted serialization pass traversal logic into a reusable function
...
I will use it to implement the Mosaic GPU serialization pass in a follow-up.
PiperOrigin-RevId: 716156650
2025-01-16 02:58:06 -08:00
Tzu-Wei Sung
4a9cc9ffc1
[Mosaic] Allow passing `ApplyVectorLayoutCtx` to `tpu.apply_layout_op`.
...
This makes it consistent with the C++ API. While I'm here, fix a bug in test_concatenate.
PiperOrigin-RevId: 716016244
2025-01-15 17:47:36 -08:00
Naums Mogers
d3ba1eb339
[Mosaic] Add a macro to convert abseil StatusOr to LLVM FailureOr
...
PiperOrigin-RevId: 715943314
2025-01-15 14:19:29 -08:00
George Necula
f1b894d14a
Reverts 391bad8ff59c07c8fad7b8ce05cd0e29dee4cf1a
...
PiperOrigin-RevId: 715435319
2025-01-14 10:31:59 -08:00
Ayaka
9ba1fd2801
[Pallas TPU] Add vector support to pl.debug_print
...
PiperOrigin-RevId: 715085454
2025-01-13 13:22:21 -08:00
Adam Paszke
391bad8ff5
[Mosaic TPU] Add support for arith.fptosi with non-32bit source and target types
...
This effectively moves some of the Pallas logic to the layer below.
PiperOrigin-RevId: 714965374
2025-01-13 07:49:13 -08:00
Tomás Longeri
7852045582
[Mosaic TPU] Enable non-sublane-aligned bf16 2D load/stores for earlier TPU gens
...
It is still not efficiently implemented; this is mostly to clean up some logic. We may be able to fuse the creation of masks for different tiles into the creation of a single one, but that is also a problem for the later gens.
This also cleans up an unreachable return statement.
PiperOrigin-RevId: 714847066
2025-01-12 23:58:40 -08:00
Tomás Longeri
0930289997
[Mosaic TPU][NFC] Remove redundant num_subelems attribute from CreateSubelementMaskOp
...
PiperOrigin-RevId: 714795856
2025-01-12 19:34:25 -08:00
jax authors
a16fbffc13
[Mosaic][TPU] Add a compatibility mode to Mosaic's canonicalization pass, skipping over elementwise and matmul op insertions and/or type compat casts.
...
PiperOrigin-RevId: 714132282
2025-01-10 12:12:54 -08:00
Adam Paszke
d2a5e8d072
[Mosaic TPU] Add support for integer truncation from packed types
...
PiperOrigin-RevId: 714048232
2025-01-10 07:40:55 -08:00
Adam Paszke
07f4fd3e51
[Mosaic TPU] Fix a bug in the impl of sublane broadcasts for int8 and int4
...
PiperOrigin-RevId: 713675029
2025-01-09 08:05:25 -08:00
Adam Paszke
f96339be1e
[Mosaic TPU] Be much more aggressive in inferring large 2nd minor layouts for 16-bit types on v6
...
This often lets us avoid ambiguities between selecting the (8, 128) and (16, 128) tiling,
by biasing the layout inference to prefer the latter.
PiperOrigin-RevId: 713270421
2025-01-08 06:30:36 -08:00
Adam Paszke
5fd1b2f825
[Mosaic TPU] Add support for second minor broadcasts with packed types
...
PiperOrigin-RevId: 713259707
2025-01-08 05:45:02 -08:00
Adam Paszke
e954930eaf
[Mosaic TPU] Add support for true divide in bf16 on TPUv6
...
PiperOrigin-RevId: 713247480
2025-01-08 04:49:22 -08:00
Tzu-Wei Sung
bf94389b08
[Mosaic] Use tpu::CreateMask for getX32VmaskByPaddingEnd.
...
It was cmp + iota before.
PiperOrigin-RevId: 713240888
2025-01-08 04:18:53 -08:00
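As a rough illustration of the "cmp + iota" pattern mentioned in the commit above, the Python sketch below builds a mask first by materializing an index vector and comparing it against a bound, then directly from the bound. It assumes the mask selects indices below the bound; the actual semantics of `getX32VmaskByPaddingEnd` and `tpu::CreateMask` are not given in the commit message, and both function names here are hypothetical.

```python
from typing import List


def mask_via_iota_cmp(num_elements: int, padding_end: int) -> List[bool]:
    """Builds the mask the 'cmp + iota' way: materialize indices, then compare."""
    iota = list(range(num_elements))        # iota: 0, 1, ..., n - 1
    return [i < padding_end for i in iota]  # cmp: index < bound


def mask_direct(num_elements: int, padding_end: int) -> List[bool]:
    """Builds the same mask from its bound, without an index vector."""
    end = max(0, min(padding_end, num_elements))  # clamp the bound into range
    return [True] * end + [False] * (num_elements - end)
```

Both functions return the same mask for any bound; the direct form simply skips the intermediate index vector and the comparison.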
Sharad Vikram
4caa263a94
[Mosaic TPU] Add some elementwise canonicalizations
...
PiperOrigin-RevId: 712671502
2025-01-06 15:10:02 -08:00
Peter Hawkins
90d8f37863
Rename pybind_extension to nanobind_extension.
...
We have no remaining uses of pybind11 outside a GPU custom call example.
PiperOrigin-RevId: 712608834
2025-01-06 11:53:44 -08:00
Jevin Jiang
9f842909ce
[Mosaic TPU] Validate inserted layout in relayout-insertion pass.
...
PiperOrigin-RevId: 712595778
2025-01-06 11:15:47 -08:00
Tzu-Wei Sung
57b21541a2
[Mosaic] NFC: Pull out vreg related functions to util.
...
These functions are related to vreg manipulation and are used in different rules.
PiperOrigin-RevId: 711484002
2025-01-02 11:50:19 -08:00
jax authors
68483b8ed6
Merge pull request #25710 from apaszke:mgpu_dialect_fix
...
PiperOrigin-RevId: 711430610
2025-01-02 08:23:28 -08:00
Adam Paszke
64433435ff
Fix OSS build for the Mosaic GPU dialect
2025-01-02 15:55:03 +00:00
Tomás Longeri
ac817b48ca
[Mosaic:TPU][NFC] Clean up unused variable
...
PiperOrigin-RevId: 711412888
2025-01-02 06:57:38 -08:00
Tomás Longeri
4452960947
[Mosaic:TPU] In infer ext rule, avoid assigning offsets outside of dst first tile
...
Note that offsets outside of the first tile are still disabled (for both infer and apply), and once we support them we will want to assign offsets differently. This is mostly to avoid assigning invalid layouts (that may be not just outside the first tile, but outside the vreg slice).
PiperOrigin-RevId: 709168368
2024-12-23 15:49:39 -08:00
Tomás Longeri
3c79b98cd9
[Mosaic:TPU] Vreg-slice-aligned offset changes with scratch retiling
...
PiperOrigin-RevId: 709133729
2024-12-23 13:05:14 -08:00
Sergei Lebedev
68ec202d45
Use the right include for gmock and gtest
...
PiperOrigin-RevId: 709058082
2024-12-23 07:34:36 -08:00
Sergei Lebedev
8987867faa
[mosaic_gpu] Include Mosaic GPU dialect files into jaxlib
2024-12-23 13:46:25 +00:00
Tomás Longeri
7ecc947184
[Mosaic:TPU] Roll forward of cl/708011538 (expanded trunc support), minus changes in infer-vector-layout
...
We can enable them later but at least this way the support is available to build on
(e.g. in the new insert relayouts pass)
Reverts 05f3a701e769748ff1ec51d50324a3595c4aff0d
PiperOrigin-RevId: 708397219
2024-12-20 12:33:30 -08:00
Tomás Longeri
05f3a701e7
[Mosaic:TPU] Roll back cl/708011538 and cl/708112341
...
Reverts 307c8d3af81f16142fd4c64f501b05a5b69f815e
PiperOrigin-RevId: 708173083
2024-12-19 21:51:44 -08:00
Jevin Jiang
2faf540203
[Mosaic TPU] Add relayout-insertion pass and support bitwidth change for i1 vector relayout
...
We can use relayout-insertion pass to insert necessary ops and their layouts for relayout before unrolling in apply-vector-layout pass.
PiperOrigin-RevId: 708143852
2024-12-19 19:56:40 -08:00
Tomás Longeri
8b02884c3c
[Mosaic:TPU] Fix trunc infer rule after cl/708011538
...
For a 32-bit input with (1, 128) tiling, it assigns (1, 128) tiling to the output, which can be invalid (e.g. it should be (1, 256) for bf16).
PiperOrigin-RevId: 708112341
2024-12-19 18:14:12 -08:00
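The (1, 128) versus (1, 256) example in the commit above follows from keeping the number of bits per tile constant: 128 elements of 32 bits occupy the same space as 256 elements of 16 bits. The Python sketch below spells out that arithmetic. The function name is hypothetical, and applying the same rule to bitwidths other than 16 is an assumption, not something the commit message states.

```python
from typing import Tuple


def one_row_tile_shape(bitwidth: int, lanes_32bit: int = 128) -> Tuple[int, int]:
    """Returns the (1, N) tile shape holding as many bits as a
    (1, lanes_32bit) tile of 32-bit elements."""
    if bitwidth <= 0 or 32 % bitwidth != 0:
        raise ValueError(f"unsupported bitwidth: {bitwidth}")
    packing = 32 // bitwidth  # elements that fit in one 32-bit slot
    return (1, lanes_32bit * packing)
```

For example, `one_row_tile_shape(32)` gives `(1, 128)` and `one_row_tile_shape(16)` gives `(1, 256)`, matching the bf16 case named in the commit.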
Tzu-Wei Sung
60ebde89e6
[Mosaic] Extend macros to handle parentheses.
...
PiperOrigin-RevId: 708045694
2024-12-19 15:00:12 -08:00
Tzu-Wei Sung
77f3c114d0
[Mosaic] Remove TODOs that are already addressed or obsolete.
...
PiperOrigin-RevId: 708045439
2024-12-19 14:58:15 -08:00
Tomás Longeri
307c8d3af8
[Mosaic:TPU] For trunc, expand supported tilings, offsets and bitwidths
...
infer-vector-layout won't use the full generality anytime soon, but we could reuse this logic for relayouts
PiperOrigin-RevId: 708011538
2024-12-19 13:31:59 -08:00
Benjamin Chetioui
3915f4a147
[Mosaic GPU] Commit to using `Vector`s everywhere (and no `Tensor`s).
...
PiperOrigin-RevId: 707912637
2024-12-19 07:51:58 -08:00
Benjamin Chetioui
66ad2082ba
[Mosaic GPU] Replace the dialect's layout enum with layouts holding the proper sub-attributes.
...
PiperOrigin-RevId: 707846907
2024-12-19 02:59:26 -08:00
Tomás Longeri
8188c57475
[Mosaic:TPU][NFC] Small cleanup of extui rule in apply-vector-layout
...
Removed some duplicate variables, changed dyn_cast to cast, and used in/out consistently instead of source/dst
PiperOrigin-RevId: 707836363
2024-12-19 02:13:18 -08:00
Jevin Jiang
3a5c4da4ef
[Mosaic TPU] Support i32 vector multi reduction except cross lane.
...
PiperOrigin-RevId: 707708236
2024-12-18 16:49:07 -08:00
Naums Mogers
6bcec910f2
[Mosaic] Improve error verbosity of tpu.memref_slice verification
...
Breaks down the compound verification conditional into smaller checks with verbose error messages.
PiperOrigin-RevId: 707699990
2024-12-18 16:18:45 -08:00
Naums Mogers
de359f5ce0
[Mosaic] Verify that the target IDs are provided in remote DMAs
...
Adds an extra verification check. Since the source semaphore is used only for remote DMAs, we should check that device or core IDs are also provided when source semaphore is provided.
PiperOrigin-RevId: 707675228
2024-12-18 14:49:59 -08:00
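The check described in the commit above amounts to a simple implication: a source semaphore implies that at least one target ID is present. A minimal Python sketch follows, assuming hypothetical parameter names; the real verifier operates on the op's operands and is not shown in the commit message.

```python
from typing import Optional


def verify_dma_ids(
    has_source_semaphore: bool,
    device_id: Optional[int] = None,
    core_id: Optional[int] = None,
) -> None:
    """Raises ValueError if a source semaphore is given without any target ID."""
    if has_source_semaphore and device_id is None and core_id is None:
        raise ValueError(
            "source semaphore provided, but neither device ID nor core ID is set"
        )
```

A local DMA (no source semaphore) passes regardless of the IDs, so the check only constrains the remote case.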
Tomás Longeri
13e721a25e
[Mosaic:TPU][NFC] Delete unused functions
...
PiperOrigin-RevId: 707660214
2024-12-18 14:00:22 -08:00
Jevin Jiang
bf692efbfb
[Mosaic TPU] Support direct cast i8 vector to mask
...
PiperOrigin-RevId: 707617318
2024-12-18 11:35:14 -08:00
Adam Paszke
6edfe9eae5
[Mosaic TPU] Add support for bf16 second minor reductions in TPUv6
...
PiperOrigin-RevId: 707557416
2024-12-18 08:17:43 -08:00
Tomás Longeri
dc0b77470e
[Mosaic:TPU] Allow null parts for tpu.pack_subelements, meaning "don't care"
...
PiperOrigin-RevId: 707439259
2024-12-18 00:56:41 -08:00
Tomás Longeri
f9737b957e
[Mosaic:TPU] Fix bug after cl/707025084
...
`tile_masks` was updated to use implicit, but we skipped the reshape for `tiles`.
There seems to have been a bug even before cl/707025084: `tile_masks` was never reshaped, so if the shape was 1D and a store mask was specified, there would be a mismatch in dimensions.
PiperOrigin-RevId: 707368670
2024-12-17 20:31:29 -08:00
Adam Paszke
4911a396b2
[Mosaic TPU] Add support for the interleaved pack format to tpu.unpack_subelements
...
PiperOrigin-RevId: 707142562
2024-12-17 09:58:07 -08:00
Benjamin Chetioui
36b12d58f4
[Mosaic GPU] Add end-to-end lowering example for a pointwise kernel using the dialect and layout inference.
...
Also implement a lowering rule for `arith.AddFOp`.
PiperOrigin-RevId: 707131747
2024-12-17 09:28:05 -08:00