rocm_jax/jax/experimental
Adam Paszke 21598d02e5 [Mosaic GPU] Add support for non-multicast .cta_group::2 async_copies
This instruction is particularly useful for collective MMA, since it lets us
easily report on the progress of async copies from both blocks in the single
block that will be performing the MMA.

PiperOrigin-RevId: 725618793
2025-02-11 07:13:35 -08:00