mirror of https://github.com/ROCm/jax.git synced 2025-04-22 13:46:08 +00:00
Adam Paszke 8da93249d2 [Mosaic GPU] Fuse slicing into s4 -> bf16 upcasts
This allows us to significantly simplify the generated PTX/SASS,
which is currently cluttered by LLVM's attempts to align slices to
start at bit 0 and its failure to CSE the right shifts.

PiperOrigin-RevId: 737967890
2025-03-18 05:38:49 -07:00