Mirror of https://github.com/ROCm/jax.git, synced 2025-04-24 03:56:07 +00:00

This should help with understanding cuTensorMapEncodeTiled failures, since CUDA doesn't provide any details beyond the error return code.

Note that this change also ensures that TMA descriptors are 64-byte aligned.

PiperOrigin-RevId: 656062820
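
As an illustration of the two ideas in this message (recording the arguments before the encode call, and checking 64-byte alignment of the descriptor), here is a minimal C++ sketch. It is not the code from this change and does not call the CUDA driver: `FakeTensorMap`, `IsAligned64`, `FormatDims`, and `DescribeEncodeArgs` are hypothetical names, and the 128-byte descriptor size is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a TMA descriptor; NOT the real CUtensorMap type.
// The 128-byte size is an assumption for illustration. alignas(64) makes the
// compiler place every instance on a 64-byte boundary.
struct alignas(64) FakeTensorMap {
  unsigned char bytes[128];
};

// Returns true if `ptr` sits on a 64-byte boundary.
inline bool IsAligned64(const void* ptr) {
  return reinterpret_cast<std::uintptr_t>(ptr) % 64 == 0;
}

// Renders a list of dimensions as "[a, b, c]" for a log line.
template <typename T>
std::string FormatDims(const std::vector<T>& dims) {
  std::ostringstream os;
  os << "[";
  for (std::size_t i = 0; i < dims.size(); ++i) {
    if (i != 0) os << ", ";
    os << dims[i];
  }
  os << "]";
  return os.str();
}

// Builds a one-line summary of the arguments that would be handed to the
// encode call, so a bare error code can be matched to concrete values.
// (Hypothetical helper; the real call takes more parameters than shown here.)
std::string DescribeEncodeArgs(const void* descriptor,
                               const std::vector<std::uint64_t>& global_dims,
                               const std::vector<std::uint32_t>& box_dims) {
  // Both lists must describe the same tensor rank.
  assert(global_dims.size() == box_dims.size());
  std::ostringstream os;
  os << "rank=" << global_dims.size()
     << " global_dims=" << FormatDims(global_dims)
     << " box_dims=" << FormatDims(box_dims)
     << " descriptor_aligned_64=" << (IsAligned64(descriptor) ? "yes" : "no");
  return os.str();
}
```

A caller would log the string returned by `DescribeEncodeArgs` immediately before the encode call, so that a failing return code can be read alongside the values that produced it.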