Mathew Odden 6b35155294
Fix invalid lowerings for ROCm in Pallas (#223)
popcount and clz were effectively broken on ROCm,
since math_dialect had incorrect lowerings.

Use the device intrinsics for these functions, as
well as for exp and absf, which fixes some accuracy issues in
the pallas tests.

Docs for OCML/OCKL

- https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/doc/OCML.md
- https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/doc/OCKL.md
2025-02-14 11:27:52 -06:00
..
2023-08-01 16:43:13 -07:00
2025-01-15 10:22:29 +00:00