mirror of
https://github.com/llvm/llvm-project.git
synced 2025-04-17 00:16:38 +00:00

This commit also enables fp16 log, which was previously missing. Other than that, there are no changes to codegen for AMDGPU/Nvidia targets.

Note that for simplicity this commit doesn't try to refactor or optimize the implementations. Notably, each log is only implemented for scalar types; vector types are scalarized. It doesn't look too difficult to make the implementations suitable for vector codegen, so I'll try that in a future commit.

There's also an unused implementation of log in clc_log_base.h, whereas the implementation currently used by libclc targets re-uses log2 with an additional multiplication. That should also be cleaned up, as on first inspection it looks like a more optimal implementation, though it would have to be checked against the OpenCL CTS for good measure.