mirror of https://github.com/llvm/llvm-project.git synced 2025-04-18 12:56:50 +00:00

22 Commits

Author SHA1 Message Date
Fraser Cormack
7d32d72f10 [libclc][NFC] Remove blank line at end of file 2025-04-10 10:02:51 +01:00
Romaric Jodin
135a7874dc
libclc: clspv: fma: remove fp16 implementation ()
clspv is already handling generation of fp16. This implementation is
preventing clspv from making the best choice to use an emulation on top
of fp32-fma, or the native fp16-fma, depending on the command-line
arguments.
2025-04-10 10:01:57 +01:00
Fraser Cormack
f14ff59da7
[libclc] Move exp, exp2 and expm1 to the CLC library ()
These all share the use of a common helper function so are handled in
one go. These builtins are also now vectorized.
2025-04-01 18:15:37 +01:00
Fraser Cormack
bcf0f8d8aa
[libclc] Move exp10 to the CLC library ()
The builtin was already nominally in the CLC library; this commit just
moves it over. It also vectorizes the builtin on its way.
2025-04-01 14:39:17 +01:00
Fraser Cormack
13a313fe58
[libclc] Move sinpi/cospi/tanpi to the CLC library ()
Additionally, these builtins are now vectorized.

This also moves the native_recip and native_divide builtins as they are
used by the tanpi builtin.
2025-04-01 12:03:21 +01:00
Fraser Cormack
7a2b160e76
[libclc] Move rootn to the CLC library; optimize ()
The function was already nominally in the CLC namespace; this commit
just moves it over.

This commit also vectorizes the builtin to avoid scalarization.
2025-04-01 09:19:50 +01:00
Fraser Cormack
b52977b868
[libclc] Move pow, powr & pown to the CLC library ()
These functions were already nominally in the CLC library.

Similar to others, these builtins are now vectorized and are not broken
down into scalar types.
2025-03-28 08:23:24 +00:00
Fraser Cormack
d32e71d7c7
[libclc] Move fmod, remainder & remquo to the CLC library ()
These functions were already nominally in the CLC namespace; this commit
just formally moves them over.

Note that 'half' versions of these CLC functions are now provided.
Previously the corresponding OpenCL builtins would forward directly to
the 'float' versions of the CLC builtins. Now the OpenCL builtins call
the 'half' CLC builtins, which themselves call the 'float' CLC versions.
This keeps the interface between the OpenCL and CLC libraries neater and
keeps the CLC library self-contained.

No changes to the generated code for non-SPIR-V targets are observed.
2025-03-27 14:53:19 +00:00
Fraser Cormack
7d048674a4
[libclc] Add license headers to files missing them ()
This commit bulk updates all '.h', '.cl', '.inc', and '.cpp' files to
add any missing license headers.

The remaining files are generally CMake, SOURCES, scripts, markdown,
etc.

There are still some '.ll' files which may benefit from a license
header. I can't find an example of an LLVM IR file with a license header
in the rest of LLVM, but unlike most other (sub)projects, libclc has
examples of LLVM IR as source files, compiled and built into the
library.
2025-03-24 10:10:38 +00:00
Fraser Cormack
82912fd620
[libclc] Update license headers ()
This commit bulk-updates the libclc license headers to the current
Apache-2.0 WITH LLVM-exception license in situations where they were
previously attributed to AMD - and occasionally under an additional
single individual contributor - under an MIT license.

AMD signed the LLVM relicensing agreement and so agreed to relicense
their past contributions under the new LLVM license.

The LLVM project also has had a long-standing, unwritten, policy of not
adding copyright notices to source code. This policy was recently
written up [1]. This commit therefore also removes these copyright
notices at the same time.

Note that there are outstanding copyright notices attributed to others -
and many files missing copyright headers - which will be dealt with in
future work.

[1]
https://llvm.org/docs/DeveloperPolicy.html#embedded-copyright-or-contributed-by-statements
2025-03-20 11:40:09 +00:00
Fraser Cormack
e5d5503e4e
[libclc] Move hypot to CLC library; optimize ()
This was already nominally in the CLC library; this commit just formally
moves it over. It simultaneously optimizes it for vector types by
avoiding scalarization.
2025-03-04 14:16:16 +00:00
Fraser Cormack
d5038b3774
[libclc] Move __clc_ldexp to CLC library ()
This function was already conceptually in the CLC namespace - this just
formally moves it over.

Note however that this commit marks a change in how libclc functions may
be overridden by targets.

Until now we have been using a purely build-system-based approach where
targets could register identically-named files which took responsibility
for the implementation of the builtin in its entirety.

This system wasn't well equipped to deal with AMD's overriding of
__clc_ldexp for only a subset of types, and furthermore conditionally on
a pre-defined macro.

One option for handling this would be to require AMD to duplicate code
for the versions of __clc_ldexp it's *not* interested in overriding. We
could also make it easier for targets to re-define CLC functions through
macros or .inc files. Both of these have obvious downsides. We could
also keep AMD's overriding in the OpenCL layer and bypass CLC
altogether, but this has limited use.

We could use weak linkage on the "base" implementations of CLC
functions, and allow targets to opt-in to providing their own
implementations on a much finer granularity. This commit supports this
as a proof of concept; we could expand it to all CLC builtins if
accepted.

Note that the existing filename-based "claiming" approach is still in
effect, so targets have to name their overrides differently to have both
files compiled. This could also be refined.
2025-02-26 11:20:25 +00:00
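
The weak-linkage override described in the __clc_ldexp commit above can be sketched in plain C with the GCC/Clang `weak` attribute. This is only an illustration under stated assumptions: `demo_ldexp` and its loop-based body are hypothetical and are not libclc's code, and the target override is shown as a comment because a strong definition has to live in a separate translation unit.

```c
/* base.c: generic ("base") implementation.
 * Marking it weak lets a strong definition of the same symbol, provided by
 * a target in another object file, replace it at link time.
 * demo_ldexp is a hypothetical name used only for this sketch. */
__attribute__((weak)) float demo_ldexp(float x, int n) {
  /* Naive scaling by powers of two; enough to show the linkage idea. */
  while (n > 0) { x *= 2.0f; --n; }
  while (n < 0) { x *= 0.5f; ++n; }
  return x;
}

/* target.c (separate file, hypothetical): a target that wants its own
 * version defines the same symbol WITHOUT the weak attribute:
 *
 *   float demo_ldexp(float x, int n) { return target_specific(x, n); }
 *
 * When both objects are linked, the strong definition wins. If the target
 * provides nothing, the weak base definition above is used. */
```

Because the choice is made per symbol rather than per file, a target could override only the overloads it cares about and inherit the rest.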
Fraser Cormack
e7ad07ffb8
[libclc] Move fma to the CLC library ()
This builtin is a little more involved than others as targets deal with
fma in various different ways.

Fundamentally, the CLC __clc_fma builtin compiles to
__builtin_elementwise_fma, which compiles to the @llvm.fma intrinsic.
However, in the case of fp32 fma some targets call the __clc_sw_fma
function, which provides a software implementation of the builtin. This
in principle is controlled by the __CLC_HAVE_HW_FMA32 macro and may be a
runtime decision, depending on how the target defines that macro.

All targets build the CLC fma functions for all types. This is so the
CLC library can have a reliable internal implementation for its own
purposes.

For AMD/NVPTX targets there are no meaningful changes to the generated
LLVM bitcode. Some blocks of code have moved around, which confounds
llvm-diff.

For the clspv and SPIR-V/Mesa targets, only fp32 fma is of interest. Its
use in libclc is tightly controlled by checking __CLC_HAVE_HW_FMA32
first. This can either be a compile-time constant (1, for clspv) or a
runtime function for SPIR-V/Mesa.

The SPIR-V/Mesa target only provided fp32 fma in the OpenCL layer. It
unconditionally mapped that to the __clc_sw_fma builtin, even though the
generic version in theory had a runtime toggle through
__CLC_HAVE_HW_FMA32 specifically for that target. Callers of fma,
though, would end up using the ExtInst fma, *not* calling the _Z3fmafff
function provided by libclc.

This commit keeps this system in place in the OpenCL layer, by mapping
fma to __clc_sw_fma. Where other builtins would previously call fma
(i.e., result in the ExtInst), they now call __clc_fma. This function
checks the __CLC_HAVE_HW_FMA32 runtime toggle, which selects between the
slow version or the quick version. The quick version is the LLVM fma
intrinsic which llvm-spirv translates to the ExtInst.

The clspv target had its own software implementation of fp32 fma, which
it called unconditionally - even though __CLC_HAVE_HW_FMA32 is 1 for
that target. This is potentially just so its library ships a software
version which it can fall back on. In the OpenCL layer, the target
doesn't provide fp64 fma, and maps fp16 fma to fp32 mad.

This commit keeps this system roughly in place: in the OpenCL layer it
maps fp32 fma to __clc_sw_fma, and fp16 fma to mad. Where builtins would
previously call into fma, they now call __clc_fma, which compiles to the
LLVM intrinsic. If this goes through a translation to SPIR-V it will
become the fma ExtInst, or the intrinsic could be replaced by the
_Z3fmafff software implementation.

The clspv and SPIR-V/Mesa targets could potentially be cleaned up later,
depending on their needs.
2025-02-24 10:10:51 +00:00
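
The selection between a hardware and a software fp32 fma described in the fma commit above can be sketched in plain C as follows. Every name here (`DEMO_HAVE_HW_FMA32`, `demo_fma`, `demo_hw_fma`, `demo_sw_fma`) is hypothetical and merely mirrors the roles of __CLC_HAVE_HW_FMA32, __clc_fma, the LLVM intrinsic and __clc_sw_fma. The arithmetic in both paths is a placeholder, so this shows only the control flow, not a real fma implementation.

```c
/* Runtime toggle, standing in for a target that defines the macro as a
 * function call. A target with a compile-time answer could instead use
 *   #define DEMO_HAVE_HW_FMA32() (1)
 * All names in this sketch are hypothetical. */
int demo_hw_fma32_available = 0;
int demo_have_hw_fma32(void) { return demo_hw_fma32_available; }
#define DEMO_HAVE_HW_FMA32() demo_have_hw_fma32()

typedef enum { DEMO_PATH_HW, DEMO_PATH_SW } demo_path;
demo_path demo_last_path; /* records which path ran, for illustration */

/* Placeholder for the "quick" path (the hardware fma in the real library).
 * The double-based arithmetic is NOT a correctly rounded fma in all cases. */
float demo_hw_fma(float a, float b, float c) {
  demo_last_path = DEMO_PATH_HW;
  return (float)((double)a * (double)b + (double)c);
}

/* Placeholder for the "slow" software path. */
float demo_sw_fma(float a, float b, float c) {
  demo_last_path = DEMO_PATH_SW;
  return (float)((double)a * (double)b + (double)c);
}

/* Internal entry point: check the toggle first, then dispatch. */
float demo_fma(float a, float b, float c) {
  if (DEMO_HAVE_HW_FMA32())
    return demo_hw_fma(a, b, c);
  return demo_sw_fma(a, b, c);
}
```

When the macro is a compile-time constant, the compiler can fold the branch away; when it is a function call, the decision is deferred to run time.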
Fraser Cormack
ae5785460d
[libclc] Define macros for users of gentype.inc ()
Several users of (mostly math/) gentype.inc rely on types other than the
'gentype'. This is commonly intN as several maths builtins expose this
as a return or parameter type. We were previously explicitly defining
this type for every gentype.

Other implementations rely on integer types of the same size and element
width as the gentype, such as short/ushort for half, long/ulong for
double, etc.

Users might also rely on as_type or convert_type builtins to/from these
types.

The previous method we used to define intN was unscalable if we wanted
to expose more types and helpers.

This commit introduces a simpler system whereby several macros are
defined at the beginning of gentype.inc. These rely on concatenating
with the vector size. To facilitate this system, scalar gentypes now
define an empty vector size. It was previously undefined, which was
dangerous. An added benefit is that it matches how the integer
gentype.inc vector size has been working.

These macros will be especially helpful for the definitions of
logb/ilogb in an upcoming patch.
2025-02-20 15:24:04 +00:00
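
The idea from the gentype.inc commit above, deriving type names by concatenating with the vector size and giving scalars an empty size, can be sketched with the plain C preprocessor. The macro names (`DEMO_CONCAT`, `DEMO_VECSIZE`) and typedef names are hypothetical, and `int4` is declared here with the GCC/Clang vector extension only because plain C lacks the OpenCL vector types.

```c
/* Stand-in for OpenCL's int4, using the GCC/Clang vector extension. */
typedef int int4 __attribute__((vector_size(4 * sizeof(int))));

/* Two-level concatenation so that macro arguments are expanded first. */
#define DEMO_XCONCAT(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_XCONCAT(a, b)

/* "Scalar pass": the vector size is defined but empty, so
 * int ## <nothing> yields plain int. Leaving it undefined instead would
 * paste the literal token DEMO_VECSIZE and produce a bogus type name. */
#define DEMO_VECSIZE
typedef DEMO_CONCAT(int, DEMO_VECSIZE) demo_scalar_intn; /* -> int */
#undef DEMO_VECSIZE

/* "Vector pass": the same line now yields int4. */
#define DEMO_VECSIZE 4
typedef DEMO_CONCAT(int, DEMO_VECSIZE) demo_vec4_intn; /* -> int4 */
#undef DEMO_VECSIZE
```

The same pasting pattern would extend to helper names such as conversion builtins, which is presumably what makes the approach scale better than one explicit definition per gentype.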
Fraser Cormack
78b5bb702f
[libclc][NFC] Move key math headers to CLC () 2025-01-28 14:17:23 +00:00
Fraser Cormack
9705500582
[libclc] Move nextafter to the CLC library ()
There were two implementations of this - one that implemented nextafter
in software, and another that called a clang builtin. No in-tree targets
called the builtin, so all targets build the software version. The
builtin version has been removed, and the software version has been
renamed to be the "default".

This commit also optimizes nextafter, to avoid scalarization as much as
possible. Note however that the (CLC) relational builtins still
scalarize; those will be optimized in a separate commit.

Since nextafter is used by some convert_type builtins, the diff to IR
codegen is not limited to the builtin itself.
2025-01-23 12:24:16 +00:00
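
For readers unfamiliar with what a software nextafter involves, as mentioned in the nextafter commit above, here is a minimal scalar sketch in plain C. It is not libclc's implementation: `demo_nextafter` is a hypothetical name, it assumes IEEE-754 binary32 `float`, and it handles a single scalar rather than the vector types the commit optimizes for.

```c
#include <stdint.h>
#include <string.h>

/* Returns the next representable float after x in the direction of y.
 * Assumes float is IEEE-754 binary32. Hypothetical helper for illustration. */
float demo_nextafter(float x, float y) {
  uint32_t bits;

  if (x != x || y != y)
    return x + y; /* propagate NaN */
  if (x == y)
    return y;

  memcpy(&bits, &x, sizeof bits);
  if (x == 0.0f) {
    /* Step off zero to the smallest subnormal carrying the sign of y. */
    bits = (y > 0.0f) ? 0x00000001u : 0x80000001u;
  } else if ((x < y) == (x > 0.0f)) {
    bits += 1; /* moving away from zero: magnitude grows by one ulp */
  } else {
    bits -= 1; /* moving toward zero: magnitude shrinks by one ulp */
  }
  memcpy(&x, &bits, sizeof x);
  return x;
}
```

The trick relies on the fact that, for a fixed sign, the integer ordering of the bit patterns matches the ordering of the magnitudes, so a single integer increment or decrement steps to the neighbouring value.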
Fraser Cormack
d2d1b5897e
[libclc] Move clcmacro.h to CLC library. NFC () 2024-11-04 22:00:01 +00:00
Romaric Jodin
7e6a73959a
libclc: increase fp16 support ()
Increase fp16 support to allow clspv to continue to be OpenCL compliant
following the update of the OpenCL-CTS adding more testing on math
functions and conversions with half.

Math functions are implemented by upscaling to fp32 and using the fp32
implementation. It guarantees the accuracy required for half-precision
floating-point by the CTS.
2024-07-18 12:00:41 +01:00
Kévin Petit
21508fa769 libclc: clspv: fix fma, add vstore and fix inlining issues
https://reviews.llvm.org/D147773

Patch by Romaric Jodin <rjodin@google.com>
2023-05-09 16:52:13 +01:00
Kévin Petit
f6cd46e07f libclc: add more generic implementations to clspv SOURCES
https://reviews.llvm.org/D134887

Patch by: Aaron Greig <aaron.greig@codeplay.com>
2023-02-14 18:11:01 +00:00
Kévin Petit
f11ab8353f libclc: remove sqrt/rsqrt from clspv SOURCES
https://reviews.llvm.org/D134040

Patch by: Aaron Greig <aaron.greig@codeplay.com>
2023-02-13 21:27:40 +00:00
Alan Baker
21427b8eb8 libclc: Add clspv target to libclc
Add clspv as a new target for libclc. clspv is an open-source compiler that compiles OpenCL C to Vulkan SPIR-V. It compiles for the spir target.

The clspv target differs from the spirv target in the following ways:
* fma is modified to use uint2 instead of ulong for mantissas. This results in lower performance fma, but provides an implementation that can be used on more Vulkan devices where 64-bit integer support is less common.
* Use of a software implementation of nextafter because the generic implementation depends on nextafter being a defined builtin function for which clspv has no definition.
* Full optimization of the library (-O3) and no conversion to SPIR-V

This library is close to what would be produced by running opt -O3 < builtins.opt.spirv-mesa3d-.bc > builtins.opt.clspv--.bc and continuing the build from that point.

Reviewer: jvesely

Differential Revision: https://reviews.llvm.org/D94013
2021-03-04 00:19:10 -05:00
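
The first bullet of the clspv commit above mentions using uint2 instead of ulong for mantissas. As a rough illustration of what 64-bit arithmetic on a pair of 32-bit words looks like, here is a plain-C sketch. The struct `demo_uint2` and both helpers are hypothetical stand-ins, not the code used by the clspv fma.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, standing in for OpenCL's uint2. */
typedef struct {
  uint32_t lo;
  uint32_t hi;
} demo_uint2;

/* 64-bit addition with manual carry propagation. */
demo_uint2 demo_add64(demo_uint2 a, demo_uint2 b) {
  demo_uint2 r;
  r.lo = a.lo + b.lo;
  r.hi = a.hi + b.hi + (r.lo < a.lo); /* carry if the low word wrapped */
  return r;
}

/* Full 32x32 -> 64-bit product built from 16-bit halves, so that no
 * intermediate value needs more than 32 bits. */
demo_uint2 demo_mul32x32(uint32_t a, uint32_t b) {
  uint32_t al = a & 0xffffu, ah = a >> 16;
  uint32_t bl = b & 0xffffu, bh = b >> 16;
  uint32_t ll = al * bl;
  uint32_t lh = al * bh;
  uint32_t hl = ah * bl;
  uint32_t hh = ah * bh;
  /* Middle column: at most 3 * 0xffff, which fits easily in 32 bits. */
  uint32_t mid = (ll >> 16) + (lh & 0xffffu) + (hl & 0xffffu);
  demo_uint2 r;
  r.lo = (ll & 0xffffu) | (mid << 16);
  r.hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
  return r;
}
```

Each 64-bit operation becomes several 32-bit ones, which is consistent with the commit's remark that this trades performance for wider device support.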