725 Commits

Author SHA1 Message Date
Fraser Cormack
f567c03221 [libclc] Disable external-calls testing for clspv targets (#127529)
These targets don't include all OpenCL builtins, so there will always be
external calls in the final bytecode module.

Fixes #127316.

(cherry picked from commit 9fec0a0942f5a11f4dcfec20aa485a8513661720)
2025-02-18 15:14:35 -08:00
Nikita Popov
e2426cd9e9 [libclc] Allow default path when looking for llvm-spirv (#126071)
This is an external tool, so I don't think there is an expectation that
it has to be in the LLVM tools bindir. It may also be in the default
system bindir (which is not necessarily the same).

(cherry picked from commit 26ecddb05d13c101ccd840a6710eb5f8b82de841)
2025-02-07 14:50:35 -08:00
Fraser Cormack
bb95335982 [libclc][NFC] Clang-format includes 2025-01-28 18:00:25 +00:00
Fraser Cormack
a8c82d5fde
[libclc] Optimize isfpclass-like CLC builtins (#124145)
The builtins we were using to implement __clc_is(finite|inf|nan|normal)
-- __builtin_isfinite, etc. -- don't take vector types so we were
previously scalarizing. The __builtin_isfpclass builtin does take vector
types and thus allows us to keep things in vectors.

There is no change in codegen to the scalar versions of any of these
builtins.
2025-01-28 16:23:52 +00:00
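As a rough illustration of the class-mask idea behind a8c82d5fde, the sketch below classifies binary32 values by bit pattern and tests each vector element against a mask of classes. The `FC_*` flag values and all function names are hypothetical and chosen for this example; they are not the encoding used by `__builtin_isfpclass`, and the results are plain booleans rather than OpenCL's vector truth values.

```python
import struct

# Hypothetical class flags for this sketch only (not the LLVM/Clang encoding).
FC_NAN, FC_INF, FC_NORMAL, FC_SUBNORMAL, FC_ZERO = 1, 2, 4, 8, 16


def fp_class(value):
    """Classify a value after rounding it to IEEE-754 binary32."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0xFF:
        return FC_NAN if mantissa else FC_INF
    if exponent == 0:
        return FC_SUBNORMAL if mantissa else FC_ZERO
    return FC_NORMAL


def is_fp_class(vec, mask):
    """Element-wise test: does each element belong to any class in mask?"""
    return [bool(fp_class(x) & mask) for x in vec]


def is_finite(vec):
    return is_fp_class(vec, FC_NORMAL | FC_SUBNORMAL | FC_ZERO)


def is_normal(vec):
    return is_fp_class(vec, FC_NORMAL)
```

Every predicate is the same element-wise test with a different mask, which is why one vector-capable builtin can replace several scalar-only ones.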
Romaric Jodin
9d8d538e40
libclc: clspv: add missing clc_isnan.cl dependency (#124614)
clc_isnan.cl is needed since
https://github.com/llvm/llvm-project/pull/124097
2025-01-28 14:47:08 +00:00
Fraser Cormack
78b5bb702f
[libclc][NFC] Move key math headers to CLC (#124739) 2025-01-28 14:17:23 +00:00
Fraser Cormack
cfc8ef0ad8
[libclc] Move copysign to CLC library; fix & optimize (#124598)
This commit moves the implementation of the copysign builtin to the CLC
library.

It simultaneously optimizes it for vector types by avoiding
scalarization. It does so by using the __builtin_elementwise_copysign
clang builtins, which can handle vector types.

It also fixes a bug in the half/fp16 implementation of the builtin. This
version was using an incorrect mask (0x7FFFF instead of 0x7FFF) and was
thus preserving the original sign bit, rather than masking it out.
2025-01-28 09:18:34 +00:00
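The mask bug described in cfc8ef0ad8 can be reproduced on raw binary16 bit patterns. This is a minimal sketch, assuming copysign is formed as "magnitude bits of x OR sign bit of y"; the exact expression in libclc is not shown in this log, and the helper names are hypothetical.

```python
import struct

SIGN_MASK = 0x8000       # bit 15 of an IEEE-754 binary16 value
MAGNITUDE_MASK = 0x7FFF  # the 15 exponent and mantissa bits


def half_bits(value):
    """Return the 16-bit pattern of a Python float rounded to binary16."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def copysign_bits(x, y, magnitude_mask=MAGNITUDE_MASK):
    """Combine the magnitude of x with the sign of y (both 16-bit patterns)."""
    return ((x & magnitude_mask) | (y & SIGN_MASK)) & 0xFFFF


neg_one = half_bits(-1.0)
pos_one = half_bits(1.0)

# Correct mask: the result takes its sign from y.
fixed = copysign_bits(neg_one, pos_one)
# Over-wide mask: bit 15 of x leaks through, so x keeps its own sign.
buggy = copysign_bits(neg_one, pos_one, magnitude_mask=0x7FFFF)
```

Because 0x7FFFF has bit 15 set, masking with it never clears the sign of `x`, which is the behaviour the commit message reports.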
Fraser Cormack
c3a0fcc982
[libclc] Optimize CLC vector any/all builtins (#124568)
By using the vector reduction builtins we can avoid scalarization.
Targets that don't support vector reductions will scalarize later on
anyway. The vector reduction builtins should be well-enough supported by
the middle-end to be a generic solution.

This produces conceptually equivalent code: all vector elements are
OR'd/AND'd together and the final scalar is bit-shifted and masked to
produce the final result.

The 'normalize' builtin uses 'all' so its code has similarly improved in
places.
2025-01-27 16:37:21 +00:00
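The reduce-then-shift scheme described in c3a0fcc982 can be sketched as follows. It assumes the OpenCL definition of `any`/`all` (a test of the most significant bit of each component); the function names and the default `bits` width are illustrative, not libclc identifiers.

```python
from functools import reduce
from operator import and_, or_


def msb(value, bits):
    """Shift the top bit down to bit 0 and mask everything else off."""
    return (value >> (bits - 1)) & 1


def any_msb(elements, bits=32):
    """1 if the most significant bit of any element is set, else 0."""
    mask = (1 << bits) - 1
    reduced = reduce(or_, (e & mask for e in elements), 0)
    return msb(reduced, bits)


def all_msb(elements, bits=32):
    """1 if the most significant bit of every element is set, else 0."""
    mask = (1 << bits) - 1
    reduced = reduce(and_, (e & mask for e in elements), mask)
    return msb(reduced, bits)
```

Each element is masked to `bits` first so that negative Python integers behave like fixed-width two's-complement values.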
Fraser Cormack
eaa5897534
[libclc] Optimize CLC vector is(un)ordered builtins (#124546)
These are similar to 347fb208, but these builtins are expressed in terms
of other builtins. The LLVM IR generated features the same fcmp ord/uno
comparisons as before, but consistently in vector form.
2025-01-27 14:41:40 +00:00
Fraser Cormack
347fb208c1
[libclc] Optimize CLC vector relational builtins (#124537)
Clang knows how to perform relational operations on OpenCL vectors, so
we don't need to use the Clang builtins. The builtins we were using
didn't support vector types, so we were previously scalarizing.

This commit generates the same LLVM fcmp operations as before, just
without the scalarization.
2025-01-27 13:25:37 +00:00
Fraser Cormack
9705500582
[libclc] Move nextafter to the CLC library (#124097)
There were two implementations of this - one that implemented nextafter
in software, and another that called a clang builtin. No in-tree targets
called the builtin, so all targets build the software version. The
builtin version has been removed, and the software version has been
renamed to be the "default".

This commit also optimizes nextafter, to avoid scalarization as much as
possible. Note however that the (CLC) relational builtins still
scalarize; those will be optimized in a separate commit.

Since nextafter is used by some convert_type builtins, the diff to IR
codegen is not limited to the builtin itself.
2025-01-23 12:24:16 +00:00
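For readers unfamiliar with a software nextafter (9705500582), here is a generic sketch on binary32 bit patterns. It is not a transcription of the libclc implementation; it only shows that stepping to the adjacent representable value amounts to incrementing or decrementing the integer bit pattern. It assumes its inputs are already representable in binary32.

```python
import math
import struct


def f32_bits(value):
    """Bit pattern of a value stored as IEEE-754 binary32."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_f32(bits):
    """Inverse of f32_bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def nextafter_f32(x, y):
    """Next binary32 value after x in the direction of y."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == y:
        return y
    if x == 0.0:
        # Step off zero to the smallest subnormal, signed toward y.
        return bits_f32(0x00000001 if y > 0 else 0x80000001)
    bits = f32_bits(x)
    # Moving away from zero increments the pattern; moving toward zero
    # decrements it. This holds for both signs (sign-magnitude layout).
    if (x < y) == (x > 0):
        bits += 1
    else:
        bits -= 1
    return bits_f32(bits)
```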
Fraser Cormack
9e0b2b68c2
[libclc] Don't rely on fp16 pragma guards in headers (#122751)
Having the fp16 pragmas enabled in the header file is risky. The macros
defined by that header don't (and can't) include the pragmas that make
fp16 types themselves legal, and another header may disable the fp16
pragma before the macro's use.

The safest thing to do is to use pragmas surrounding each use of the
macro in the implementation files. This pattern is also far more common
across the codebase.
2025-01-22 09:32:20 +00:00
Fraser Cormack
eaf3e1b0d1
[libclc] Route int bitselect through CLC; add half (#123653)
The half variants were missing. The integer bitselect builtins weren't
going through __clc_bitselect due to an oversight when the CLC version
was introduced.
2025-01-21 10:09:25 +00:00
Fraser Cormack
d96ec48068
[libclc] Route select through __clc_select (#123647)
This was missed during the introduction of select. This also unifies the
various .inc files used for each, as they were essentially identical.

The __clc_select function is now also built for SPIR-V targets.
2025-01-21 10:05:39 +00:00
Fraser Cormack
c8eb865747
[libclc] Move mad to the CLC library (#123607)
All targets build `__clc_mad` -- even SPIR-V targets -- since it
compiles to the optimal `llvm.fmuladd` intrinsic. There is no change to
the bytecode generated for non-SPIR-V targets.

The `mix` builtin, which is implemented as a wrapper around `mad`, is
left as an OpenCL-layer wrapper of `__clc_mad`. I don't know if it's
worth having a specific CLC version of `mix`.

The changes to the other CLC files/functions are moving uses of `mad` to
`__clc_mad`, and reformatting. There is an additional instance of
`trunc` becoming `__clc_trunc`, which was missed before.
2025-01-20 16:27:51 +00:00
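The relationship between `mix` and `mad` in c8eb865747 can be pictured with a small sketch. It assumes the OpenCL definition `mix(x, y, a) = x + (y - x) * a`; the argument order passed to `clc_mad` is an assumption, and the Python arithmetic is unfused, whereas `llvm.fmuladd` may or may not fuse the multiply and add.

```python
def clc_mad(a, b, c):
    """Stand-in for __clc_mad: a * b + c (unfused in this sketch)."""
    return a * b + c


def mix(x, y, a):
    """Linear blend expressed as a single multiply-add."""
    return clc_mad(y - x, a, x)
```

For example, `mix(2.0, 4.0, 0.5)` evaluates `(4.0 - 2.0) * 0.5 + 2.0`.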
Fraser Cormack
8b7bfb417a [libclc] Rename include guards. NFC. 2025-01-20 11:26:02 +00:00
Fraser Cormack
a90b5b1885
[libclc] Move degrees/radians to CLC library & optimize (#123222)
Missing half variants were also added.

The builtins are now consistently emitted in vector form (i.e., with a
splat of the literal to the appropriate vector size).
2025-01-17 12:11:53 +00:00
Fraser Cormack
b7e20147ad
[libclc] Move smoothstep to CLC and optimize its codegen (#123183)
This commit moves the implementation of the smoothstep function to the
CLC library, whilst optimizing the codegen.

This commit also adds support for 'half' versions of smoothstep, which
were previously missing.

The CLC smoothstep implementation now keeps everything in vectors,
rather than recursively splitting vectors by half down to the scalar
base form. This should result in more optimal codegen across the board.

This commit also removes some non-standard overloads of smoothstep with
mixed types, such as 'double smoothstep(float, float, float)'. There
aren't any mixed-(element )type versions of smoothstep as far as I can
see:

    gentype smoothstep(gentype edge0, gentype edge1, gentype x)
    gentypef smoothstep(float edge0, float edge1, gentypef x)
    gentyped smoothstep(double edge0, double edge1, gentyped x)
    gentypeh smoothstep(half edge0, half edge1, gentypeh x)

The CLC library only defines the first type, for simplicity; the OpenCL
layer is responsible for handling the scalar/scalar/vector forms. Note
that the scalar/scalar/vector forms now splat the scalars to the vector
type, rather than recursively split vectors as before. The macro that
used to 'vectorize' smoothstep in this way has been moved out of the
shared clcmacro.h header as it was only used for the smoothstep builtin.

Note that the CLC clamp function is now built for both SPIR-V targets.
This is to help build the CLC smoothstep function for the Mesa SPIR-V
target.
2025-01-16 11:44:09 +00:00
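The splat approach described in b7e20147ad can be sketched as below. It assumes the standard smoothstep definition, `t = clamp((x - edge0) / (edge1 - edge0), 0, 1)` followed by `t * t * (3 - 2 * t)`; vectors are modelled as Python lists and all function names are hypothetical.

```python
def clamp(value, low, high):
    return min(max(value, low), high)


def splat(scalar, width):
    """Broadcast a scalar to a vector of the given width."""
    return [scalar] * width


def smoothstep_vec(edge0, edge1, x):
    """All-vector form: the only form the CLC layer is said to define."""
    out = []
    for e0, e1, xi in zip(edge0, edge1, x):
        t = clamp((xi - e0) / (e1 - e0), 0.0, 1.0)
        out.append(t * t * (3.0 - 2.0 * t))
    return out


def smoothstep_scalar_edges(edge0, edge1, x):
    """Scalar/scalar/vector form: splat the edges, then reuse the vector form."""
    width = len(x)
    return smoothstep_vec(splat(edge0, width), splat(edge1, width), x)
```

The scalar-edge overload does no per-element work of its own, which mirrors the commit's point that the OpenCL layer only needs to widen its arguments before calling the single CLC form.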
Fraser Cormack
a5b88cb815
[libclc] Add missing includes to CLC headers (#118654)
There's no automatic way of checking these headers are self-contained.

Instead of including these common files many times across the whole
codebase, we can include them in the generic `gentype.inc` and
`floatn.inc` files which are included by most CLC headers.
2025-01-15 10:14:51 +00:00
David Spickett
efd929efa5
[libclc] Add Maintainers.md for libclc (#118309)
This adds a Maintainers.md file to libclc. Recently I needed to find a
libclc maintainer and I had no idea there was one listed in llvm/
instead of in libclc/.
2025-01-06 09:16:26 +00:00
Fraser Cormack
06789ccb16
[libclc] Optimize ceil/fabs/floor/rint/trunc (#119596)
These functions all map to the corresponding LLVM intrinsics, but the
vector intrinsics weren't being generated. The intrinsic mapping from
CLC vector function to vector intrinsic was working correctly, but the
mapping from OpenCL builtin to CLC function was suboptimally recursively
splitting vectors in halves.

For example, with this change, `ceil(float16)` calls `llvm.ceil.v16f32`
directly once optimizations are applied.

In addition, instead of generating LLVM intrinsics through `__asm`, we
now call clang elementwise builtins for each CLC builtin. This should be
a more standard way of achieving the same result.

The CLC versions of each of these builtins are also now built and
enabled for SPIR-V targets. The LLVM -> SPIR-V translator maps the
intrinsics to the appropriate OpExtInst, so there should be no
difference in semantics, despite the newly introduced indirection from
OpenCL builtin through the CLC builtin to the intrinsic.

The AMDGPU targets make use of the same `_CLC_DEFINE_UNARY_BUILTIN`
macro to override `sqrt`, so those functions also appear more optimal
with this change, calling the vector `llvm.sqrt.vXf32` intrinsics
directly.
2024-12-13 08:47:13 +00:00
Fraser Cormack
76befc86de
Reland "[libclc] Create aliases with custom_command (#115885)" (#116025)
This relands commit 2c980310f67c13dd89c8702d40abeab47a4a2b4b after
fixing an issue.
2024-11-13 11:44:21 +00:00
Sylvestre Ledru
2c980310f6 Revert "[libclc] Create aliases with custom_command (#115885)"
for causing: https://github.com/llvm/llvm-project/issues/115942

This reverts commit 584d1a632f3af0daca4db02f7f3b2c7f48ab0ddf.
2024-11-13 10:05:37 +01:00
Fraser Cormack
7387338007 [libclc] Add some include guards to CLC declarations. NFC 2024-11-12 17:25:40 +00:00
Fraser Cormack
584d1a632f
[libclc] Create aliases with custom_command (#115885)
This in conjunction with a custom target prevents them from being
rebuilt if there are no changes.
2024-11-12 17:22:30 +00:00
Fraser Cormack
a55248789e
[libclc] Avoid using undefined vector3 components (#115857)
Using '.hi' on a vector3 is technically allowed by the spec and is
treated as a 4-element vector with an "undefined" w component. However,
it's more undef/poison code for the compiler to process and remove. We
can easily avoid it with a dedicated macro.
2024-11-12 16:23:52 +00:00
Fraser Cormack
0d2ef7af19
[libclc] Use builtin_convertvector to convert between vector types (#115865)
This keeps values in vectors, rather than scalarizing them and then
reconstituting the vector. The builtin is identical to performing a
C-style cast on each element, which is what we were doing by recursively
splitting the vector down to calling the "base" conversion function on
each element.
2024-11-12 16:18:33 +00:00
Fraser Cormack
6ca50a2593 [libclc] Correct use of CLC macro on two definitions
_CLC_DECL is for declarations and _CLC_DEF for definitions, as the names
imply.

No change to any bitcode module.
2024-11-07 17:47:52 +00:00
Fraser Cormack
b231647475
[libclc] Move relational functions to the CLC library (#115171)
The OpenCL relational functions now call their CLC counterparts, and the
CLC relational functions are defined identically to how the OpenCL
functions were defined.

As usual, clspv and spir-v targets bypass these.

No observable changes to any libclc target (measured with llvm-diff).
2024-11-06 19:28:44 +00:00
Fraser Cormack
b4263ddbe7 [libclc] Use __clc_max in CLC functions 2024-11-06 09:16:36 +00:00
Fraser Cormack
7be30fd533 [libclc] Move abs/abs_diff to CLC library 2024-11-06 09:16:35 +00:00
Fraser Cormack
d2d1b5897e
[libclc] Move clcmacro.h to CLC library. NFC (#114845) 2024-11-04 22:00:01 +00:00
Fraser Cormack
293c78ba0a
[libclc] Move ceil/fabs/floor/rint/trunc to CLC library (#114774)
These functions are all mapped to LLVM intrinsics.

The clspv and spirv targets don't declare or define any of these CLC
functions, and instead map these to their corresponding OpenCL symbols.
2024-11-04 16:35:14 +00:00
Fraser Cormack
b4ef43fc75 [libclc] Format clc_fma.cl. NFC 2024-11-04 11:55:42 +00:00
Fraser Cormack
e28d7f7134 [libclc] Format clc_tan.cl. NFC 2024-11-04 10:52:46 +00:00
Fraser Cormack
f1888e4029 [libclc] Add some include guards and format a file 2024-11-04 10:37:11 +00:00
Fraser Cormack
d12a8da1de
[libclc] Move min/max/clamp into the CLC builtins library (#114386)
These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.

Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins library.

Preprocessor definitions for these targets have also been changed
somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.

There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
2024-10-31 16:45:37 +00:00
Fraser Cormack
86974e15f5 [libclc] Restore header order, which formatting broke 2024-10-31 10:33:47 +00:00
Fraser Cormack
fba9f05ff7 [libclc] Format clc_ldexp.cl and clc_hypot.cl. NFC 2024-10-31 10:18:29 +00:00
Fraser Cormack
b2bdd8bd39 [libclc] Create an internal 'clc' builtins library
Some libclc builtins currently use internal builtins prefixed with
'__clc_' for various reasons, e.g., to avoid naming clashes.

This commit formalizes this concept by starting to isolate the
definitions of these internal clc builtins into a separate
self-contained bytecode library, which is linked into each target's
libclc OpenCL builtins before optimization takes place.

The goal of this step is to allow additional libraries of builtins
that provide entry points (or bindings) that are not written in OpenCL C
but still wish to expose OpenCL-compatible builtins. By moving the
implementations into a separate self-contained library, entry points can
share as much code as possible without going through OpenCL C.

The overall structure of the internal clc library is similar to the
current OpenCL structure, with SOURCES files and targets being able to
override the definitions of builtins as needed. The idea is that the
OpenCL builtins will begin to need fewer target-specific overrides, as
those will slowly move over to the clc builtins instead.

Another advantage of having a separate bytecode library with the CLC
implementations is that we can internalize the symbols when linking it
(separately), whereas currently the CLC symbols make it into the final
builtins library (and perhaps even the final compiled binary).

This patch starts off with 'dot' as it's relatively self-contained, as
opposed to most of the maths builtins which tend to pull in other
builtins.

We can also start to clang-format the builtins as we go, which should
help to modernize the codebase.
2024-10-29 13:09:56 +00:00
Fraser Cormack
183b38eb22 [libclc] Split off library build system into helpers
This splits off several key parts of the build system into utility
methods. This will be used in upcoming patches to help provide
additional sets of target-specific builtin libraries.

Running llvm-diff on the resulting LLVM bytecode binaries, and regular
diff on SPIR-V binaries, shows no differences before and after this
patch.
2024-10-29 13:09:56 +00:00
Carl Ritson
076aac59ac
[AMDGPU] Add a new target for gfx1153 (#113138) 2024-10-23 12:56:58 +09:00
David Spickett
a4de127086
[libclc] Give a helpful error when an unknown target is requested (#111528)
I just tried using LLVM backend names here, e.g. NVPTX, but libclc wants
targets more like triples. This change adds a message to tell you that.

Before you got:
```
 libclc target 'AMDGCN' is enabled
 CMake Error at CMakeLists.txt:253 (list):
   list index: 1 out of range (-1, 0)

 CMake Error at CMakeLists.txt:254 (list):
   list index: 2 out of range (-1, 0)

 Configuring incomplete, errors occurred!
```
Now you get:

```
 CMake Error at CMakeLists.txt:145 (message):
   Unknown target in LIBCLC_TARGETS_TO_BUILD: "AMDGCN"

   Valid targets are:
   amdgcn--;amdgcn--amdhsa;clspv--;clspv64--;r600--;nvptx--;nvptx64--;nvptx--nvidiacl;nvptx64--nvidiacl;amdgcn-mesa-mesa3d
```
Some of the targets are dynamic based on what is installed, so spirv
isn't here for me because I don't have llvm-spirv installed yet.

So this is not perfect but it's an improvement on the current behaviour.
2024-10-09 09:13:26 +01:00
David Spickett
70e0a7e7e6
[libclc] Convert README to Markdown (#111549)
A bit nicer to read on GitHub and with clickable links.

No content changes, purely formatting.
2024-10-08 16:58:02 +01:00
David Spickett
64f7e1b697
[libclc] Update build instructions in readme (#111369)
The configure Python script was removed by
d6e0e6d255a7d54a3873b7a5d048eee00ef6bb6d /
https://reviews.llvm.org/D69966.

The readme was never updated with the cmake way to do it. I couldn't
find any dedicated buildbots for this so I'm making an educated guess.
This is what built locally for me.
2024-10-08 16:23:42 +01:00
David Spickett
0e8555d4db
[libclc] Remove mention of BSD license in readme (#111371)
This seems to be an artifact from the initial import in 2012, but even if
not, folks are better off reading the LICENSE.TXT file for the full
details if they need them.

Fixes #109968
2024-10-07 15:26:04 +01:00
Fraser Cormack
9f3728d157
[libclc] Fix installation w/ ENABLE_RUNTIME_SUBNORMAL (#109926)
The `ARCHIVE` artifact kind is not valid for `install(FILES ...)`.

Additionally, install wasn't resolving the target's `TARGET_FILE`
properly and was trying to find it in the top-level build directory, rather than
in the libclc binary directory. This is because our `TARGET_FILE`
properties were being set to relative paths. The cmake behaviour they
are trying to mimic - `$<TARGET_FILE:$tgt>` - provides an absolute path.

As such this patch updates instances where we set the `TARGET_FILE`
property to return an absolute path.
2024-09-30 10:48:30 +01:00
Harald van Dijk
903d1c6ee5
[libclc] More cross compilation fixes (#97811)
* Move the setup_host_tool calls to the directories of their tool.
Although it works to call it in libclc, it can only appear in a single
location so it fails the "what if everyone did this?" test and causes
problems for downstream code that also wants to use native versions of
these tools from other projects.
* Correct the TARGET "${${tool}_target}" check. "${${tool}_target}" may
be set to the path to the executable, which works in dependencies but
cannot be tested using if(TARGET). For lack of a better alternative,
just check that "${${tool}_target}" is non-empty and trust that if it
is, it is set to a meaningful value. If somehow it turns out to be a
valid target, its value will still show up in error messages anyway.
* Account for llvm-spirv possibly being provided in-tree. Per
https://github.com/KhronosGroup/SPIRV-LLVM-Translator?tab=readme-ov-file#llvm-in-tree-build
it is possible to drop llvm-spirv into LLVM and have it built as part of
LLVM's build. In this configuration, cross builds of LLVM require a
native version of llvm-spirv to be built.
2024-09-03 17:01:20 +01:00
Romaric Jodin
46223b5eae
libclc: add half version of 'sign' (#99841) 2024-07-22 11:08:56 +01:00
Romaric Jodin
d9cb65ff48
libclc: fix convert with half (#99481)
Fix following update of libclc introducing more fp16 support:
7e6a73959a
2024-07-18 15:28:58 +02:00