This adds a Maintainers.md files to libclc. Recently I needed to find a
libclc maintainer and I had no idea there was one listed in llvm/
instead of in libclc/.
These functions all map to the corresponding LLVM intrinsics, but the
vector intrinsics weren't being generated. The intrinsic mapping from
CLC vector function to vector intrinsic was working correctly, but the
mapping from OpenCL builtin to CLC function was suboptimally recursively
splitting vectors in halves.
For example, with this change, `ceil(float16)` calls `llvm.ceil.v16f32`
directly once optimizations are applied.
Now also, instead of generating LLVM intrinsics through `__asm` we now
call clang elementwise builtins for each CLC builtin. This should be a
more standard way of achieving the same result
The CLC versions of each of these builtins are also now built and
enabled for SPIR-V targets. The LLVM -> SPIR-V translator maps the
intrinsics to the appropriate OpExtInst, so there should be no
difference in semantics, despite the newly introduced indirection from
OpenCL builtin through the CLC builtin to the intrinsic.
The AMDGPU targets make use of the same `_CLC_DEFINE_UNARY_BUILTIN`
macro to override `sqrt`, so those functions also appear more optimal
with this change, calling the vector `llvm.sqrt.vXf32` intrinsics
directly.
Using '.hi' on a vector3 is technically allowed by the spec and is
treated as a 4-element vector with an "undefined" w component. However,
it's more undef/poison code for the compiler to process and remove. We
can easily avoid it with a dedicated macro.
This keeps values in vectors, rather than scalarizing them and then
reconstituting the vector. The builtin is identical to performing a
C-style cast on each element, which is what we were doing by recursively
splitting the vector down to calling the "base" conversion function on
each element.
The OpenCL relational functions now call their CLC counterparts, and the
CLC relational functions are defined identically to how the OpenCL
functions were defined.
As usual, clspv and spir-v targets bypass these.
No observable changes to any libclc target (measured with llvm-diff).
These functions are all mapped to LLVM intrinsics.
The clspv and spirv targets don't declare or define any of these CLC
functions, and instead map these to their corresponding OpenCL symbols.
These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.
Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins library.
Preprocessor definitions for these targets have also been changed
somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.
There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
Some libclc builtins currently use internal builtins prefixed with
'__clc_' for various reasons, e.g., to avoid naming clashes.
This commit formalizes this concept by starting to isolate the
definitions of these internal clc builtins into a separate
self-contained bytecode library, which is linked into each target's
libclc OpenCL builtins before optimization takes place.
The goal of this step is to allow additional libraries of builtins
that provide entry points (or bindings) that are not written in OpenCL C
but still wish to expose OpenCL-compatible builtins. By moving the
implementations into a separate self-contained library, entry points can
share as much code as possible without going through OpenCL C.
The overall structure of the internal clc library is similar to the
current OpenCL structure, with SOURCES files and targets being able to
override the definitions of builtins as needed. The idea is that the
OpenCL builtins will begin to need fewer target-specific overrides, as
those will slowly move over to the clc builtins instead.
Another advantage of having a separate bytecode library with the CLC
implementations is that we can internalize the symbols when linking it
(separately), whereas currently the CLC symbols make it into the final
builtins library (and perhaps even the final compiled binary).
This patch starts of with 'dot' as it's relatively self-contained, as
opposed to most of the maths builtins which tend to pull in other
builtins.
We can also start to clang-format the builtins as we go, which should
help to modernize the codebase.
This splits off several key parts of the build system into utility
methods. This will be used in upcoming patches to help provide
additional sets of target-specific builtin libraries.
Running llvm-diff on the resulting LLVM bytecode binaries, and regular
diff on SPIR-V binaries, shows no differences before and after this
patch.
I just tried using LLVM backend names here e.g. NVPTX but libclc want's
targets more like triples. This change adds a mesasge to tell you that.
Before you got:
```
libclc target 'AMDGCN' is enabled
CMake Error at CMakeLists.txt:253 (list):
list index: 1 out of range (-1, 0)
CMake Error at CMakeLists.txt:254 (list):
list index: 2 out of range (-1, 0)
Configuring incomplete, errors occurred!
```
Now you get:
```
CMake Error at CMakeLists.txt:145 (message):
Unknown target in LIBCLC_TARGETS_TO_BUILD: "AMDGCN"
Valid targets are:
amdgcn--;amdgcn--amdhsa;clspv--;clspv64--;r600--;nvptx--;nvptx64--;nvptx--nvidiacl;nvptx64--nvidiacl;amdgcn-mesa-mesa3d
```
Some of the targets are dynamic based on what is installed, so spirv
isn't here for me because I don't have llvm-spirv installed yet.
So this is not perfect but it's an improvement on the current behaviour.
The configure Python script was removed by
d6e0e6d255a7d54a3873b7a5d048eee00ef6bb6d /
https://reviews.llvm.org/D69966.
The readme was never updated with the cmake way to do it. I couldn't
find any dedicated buildbots for this so I'm making an educated guess.
This is what built locally for me.
This seems to be an artifact from the intial import in 2012, but even if
not, folks are better off reading the LICENSE.TXT file for the full
details if they need them.
Fixes#109968
The `ARCHIVE` artifact kind is not valid for `install(FILES ...)`.
Additionally, install wasn't resolving the target's `TARGET_FILE`
properly and was trying to find it in the top-level build directory, rather than
in the libclc binary directory. This is because our `TARGET_FILE`
properties were being set to relative paths. The cmake behaviour they
are trying to mimic - `$<TARGET_FILE:$tgt>` - provides an absolute path.
As such this patch updates instances where we set the `TARGET_FILE`
property to return an absolute path.
* Move the setup_host_tool calls to the directories of their tool.
Although it works to call it in libclc, it can only appear in a single
location so it fails the "what if everyone did this?" test and causes
problems for downstream code that also wants to use native versions of
these tools from other projects.
* Correct the TARGET "${${tool}_target}" check. "${${tool}_target}" may
be set to the path to the executable, which works in dependencies but
cannot be tested using if(TARGET). For lack of a better alternative,
just check that "${${tool}_target}" is non-empty and trust that if it
is, it is set to a meaningful value. If somehow it turns out to be a
valid target, its value will still show up in error messages anyway.
* Account for llvm-spirv possibly being provided in-tree. Per
https://github.com/KhronosGroup/SPIRV-LLVM-Translator?tab=readme-ov-file#llvm-in-tree-build
it is possible to drop llvm-spirv into LLVM and have it built as part of
LLVM's build. In this configuration, cross builds of LLVM require a
native version of llvm-spirv to be built.
Increase fp16 support to allow clspv to continue to be OpenCL compliant
following the update of the OpenCL-CTS adding more testing on math
functions and conversions with half.
Math functions are implemented by upscaling to fp32 and using the fp32
implementation. It garantees the accuracy required for half-precision
float-point by the CTS.
Reviewers of #89153 suggested to break up the patch into per-subproject
patches. This is the libclc part. See #89153 for the entire series and
motivation.
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
When performing cross in-tree builds, we need native versions of various
tools, we cannot assume the cross builds that are part of the current
build are executable. LLVM provides the setup_host_tool function to
handle this, either picking up versions of tools from
LLVM_NATIVE_TOOL_DIR, or implicitly building native versions as needed.
Use it for libclc too.
LLVM's setup_host_tool function assumes the project is LLVM, so this
also needs libclc's project() to be conditional on it being built
standalone. Luckily, the only change this needs is using
CMAKE_CURRENT_SOURCE_DIR instead of PROJECT_SOURCE_DIR.
With the Makefile generator and particularly high build parallelism some
intermediate dependencies may be generated redundantly and concurrently,
leading to build failures.
To fix this, arrange for libclc's add_custom_commands to depend on
targets in addition to files.
This follows CMake documentation's[^1] guidance on add_custom_command:
> Do not list the output in more than one independent target that may
> build in parallel or the instances of the rule may conflict. Instead,
> use the add_custom_target() command to drive the command and make the
> other targets depend on that one.
Eliminating the redundant commands also improves build times.
[^1]: https://cmake.org/cmake/help/v3.29/command/add_custom_command.html
Instead add a proper attribute in clang, and add convert it to function
metadata to keep the information in the IR. The goal is to remove the
dependency on __attribute__((assume)) that should have not be there in
the first place.
Ref https://github.com/llvm/llvm-project/pull/84934
We've recently seen the libclc llvm-link invocations become so long that
they exceed the character limits on certain platforms.
Using a 'response file' should solve this by offloading the list of
inputs into a separate file, and using special syntax to pass it to
llvm-link. Note that neither the response file nor syntax aren't
specific to Windows but we restrict it to that platform regardless. We
have the option of expanding it to other platforms in the future.
Commit #87622 broke the build. Ninja was happy with creating the output
directories as necessary, but Unix Makefiles isn't. Ensure they are
always created.
Fixes#88626.
Building the libclc project in-tree with debug tools can be very slow.
This commit adds an option for a user to specify a dierctory from which
to import (e.g., release-built) tools. All tools required by the project
must be imported from the same location, for simplicity.
Original patch downstream authored by @jchlanda.
The previous build system was adding custom "OpenCL" and "LLVM IR"
languages in CMake to build the builtin libraries. This was making it
harder to build in-tree because the tool binaries needed to be present
at configure time.
This commit refactors the build system to use custom commands to build
the bytecode files one by one, and link them all together into the final
bytecode library. It also enables in-tree builds by aliasing the
clang/llvm-link/etc. tool targets to internal targets, which are
imported from the LLVM installation directory when building out of tree.
Diffing (with llvm-diff) all of the final bytecode libraries in an
out-of-tree configuration against those built using the current tip
system shows no changes. Note that there are textual changes to metadata
IDs which confuse regular diff, and that llvm-diff 14 and below may show
false-positives.
This commit also removes a file listed in one of the SOURCEs which
didn't exist and which was preventing the use of
ENABLE_RUNTIME_SUBNORMAL when configuring CMake.