llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-18 13:27:00 +00:00

Author	SHA1	Message	Date
Wenju He	552902455c	[libclc] Add ctz built-in implementation to clc and generic (#135309 )	2025-04-15 15:23:25 +01:00
Wenju He	cbda72a547	[NFC][libclc] Merge atomic extension built-ins with identical name into a single file (#134489 ) llvm-diff shows there is no change to amdgcn--amdhsa.bc. Similar to how cl_khr_fp64 and cl_khr_fp16 implementations are put in the same file for math built-ins, this PR does the same for atom_* built-ins. The main motivation is to prevent two files with the same base name from implementing different built-ins. In a follow-up PR, I'd like to relax libclc_configure_lib_source to only compare filename instead of path for overriding, since in our downstream the same category of built-ins, e.g. math, are organized in several different folders.	2025-04-14 10:27:48 +01:00
Fraser Cormack	b0338c3d6c	[libclc] Move shuffle/shuffle2 to the CLC library (#135000 ) This commit moves the shuffle and shuffle2 builtins to the CLC library. In so doing it makes the headers simpler and re-usable for other builtin layers to hook into the CLC functions, if they wish. An additional gentype utility has been made available, which provides a consistent vector-size-or-1 macro for use. The existing __CLC_VECSIZE is defined but empty which is useful in certain applications, such as in concatenation with a type to make a correctly sized scalar or vector type. However, this isn't usable in the same preprocessor lines when wanting to check for specific vector sizes, as e.g., '__CLC_VECSIZE == 2' resolves to '== 2' which is invalid. In local testing this is also useful for the geometric builtins which are only available for scalar types and vector types of 2, 3, or 4 elements. No codegen changes are observed, except the internal shuffle/shuffle2 utility functions are no longer made publicly available.	2025-04-09 15:52:25 +01:00
Fraser Cormack	949bf518fc	[libclc][NFC] Fix up inconsistent copyright headers Some files were accidentally given two copyright headers. Another was missing one. This commit also converts that file's dos line endings to unix ones and reformats a comment.	2025-04-09 12:00:08 +01:00
Fraser Cormack	ddc48fefe3	[libclc] Move native_(exp10|powr|tan) to CLC library (#134080 ) These are the three remaining native builtins not yet ported. There are elementwise versions of exp10 and tan which correspond to the intrinsics, which may be preferable to the current versions which route through other native builtins. Those could be changed in a follow-up if desired.	2025-04-02 17:37:17 +01:00
Fraser Cormack	d51525ba36	[libclc] Move lgamma, lgamma_r & tgamma to CLC library (#134053 ) Also enable half-precision variants of tgamma, which were previously missing. Note that unlike recent work, these builtins are not vectorized as part of this commit. Ultimately all three call into lgamma_r, which has heavy control flow (including switch statements) that would be difficult to vectorize. Additionally the lgamma_r algorithm is copyrighted to SunPro so may need a rewrite in the future anyway. There are no codegen changes (to non-SPIR-V targets) with this commit, aside from the new half builtins.	2025-04-02 15:20:32 +01:00
Fraser Cormack	c1efd8b663	[libclc][NFC] Delete two unused headers These should have been deleted when the respective builtins were moved to the CLC library.	2025-04-01 14:54:50 +01:00
Fraser Cormack	7a2b160e76	[libclc] Move rootn to the CLC library; optimize (#133735 ) The function was already nominally in the CLC namespace; this commit just moves it over. This commit also vectorizes the builtin to avoid scalarization.	2025-04-01 09:19:50 +01:00
Fraser Cormack	b52977b868	[libclc] Move pow, powr & pown to the CLC library (#133294 ) These functions were already nominally in the CLC library. Similar to others, these builtins are now vectorized and are not broken down into scalar types.	2025-03-28 08:23:24 +00:00
Fraser Cormack	d32e71d7c7	[libclc] Move fmod, remainder & remquo to the CLC library (#132054 ) These functions were already nominally in the CLC namespace; this commit just formally moves them over. Note that 'half' versions of these CLC functions are now provided. Previously the corresponding OpenCL builtins would forward directly to the 'float' versions of the CLC builtins. Now the OpenCL builtins call the 'half' CLC builtins, which themselves call the 'float' CLC versions. This keeps the interface between the OpenCL and CLC libraries neater and keeps the CLC library self-contained. No changes to the generated code for non-SPIR-V targets are observed.	2025-03-27 14:53:19 +00:00
Fraser Cormack	7d048674a4	[libclc] Add license headers to files missing them (#132239 ) This commit bulk updates all '.h', '.cl', '.inc', and '.cpp' files to add any missing license headers. The remaining files are generally CMake, SOURCES, scripts, markdown, etc. There are still some '.ll' files which may benefit from a license header. I can't find an example of an LLVM IR file with a license header in the rest of LLVM, but unlike most other (sub)projects, libclc has examples of LLVM IR as source files, compiled and built into the library.	2025-03-24 10:10:38 +00:00
Fraser Cormack	82912fd620	[libclc] Update license headers (#132070 ) This commit bulk-updates the libclc license headers to the current Apache-2.0 WITH LLVM-exception license in situations where they were previously attributed to AMD - and occasionally under an additional single individual contributor - under an MIT license. AMD signed the LLVM relicensing agreement and so agreed for their past contributions to be placed under the new LLVM license. The LLVM project has also had a long-standing, unwritten policy of not adding copyright notices to source code. This policy was recently written up [1]. This commit therefore also removes these copyright notices at the same time. Note that there are outstanding copyright notices attributed to others - and many files missing copyright headers - which will be dealt with in future work. [1] https://llvm.org/docs/DeveloperPolicy.html#embedded-copyright-or-contributed-by-statements	2025-03-20 11:40:09 +00:00
Fraser Cormack	e5d5503e4e	[libclc] Move hypot to CLC library; optimize (#129551 ) This was already nominally in the CLC library; this commit just formally moves it over. It simultaneously optimizes it for vector types by avoiding scalarization.	2025-03-04 14:16:16 +00:00
Fraser Cormack	285b411e46	[libclc] Move sqrt to CLC library (#128748 ) This is fairly straightforward for most targets. We use the element-wise sqrt builtin by default. We also remove a legacy pre-filtering of the input argument, which the intrinsic now officially handles. AMDGPU provides its own implementation of sqrt for double types. This commit moves this into the implementation of CLC sqrt. It uses weak linkage on the 'default' CLC sqrt to allow AMDGPU to only override the builtin for the types it cares about.	2025-02-27 12:30:24 +00:00
Fraser Cormack	d5038b3774	[libclc] Move __clc_ldexp to CLC library (#126078 ) This function was already conceptually in the CLC namespace - this just formally moves it over. Note however that this commit marks a change in how libclc functions may be overridden by targets. Until now we have been using a purely build-system-based approach where targets could register identically-named files which took responsibility for the implementation of the builtin in its entirety. This system wasn't well equipped to deal with AMD's overriding of __clc_ldexp for only a subset of types, and furthermore conditionally on a pre-defined macro. One option for handling this would be to require AMD to duplicate code for the versions of __clc_ldexp it's not interested in overriding. We could also make it easier for targets to re-define CLC functions through macros or .inc files. Both of these have obvious downsides. We could also keep AMD's overriding in the OpenCL layer and bypass CLC altogether, but this has limited use. We could use weak linkage on the "base" implementations of CLC functions, and allow targets to opt-in to providing their own implementations on a much finer granularity. This commit supports this as a proof of concept; we could expand it to all CLC builtins if accepted. Note that the existing filename-based "claiming" approach is still in effect, so targets have to name their overrides differently to have both files compiled. This could also be refined.	2025-02-26 11:20:25 +00:00
Fraser Cormack	1e0e4169dd	[libclc][NFC] Remove unused intrinsics helpers (#128708 ) We want to move away from using asm declarations to define builtins.	2025-02-25 14:29:35 +00:00
Fraser Cormack	f8948d3c47	[libclc] Move log/log2/log10 to CLC library (#128540 ) This commit also enables fp16 log, which was previously missing. Other than that, no changes to codegen for AMDGPU/Nvidia targets. Note that for simplicity this commit doesn't try to refactor or optimize the implementations. Notably, each log is only implemented for scalar types; vector types are scalarized. It doesn't look too difficult to make the implementations suitable for vector codegen, so I'll try that in a future commit. There's also an unused implementation of log in clc_log_base.h, whereas the implementation currently used by libclc targets re-uses log2 with an additional multiplication. That should also be cleaned up as on first inspection it looks a more optimal implementation, though it would have to be checked against the OpenCL CTS for good measure.	2025-02-25 11:44:59 +00:00
Fraser Cormack	e7ad07ffb8	[libclc] Move fma to the CLC library (#126052 ) This builtin is a little more involved than others as targets deal with fma in various different ways. Fundamentally, the CLC __clc_fma builtin compiles to __builtin_elementwise_fma, which compiles to the @llvm.fma intrinsic. However, in the case of fp32 fma some targets call the __clc_sw_fma function, which provides a software implementation of the builtin. This in principle is controlled by the __CLC_HAVE_HW_FMA32 macro and may be a runtime decision, depending on how the target defines that macro. All targets build the CLC fma functions for all types. This is so the CLC library can have a reliable internal implementation for its own purposes. For AMD/NVPTX targets there are no meaningful changes to the generated LLVM bytecode. Some blocks of code have moved around, which confounds llvm-diff. For the clspv and SPIR-V/Mesa targets, only fp32 fma is of interest. Its use in libclc is tightly controlled by checking __CLC_HAVE_HW_FMA32 first. This can either be a compile-time constant (1, for clspv) or a runtime function for SPIR-V/Mesa. The SPIR-V/Mesa target only provided fp32 fma in the OpenCL layer. It unconditionally mapped that to the __clc_sw_fma builtin, even though the generic version in theory had a runtime toggle through __CLC_HAVE_HW_FMA32 specifically for that target. Callers of fma, though, would end up using the ExtInst fma, not calling the _Z3fmafff function provided by libclc. This commit keeps this system in place in the OpenCL layer, by mapping fma to __clc_sw_fma. Where other builtins would previously call fma (i.e., result in the ExtInst), they now call __clc_fma. This function checks the __CLC_HAVE_HW_FMA32 runtime toggle, which selects between the slow version or the quick version. The quick version is the LLVM fma intrinsic which llvm-spirv translates to the ExtInst. The clspv target had its own software implementation of fp32 fma, which it called unconditionally - even though __CLC_HAVE_HW_FMA32 is 1 for that target. This is potentially just so its library ships a software version which it can fall back on. In the OpenCL layer, the target doesn't provide fp64 fma, and maps fp16 fma to fp32 mad. This commit keeps this system roughly in place: in the OpenCL layer it maps fp32 fma to __clc_sw_fma, and fp16 fma to mad. Where builtins would previously call into fma, they now call __clc_fma, which compiles to the LLVM intrinsic. If this goes through a translation to SPIR-V it will become the fma ExtInst, or the intrinsic could be replaced by the _Z3fmafff software implementation. The clspv and SPIR-V/Mesa targets could potentially be cleaned up later, depending on their needs.	2025-02-24 10:10:51 +00:00
Fraser Cormack	ae5785460d	[libclc] Define macros for users of gentype.inc (#128012 ) Several users of (mostly math/) gentype.inc rely on types other than the 'gentype'. This is commonly intN as several maths builtins expose this as a return or parameter type. We were previously explicitly defining this type for every gentype. Other implementations rely on integer types of the same size and element width as the gentype, such as short/ushort for half, long/ulong for double, etc. Users might also rely on as_type or convert_type builtins to/from these types. The previous method we used to define intN was unscalable if we wanted to expose more types and helpers. This commit introduces a simpler system whereby several macros are defined at the beginning of gentype.inc. These rely on concatenating with the vector size. To facilitate this system, scalar gentypes now define an empty vector size. It was previously undefined, which was dangerous. An added benefit is that it matches how the integer gentype.inc vector size has been working. These macros will be especially helpful for the definitions of logb/ilogb in an upcoming patch.	2025-02-20 15:24:04 +00:00
Fraser Cormack	079115e6ea	[libclc] Move modf to the CLC library (#127828 ) The "generic" unary_(def|decl)_with_ptr files are intended to be re-used by the sincos and fract builtins in the future as they share an identical type signature.	2025-02-20 08:36:46 +00:00
Fraser Cormack	25c0554166	[libclc] Move conversion builtins to the CLC library (#124727 ) This commit moves the implementations of conversion builtins to the CLC library. It keeps the dichotomy of regular vs. clspv implementations of the conversions. However, for the sake of a consistent interface all CLC conversion routines are built, even the ones that clspv opts out of in the user-facing OpenCL layer. It simultaneously updates the python script to use f-strings for formatting.	2025-02-12 08:55:02 +00:00
Fraser Cormack	64735ad639	[libclc] Move sign to the CLC builtins library (#115699 ) This commit moves the sign builtin's implementation to the CLC library. It simultaneously optimizes it (for vector types) by removing control-flow from the implementation. The __CLC_INTERNAL preprocessor definition has been repurposed (without the leading underscores) to be passed when building the internal CLC library. It was only used in one other place to guard an extra maths preprocessor definition, which we can do unconditionally.	2025-02-11 11:14:49 +00:00
Fraser Cormack	7441e87fe0	[libclc] Move several integer functions to CLC library (#116786 ) This commit moves over the OpenCL clz, hadd, mad24, mad_hi, mul24, mul_hi, popcount, rhadd, and upsample builtins to the CLC library. This commit also optimizes the vector forms of the mul_hi and upsample builtins to consistently remain in vector types, instead of recursively splitting vectors down to the scalar form. The OpenCL mad_hi builtin wasn't previously publicly available from the CLC libraries, as it was hash-defined to mul_hi in the header files. That issue has been fixed, and mad_hi is now exposed. The custom AMD implementation/workaround for popcount has been removed as it was only required for clang < 7. There are still two integer functions which haven't been moved over. The OpenCL mad_sat builtin uses many of the other integer builtins, and would benefit from optimization for vector types. That can take place in a follow-up commit. The rotate builtin could similarly use some more dedicated focus, potentially using clang builtins.	2025-01-29 13:45:33 +00:00
Fraser Cormack	78b5bb702f	[libclc][NFC] Move key math headers to CLC (#124739 )	2025-01-28 14:17:23 +00:00
Fraser Cormack	9705500582	[libclc] Move nextafter to the CLC library (#124097 ) There were two implementations of this - one that implemented nextafter in software, and another that called a clang builtin. No in-tree targets called the builtin, so all targets build the software version. The builtin version has been removed, and the software version has been renamed to be the "default". This commit also optimizes nextafter, to avoid scalarization as much as possible. Note however that the (CLC) relational builtins still scalarize; those will be optimized in a separate commit. Since nextafter is used by some convert_type builtins, the diff to IR codegen is not limited to the builtin itself.	2025-01-23 12:24:16 +00:00
Fraser Cormack	d96ec48068	[libclc] Route select through __clc_select (#123647 ) This was missed during the introduction of select. This also unifies the various .inc files used for each, as they were essentially identical. The __clc_select function is now also built for SPIR-V targets.	2025-01-21 10:05:39 +00:00
Fraser Cormack	c8eb865747	[libclc] Move mad to the CLC library (#123607 ) All targets build `__clc_mad` -- even SPIR-V targets -- since it compiles to the optimal `llvm.fmuladd` intrinsic. There is no change to the bytecode generated for non-SPIR-V targets. The `mix` builtin, which is implemented as a wrapper around `mad`, is left as an OpenCL-layer wrapper of `__clc_mad`. I don't know if it's worth having a specific CLC version of `mix`. The changes to the other CLC files/functions are moving uses of `mad` to `__clc_mad`, and reformatting. There is an additional instance of `trunc` becoming `__clc_trunc`, which was missed before.	2025-01-20 16:27:51 +00:00
Fraser Cormack	a5b88cb815	[libclc] Add missing includes to CLC headers (#118654 ) There's no automatic way of checking these headers are self-contained. Instead of including these common files many times across the whole codebase, we can include them in the generic `gentype.inc` and `floatn.inc` files which are included by most CLC headers.	2025-01-15 10:14:51 +00:00
Fraser Cormack	b231647475	[libclc] Move relational functions to the CLC library (#115171 ) The OpenCL relational functions now call their CLC counterparts, and the CLC relational functions are defined identically to how the OpenCL functions were defined. As usual, clspv and spir-v targets bypass these. No observable changes to any libclc target (measured with llvm-diff).	2024-11-06 19:28:44 +00:00
Fraser Cormack	d2d1b5897e	[libclc] Move clcmacro.h to CLC library. NFC (#114845 )	2024-11-04 22:00:01 +00:00
Fraser Cormack	293c78ba0a	[libclc] Move ceil/fabs/floor/rint/trunc to CLC library (#114774 ) These functions are all mapped to LLVM intrinsics. The clspv and spirv targets don't declare or define any of these CLC functions, and instead map these to their corresponding OpenCL symbols.	2024-11-04 16:35:14 +00:00
Fraser Cormack	f1888e4029	[libclc] Add some include guards and format a file	2024-11-04 10:37:11 +00:00
Fraser Cormack	d12a8da1de	[libclc] Move min/max/clamp into the CLC builtins library (#114386 ) These functions are "shared" between integer and floating-point types, hence the directory name. They are used in several CLC internal functions such as __clc_ldexp. Note that clspv and spirv targets don't want to define these functions, so pre-processor macros replace calls to __clc_min with regular min, for example. This means they can use as much of the generic CLC source files as possible, but where CLC functions would usually call out to an external __clc_min symbol, they call out to an external min symbol. Then they opt out of defining __clc_min itself in their CLC builtins library. Preprocessor definitions for these targets have also been changed somewhat: what used to be CLC_SPIRV (the 32-bit target) is now CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or CLC_SPIRV64. Same goes for CLC_CLSPV. There are no differences (measured with llvm-diff) in any of the final builtins libraries for nvptx, amdgpu, or clspv. Neither are there differences in the SPIR-V targets' LLVM IR before it's actually lowered to SPIR-V.	2024-10-31 16:45:37 +00:00
Fraser Cormack	b2bdd8bd39	[libclc] Create an internal 'clc' builtins library Some libclc builtins currently use internal builtins prefixed with '__clc_' for various reasons, e.g., to avoid naming clashes. This commit formalizes this concept by starting to isolate the definitions of these internal clc builtins into a separate self-contained bytecode library, which is linked into each target's libclc OpenCL builtins before optimization takes place. The goal of this step is to allow additional libraries of builtins that provide entry points (or bindings) that are not written in OpenCL C but still wish to expose OpenCL-compatible builtins. By moving the implementations into a separate self-contained library, entry points can share as much code as possible without going through OpenCL C. The overall structure of the internal clc library is similar to the current OpenCL structure, with SOURCES files and targets being able to override the definitions of builtins as needed. The idea is that the OpenCL builtins will begin to need fewer target-specific overrides, as those will slowly move over to the clc builtins instead. Another advantage of having a separate bytecode library with the CLC implementations is that we can internalize the symbols when linking it (separately), whereas currently the CLC symbols make it into the final builtins library (and perhaps even the final compiled binary). This patch starts off with 'dot' as it's relatively self-contained, as opposed to most of the maths builtins which tend to pull in other builtins. We can also start to clang-format the builtins as we go, which should help to modernize the codebase.	2024-10-29 13:09:56 +00:00
Romaric Jodin	d9cb65ff48	libclc: fix convert with half (#99481 ) Fix following update of libclc introducing more fp16 support: `7e6a73959a`	2024-07-18 15:28:58 +02:00
Romaric Jodin	7e6a73959a	libclc: increase fp16 support (#98149 ) Increase fp16 support to allow clspv to continue to be OpenCL compliant following the update of the OpenCL-CTS adding more testing on math functions and conversions with half. Math functions are implemented by upscaling to fp32 and using the fp32 implementation. This guarantees the accuracy required for half-precision floating-point by the CTS.	2024-07-18 12:00:41 +01:00
Romaric Jodin	932ca85680	libclc: remove __attribute__((assume)) for clspv targets (#92126 ) Instead add a proper attribute in clang, and convert it to function metadata to keep the information in the IR. The goal is to remove the dependency on __attribute__((assume)) that should not have been there in the first place. Ref https://github.com/llvm/llvm-project/pull/84934	2024-05-17 06:13:32 -07:00
Kévin Petit	21508fa769	libclc: clspv: fix fma, add vstore and fix inlining issues https://reviews.llvm.org/D147773 Patch by Romaric Jodin <rjodin@google.com>	2023-05-09 16:52:13 +01:00
Kévin Petit	1da2085a51	libclc: add clspv to targets exempt from alwaysinline https://reviews.llvm.org/D132362 Patch by: Aaron Greig <aaron.greig@codeplay.com>	2023-02-14 18:26:42 +00:00
Dave Airlie	c37145cab1	libclc: Add Mesa/SPIR-V target Add targets to emit SPIR-V targeted to Mesa's OpenCL support, using SPIR-V 1.1. Substantially based on Dave Airlie's earlier work. libclc: spirv: remove step/smoothstep apis not defined for SPIR-V libclc: disable inlines for SPIR-V builds Reviewed By: jvesely, tstellar, jenatali Differential Revision: https://reviews.llvm.org/D77589	2020-08-17 14:01:46 -07:00
Daniel Stone	3d21fa56f5	libclc: Make all built-ins overloadable The SPIR spec states that all OpenCL built-in functions should be overloadable and mangled, to ensure consistency. Add the overload attribute to functions which were missing them: work dimensions, memory barriers and fences, and events. Reviewed By: tstellar, jenatali Differential Revision: https://reviews.llvm.org/D82078	2020-08-17 13:55:48 -07:00
Boris Brezillon	3a7051d9c2	libclc: Fix FP_ILOGBNAN definition Fix FP_ILOGBNAN definition to match the opencl-c-base.h one and guarantee that FP_ILOGBNAN and FP_ILOGB0 are different. Doing that implies fixing ilogb() implementation to return the right value. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed By: jvesely Differential Revision: https://reviews.llvm.org/D83473	2020-08-17 13:45:43 -07:00
Jan Vesely	4b23a2e8e9	libclc: Move rsqrt implementation to a .cl file Reviewer: awatry Differential Revision: https://reviews.llvm.org/D74013	2020-02-09 14:42:09 -05:00
Jan Vesely	4a725996e5	sincos: Simplify declaration headers. This follows the same pattern as modf and fract. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356028	2019-03-13 07:13:34 +00:00
Jan Vesely	e7c0c37a31	fdim: Use binary_decl_tt.inc instead of custom inc file. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356027	2019-03-13 07:13:32 +00:00
Jan Vesely	5b0600c277	nextafter: Use binary_decl_tt.inc instead of custom inc file. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356026	2019-03-13 07:13:30 +00:00
Jan Vesely	e438b58cd0	copysign: Use binary_decl_tt.inc instead of custom inc file. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356025	2019-03-13 07:13:28 +00:00
Jan Vesely	81bc9ee81c	atan2pi: Use binary_decl_tt.inc instead of custom inc file. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356024	2019-03-13 07:13:26 +00:00
Jan Vesely	9526e02021	atan2: Use binary_decl_tt.inc instead of custom inc file. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356023	2019-03-13 07:13:24 +00:00
Jan Vesely	8985c9c212	hypot: Use binary_decl_tt.inc instead of custom inc file Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356022	2019-03-13 07:13:22 +00:00


370 Commits
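
Commit b0338c3d6c above notes that an empty vector-size macro works for token concatenation but cannot be used in a preprocessor comparison, which is why a "vector-size-or-1" macro was added. The following is a minimal sketch of that distinction in plain C, not the libclc source: every `DEMO_`-prefixed name is hypothetical and stands in for the real macros, and only the scalar case is shown.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the macros the commit message describes.
   Scalar case: the vector-size macro is defined but empty, while the
   "or 1" variant always expands to a number. */
#define DEMO_VECSIZE
#define DEMO_VECSIZE_OR_1 1

/* Two-level concatenation so the arguments are expanded before pasting. */
#define DEMO_CAT_(a, b) a##b
#define DEMO_CAT(a, b) DEMO_CAT_(a, b)

/* Pasting a type name with the empty macro yields the scalar type: int. */
typedef DEMO_CAT(int, DEMO_VECSIZE) demo_int_t;

/* '#if DEMO_VECSIZE == 2' would expand to '#if == 2', which is invalid.
   The "or 1" variant can be compared because it is always a number. */
#if DEMO_VECSIZE_OR_1 == 2 || DEMO_VECSIZE_OR_1 == 3 || DEMO_VECSIZE_OR_1 == 4
#define DEMO_IS_SMALL_VECTOR 1
#else
#define DEMO_IS_SMALL_VECTOR 0
#endif

/* Small accessors so the results of the preprocessing can be inspected. */
int demo_lane_count(void) { return DEMO_VECSIZE_OR_1; }
int demo_is_small_vector(void) { return DEMO_IS_SMALL_VECTOR; }
size_t demo_scalar_size(void) { return sizeof(demo_int_t); }
```

Under these assumptions, a vector instantiation would define both macros to the same number (for example 4), so the concatenation would name a 4-element vector type and the `#if` would take the first branch.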