131 Commits

Author SHA1 Message Date
Nick Desaulniers
631a6e0004
[libc][wchar] implement wcslen (#124150)
Update string_utils' string_length to work with char* or wchar_t*, so that it
may be reusable when implementing wmemchr, wcspbrk, wcsrchr, wcsstr.
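
As a rough illustration of the idea, a length helper can be templated on the character type so that the same loop serves both `char*` and `wchar_t*`. This is only a sketch: the name `wcslen_sketch` and the exact shape of `string_length` are assumptions here, and the real helper in string_utils may differ.

```cpp
#include <cstddef>

// Hypothetical sketch, not the actual string_utils code: a single length
// routine parameterized on the character type.
template <typename CharT>
constexpr std::size_t string_length(const CharT *src) {
  std::size_t length = 0;
  // The terminator is CharT(0) for both narrow and wide strings.
  while (src[length] != CharT(0))
    ++length;
  return length;
}

// A wcslen-style entrypoint (hypothetical name) is then a thin wrapper.
constexpr std::size_t wcslen_sketch(const wchar_t *src) {
  return string_length<wchar_t>(src);
}
```

Other wide-character routines could reuse the same template instead of duplicating the loop per character type.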

Link: #121183
Link: #124027

Co-authored-by: Nick Desaulniers <ndesaulniers@google.com>

---------

Co-authored-by: Tristan Ross <tristan.ross@midstall.com>
2025-01-23 13:33:04 -08:00
Joseph Huber
f4bab06c97
[libc] Split AMDGPU and NVPTX configs into separate folders (#120153)
Summary:
This is a holdover from when these targets were merged. They're
basically the same but there's no reason they should be treated as
identical. I think we will live with a little duplication.
2025-01-06 13:49:19 -06:00
Joseph Huber
3bbd53ce97
[libc] Remove old RPC host call extension for GPU (#120800)
Summary:
This was originally a hacked together function that served to just
implement some features for OpenMP. That has been moved into OpenMP
itself now that we have exported RPC properly. This can now be deleted.
2024-12-20 19:34:05 -06:00
Joseph Huber
958de20b30
[libc] Enable 'timespec_get' for the GPU build (#120304)
Summary:
Currently fails to build libc++ because this is missing.
2024-12-17 15:31:07 -06:00
Nick Desaulniers
431ea2d076
[libc] move bcmp, bzero, bcopy, index, rindex, strcasecmp, strncasecmp to strings.h (#118899)
docgen relies on the convention that we have a file foo.cpp in
libc/src/<header>/. Because the above functions weren't in libc/src/strings/
but rather libc/src/string/, docgen could not find that we had implemented
these.

Rather than add special carve outs to docgen, let's fix up our sources for
these 7 functions to stick with the existing conventions the rest of the
codebase follows.

Link: #118860
Fixes: #118875
2024-12-10 08:58:45 -08:00
Nick Desaulniers
e0ae7793fc
[libc] delete hdrgen (#117220)
Thanks to the effort of @RoseZhang03 and @aaryanshukla under the guidance of
@michaelrj-google and @amykhuang, we now have newhdrgen and no longer have a
dependency on TableGen and thus LLVM in order to start bootstrapping a full
build.

This PR removes:
- LIBC_HDRGEN_EXE; the in tree newhdrgen is the only hdrgen that can be used.
- LIBC_USE_NEW_HEADER_GEN; newhdrgen is the default and only option.
- LIBC_HDRGEN_ONLY; there is no need to have a distinct build step for old
  hdrgen.
- libc-api-test and libc-api-test-tidy build targets.
- Deletes all .td files.

It does not rename newhdrgen to just hdrgen. Will follow up with a distinct PR
for that.

Link: #117209
Link: #117254
Fixes: #117208
2024-12-03 12:34:26 -08:00
OverMighty
d97f6d1ae9
[libc][math][c23] Add sqrtf16 C23 math function (#112406)
Part of #95250.
2024-10-19 01:41:52 +02:00
OverMighty
69d3a44ede
[libc][math][c23] Add log10f16 C23 math function (#106091)
Part of #95250.
2024-10-19 01:40:40 +02:00
OverMighty
6d347fdfbd
[libc][math][c23] Add log2f16 C23 math function (#106084)
Part of #95250.
2024-10-19 01:10:32 +02:00
OverMighty
65cf7afb6d
[libc][math][c23] Add logf16 C23 math function (#106072)
Part of #95250.
2024-10-18 22:35:12 +02:00
OverMighty
fdd7c0353f
[libc][math][c23] Add tanhf16 C23 math function (#106006)
Part of #95250.
2024-10-18 14:22:45 +02:00
OverMighty
ed3d051782
[libc][math][c23] Add sinhf16 and coshf16 C23 math functions (#105947)
Part of #95250.
2024-10-17 20:44:23 +02:00
OverMighty
95c24cb9de
[libc][math][c23] Add exp10m1f16 C23 math function (#105706)
Part of #95250.
2024-10-16 16:33:13 +02:00
Joseph Huber
fe6a3d46aa
[libc] Implement the 'rename' function on the GPU (#109814)
Summary:
Straightforward implementation like the other `stdio.h` functions.
2024-09-24 09:32:42 -07:00
Joseph Huber
3bbe0f90f3
[libc] Add 'strings.h' header on the GPU (#109661)
Summary:
These are GNU extensions but still show up. The entrypoints were enabled
but we weren't emitting the header, so they couldn't be used.
2024-09-23 14:19:33 -07:00
Joseph Huber
16d11e26f3
[libc] Add GPU support for the 'system' function (#109687)
Summary:
This function can easily be implemented by forwarding it to the host
process. This shows up in a few places that we might want to test the
GPU so it should be provided. Also, I find the idea of the GPU
offloading work to the CPU via `system` very funny.
2024-09-23 14:04:28 -07:00
Michael Jones
f009f72df5
[libc] Add printf strerror conversion (%m) (#105891)
This patch adds the %m conversion to printf, which prints the
strerror(errno). An explanation of why is below. This patch also updates
the docs, tests, and build system to accommodate this.

The standard for syslog in POSIX specifies it uses the same format as
printf, but adds %m which prints the error message string for the
current value of errno. For ease of implementation, it's standard
practice for libc implementers to just add %m to printf instead of
creating a separate parser for syslog.
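
To make the behavior concrete, here is a minimal sketch of what `%m` expansion amounts to. The helper `expand_percent_m` is hypothetical and handles only `%m` and `%%`; it is not the LLVM libc printf parser, which deals with full conversion specifications.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper for illustration only: replaces "%m" with the message
// for the current errno and "%%" with a literal percent sign.
std::string expand_percent_m(const char *format) {
  // Capture errno first so later library calls cannot clobber it.
  const int saved_errno = errno;
  std::string out;
  for (const char *p = format; *p != '\0'; ++p) {
    if (p[0] == '%' && p[1] == 'm') {
      out += std::strerror(saved_errno);
      ++p; // skip the 'm'
    } else if (p[0] == '%' && p[1] == '%') {
      out += '%';
      ++p; // skip the second '%'
    } else {
      out += *p;
    }
  }
  return out;
}
```

With `errno` set to `ENOENT`, `expand_percent_m("open failed: %m")` yields `"open failed: "` followed by whatever `strerror(ENOENT)` returns on the host.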
2024-09-19 10:48:08 -07:00
Joseph Huber
5c019bdb7a
[libc] Add support for 'string.h' locale variants (#105719)
Summary:
This adds the locale variants of the string functions. As previously,
these do not use the locale information at all and simply copy the
non-locale version which expects the "C" locale.
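
The pattern described here can be sketched as follows. The names `strcoll_sketch`, `strcoll_l_sketch`, and `locale_handle` are placeholders chosen for this example, and the assumption that "C" locale collation reduces to a byte-wise comparison is stated explicitly rather than taken from the patch.

```cpp
#include <cstring>

// Placeholder for locale_t so the sketch stays portable (assumption).
struct locale_stub;
using locale_handle = locale_stub *;

// Non-locale version: in the "C" locale, collation is assumed to be a plain
// byte-wise comparison.
int strcoll_sketch(const char *lhs, const char *rhs) {
  return std::strcmp(lhs, rhs);
}

// Locale variant: accepts a locale argument but ignores it and forwards to
// the non-locale version.
int strcoll_l_sketch(const char *lhs, const char *rhs,
                     locale_handle /*unused*/) {
  return strcoll_sketch(lhs, rhs);
}
```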
2024-08-29 14:20:15 -05:00
Joseph Huber
a87105121d
[libc] Implement locale variants for 'stdlib.h' functions (#105718)
Summary:
This provides the `_l` variants for the `stdlib.h` functions. These are
just copies of the same entrypoint and don't do anything with the locale
information.
2024-08-29 14:18:37 -05:00
Joseph Huber
439d7de14d
[libc] Disable failing scanf test on AMDGPU temporarily
Summary:
This test currently fails in the `amdgpu-attributor` pass. I haven't
figured out anything beyond that yet as it's difficult to reduce.
2024-08-28 07:04:15 -05:00
Joseph Huber
856dadb33c
[libc] Add ctype.h locale variants (#102711)
Summary:
This patch adds all the libc ctype variants. These ignore the locale
information completely, so they're pretty much just stubs. Because these
use locale information, which is system scope, we do not enable building
them outside of full build mode.
2024-08-22 13:51:54 -05:00
Joseph Huber
78d8ab2ab9
[libc] Initial support for 'locale.h' in the LLVM libc (#102689)
Summary:
This patch adds the macros and entrypoints associated with the
`locale.h` entrypoints.  These are mostly stubs, as we (for now and the
foreseeable future) only expect to support the C and maybe C.UTF-8
locales in the LLVM libc.
2024-08-22 12:58:46 -05:00
Joseph Huber
2f4232db0b
Revert "[libc] Add ctype.h locale variants (#102711)"
This reverts commit 8f005f8306dc52577b3b9482d271fb463f0152a5.
2024-08-22 12:45:16 -05:00
Joseph Huber
8f005f8306
[libc] Add ctype.h locale variants (#102711)
Summary:
This patch adds all the libc ctype variants. These ignore the locale
information completely, so they're pretty much just stubs. Because these
use locale information, which is system scope, we do not enable building
them outside of full build mode.
2024-08-22 12:41:20 -05:00
Joseph Huber
6b98a72365
[libc] Add scanf support to the GPU build (#104812)
Summary:
The `scanf` function has a "system file" configuration, which is pretty
much what the GPU implementation does at this point. So we should be
able to use it in much the same way.
2024-08-21 18:02:04 -05:00
Joseph Huber
bd9f2c2ba0
[libc] Add missing math definitions for round and scal for GPU (#104636)
Summary:
These can be enabled.
2024-08-16 16:27:03 -05:00
Joseph Huber
55aa4ea1c7
[libc] Add definition for atan2l on 64-bit long double platforms (#104489)
Summary:
This just adds `atan2l` for platforms that can implement it as an alias
to `atan2`.
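
A minimal sketch of the aliasing idea, assuming the check is made on the mantissa width: when `long double` has the same format as `double`, the `long double` entrypoint can simply forward. The name `atan2l_sketch` is hypothetical, and the fallback branch uses the standard library overload only so the example behaves sensibly on every host.

```cpp
#include <cfloat>
#include <cmath>

// Hypothetical sketch: forward to the double implementation when long double
// and double share the same mantissa width (i.e. 64-bit long double).
long double atan2l_sketch(long double y, long double x) {
  if (LDBL_MANT_DIG == DBL_MANT_DIG) {
    // Same format, so the conversions below are lossless.
    return static_cast<long double>(
        std::atan2(static_cast<double>(y), static_cast<double>(x)));
  }
  // Wider long double: defer to the long double overload instead.
  return std::atan2(y, x);
}
```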
2024-08-15 14:59:28 -05:00
Joseph Huber
dc2f39e96c
[libc] Enable all supported math functions on the GPU (#102563)
Summary:
Simply copies the x64 versions to the GPU directory. Ignoring f128 for
now, but adding long double entrypoints which are identical to `double`
on the target.
2024-08-12 13:12:44 -05:00
aaryanshukla
d0fe470fd2
[libc][math] Add scalbln{,f,l,f128} math functions (#102219)
Co-authored-by: OverMighty <its.overmighty@gmail.com>
2024-08-08 14:33:50 -07:00
Joseph Huber
1a92cc5a0a
[libc] Implement 'getenv' on the GPU target (#102376)
Summary:
This patch implements 'getenv'. I was torn on how to implement this,
since realistically we only have access to this environment pointer in
the "loader" interface. An alternative would be to use an RPC call every
time, but I think that's overkill for what this will be used for. A
better solution is just to emit a common `DataEnvironment` that contains
all of the host visible resources to initialize. Right now this is the
`env_ptr`, `clock_freq`, and `rpc_client`.
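
The lookup this enables might look roughly like the sketch below. Only the field names `env_ptr`, `clock_freq`, and `rpc_client` come from the description above; their types, the `getenv_sketch` name, and the `NAME=value` layout of the environment array are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Field names follow the commit message; the types are assumptions.
struct DataEnvironment {
  char **env_ptr;           // null-terminated array of "NAME=value" strings
  std::uint64_t clock_freq; // unused in this sketch
  void *rpc_client;         // unused in this sketch
};

// Hypothetical getenv: scans the environment captured at startup, so no
// round trip to the host is needed per call.
const char *getenv_sketch(const DataEnvironment &env, const char *name) {
  if (env.env_ptr == nullptr || name == nullptr || *name == '\0')
    return nullptr;
  const std::size_t name_len = std::strlen(name);
  for (char **entry = env.env_ptr; *entry != nullptr; ++entry) {
    // Match "NAME=" exactly so that "PAT" does not match "PATH=...".
    if (std::strncmp(*entry, name, name_len) == 0 &&
        (*entry)[name_len] == '=')
      return *entry + name_len + 1;
  }
  return nullptr;
}
```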

I did this by making the `app.h` interface that Linux uses more general,
could possibly move that into a separate patch, but I figured it's
easier to see with the usage.
2024-08-08 06:45:42 -05:00
Joseph Huber
3645ca58f4
[libc] Enable quick_exit routines on the GPU (#102242)
Summary:
We should be able to use these on the GPU just like exit.
2024-08-07 08:01:11 -05:00
Joseph Huber
88d288489e
[libc] Add lgamma and lgamma_r stubs for the GPU (#102019)
Summary:
These functions are used by the <random> implementation in libc++ and
cause a lot of tests to fail. For now we provide these through the
vendor abstraction until we have a real version. The NVPTX version
doesn't even update the output correctly so these are just temporary.
2024-08-05 14:53:05 -05:00
Joseph Huber
c4ec19b985
[libc] Add support for 'features.h' when targeting the GPU (#102037)
Summary:
`features.h` provides some information about the C library, provide this
on the GPU so external users can tell if it's the LLVM C library.
2024-08-05 14:52:44 -05:00
Joseph Huber
bde51232ba
[libc] Provide 'signal.h' header for the GPU (#101996)
Summary:
This header is practically useless, but we provide it mostly for the
macros so that applications can compile. I'm only doing this for the
`libc++` unittests that want it, and it is part of the C standard
technically. I just made an RPC call to do `raise`. Anything more isn't
going to work since it'd be way too annoying to make the CPU call into
some signal handler the GPU registered.
2024-08-05 14:52:14 -05:00
Joseph Huber
97f723bab0
[libc] Fix 'vasprintf' not working in non-fullbuild mode
2024-08-01 15:36:29 -05:00
Job Henandez Lara
ed12f80ff0
[libc][math][c23] add entrypoints and tests for getpayload{,f,f128} (#101285)
2024-07-31 23:16:42 -04:00
Joseph Huber
38ef6929a3
[libc] Add vsscanf function (#101402)
Summary:
Adds support for the `vsscanf` function similar to `sscanf`.
Based off of https://github.com/llvm/llvm-project/pull/97529.
2024-07-31 16:53:25 -05:00
Joseph Huber
bf42a7860a
[libc] Implement placeholder memory functions on the GPU (#101082)
Summary:
These functions are needed for `libc++` to link successfully. We can't
implement them well currently, so simply provide some stand-in
implementations. `realloc` will currently copy garbage and potentially
fault and `aligned_alloc` will work unless your alignment is more than
4K alignment. However, these should work in practice to get tests
running. I will write a real allocator soon™.
2024-07-30 10:15:30 -05:00
Joseph Huber
dbb8b7a0f4
Reapply "[OpenMP][libc] Remove special handling for OpenMP printf (#98940)"
This reverts commit fea5914c926e2f013a8b5e27eaa74c7047fb2c71.
2024-07-26 17:21:56 -05:00
Joseph Huber
fea5914c92
Revert "[OpenMP][libc] Remove special handling for OpenMP printf (#98940)"
This reverts commit 069e8bcd82c4420239f95c7e6a09e1f756317cfc.

Summary:
Some tests failing, revert this for now.
2024-07-26 16:39:12 -05:00
Joseph Huber
069e8bcd82
[OpenMP][libc] Remove special handling for OpenMP printf (#98940)
Summary:
Currently there are several layers to handle `printf`. Since we now have
varargs and an implementation of `printf` this can be heavily
simplified.

1. The frontend renames `printf` into `omp_vprintf` and gives it an
   argument buffer.

Removing 1. triggered some code in the AMDGPU backend meant for HIP /
OpenCL, so I added an exception to it.

2. Forward this to CUDA vprintf or ignore it.

We no longer need special handling for it since we have varargs. So now
we just forward this to CUDA vprintf if we have libc, otherwise just
leave `printf` as an external function and expect that `libc` will be
linked in.
2024-07-26 16:03:36 -05:00
OverMighty
81ce796095
[libc][math][c23] Enable C23 _Float16 math functions on GPUs (#99248)
2024-07-25 21:09:49 +02:00
Joseph Huber
2e3ee31d29
[libc] Enable 'sscanf' on the GPU #100211
Summary:
We can enable the sscanf function on the GPU now. This required adding
the configs to the scanf list so that the GPU build didn't do float
conversions.
2024-07-24 14:16:57 -05:00
Joseph Huber
9914609468
Revert "[libc] Enable 'sscanf' on the GPU (#100211)"
Summary:
This fails tests in some situations, revert until it can be fixed.
This reverts commit 445bb35f954ecd5c60ced71523f6b32fc306d557.
2024-07-24 07:46:39 -05:00
Joseph Huber
445bb35f95
[libc] Enable 'sscanf' on the GPU (#100211)
Summary:
We can enable the `sscanf` function on the GPU now.
2024-07-24 07:41:32 -05:00
Joseph Huber
e0649a5dfc
[NVPTX] Fix internal indirect call prototypes not obeying the ABI (#100131)
Summary:
The NVPTX backend optimizes the ABI for functions that are internal,
however, this is not legal for indirect call prototypes. Previously, we
would modify the ABI on an aggregate byval type passed to an indirect
call prototype, which would make PTXAS error. This patch just passes the
function as a nullptr to force strict ABI compliance without
modification in the helper function.

Fixes https://github.com/llvm/llvm-project/issues/100055
2024-07-23 12:54:00 -05:00
Joseph Huber
e7a2405383
[libc] Remove workarounds for lack of functional NVPTX linker (#96972)
Summary:
Currently we have several hacks to work around the fact that the NVPTX
linker, 'nvlink', does not support static libraries or LTO linking.
The patch in https://github.com/llvm/llvm-project/pull/96561 introduces
a wrapper in the toolchain that allows us to use a standard `ld.lld`
like interface. This means all the divergence with this target can be
removed.

Depends on https://github.com/llvm/llvm-project/pull/96561
2024-07-22 22:16:50 -05:00
aaryanshukla
a2f61ba08b
[libc][math]fadd implementation (#99694)
- **[libc] math fadd**
- **[libc][math] implemented fadd**
2024-07-19 14:40:34 -07:00
Joseph Huber
38f1dd2e45
[libc] Remove strerror_r on the GPU for now
Summary:
This function has conflicting definitions, which makes it difficult to
use in an offloading setting. Disable it for now.
2024-07-18 06:54:03 -05:00
lntue
7fc9fb9f3f
[libc][math] Implement double precision cbrt correctly rounded to all rounding modes. (#99262)
Division-less Newton iterations algorithm for cube roots.

1. **Range reduction**

For `x = (-1)^s * 2^e * (1.m)`, we get 2 reduced arguments `x_r` and `a`
as:
```
  x_r = 1.m
  a   = (-1)^s * 2^(e % 3) * (1.m)
```
Then `cbrt(x) = x^(1/3)` can be computed as:
```
  x^(1/3) = 2^(e / 3) * a^(1/3).
```

In order to avoid division, we compute `a^(-2/3)` using Newton's method and
then multiply the result by `a`:
```
  a^(1/3) = a * a^(-2/3).
```
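
The range reduction can be sketched in plain C++ as below. This is an illustration under stated assumptions, not the code from the patch: it uses `std::frexp` rather than direct bit manipulation, takes `e % 3` and `e / 3` to mean floored modulo and division so that `e % 3` is always 0, 1, or 2, and expects a finite, nonzero input. The names `Reduced` and `reduce_for_cbrt` are hypothetical.

```cpp
#include <cmath>

// Hypothetical container for the reduced arguments.
struct Reduced {
  double x_r;  // 1.m, in [1, 2)
  double a;    // (-1)^s * 2^(e % 3) * 1.m
  int e_div_3; // floor(e / 3)
};

// Assumes x is finite and nonzero.
Reduced reduce_for_cbrt(double x) {
  int exp2 = 0;
  // |x| = frac * 2^exp2 with frac in [0.5, 1).
  const double frac = std::frexp(std::fabs(x), &exp2);
  const double x_r = 2.0 * frac; // shift into [1, 2)
  const int e = exp2 - 1;        // so that |x| = x_r * 2^e
  // Floored division keeps the remainder in {0, 1, 2} for negative e too.
  const int q = (e >= 0) ? e / 3 : -((-e + 2) / 3);
  const int r = e - 3 * q;
  const double a = std::copysign(std::ldexp(x_r, r), x);
  return {x_r, a, q};
}
```

With these values, `std::ldexp(std::cbrt(a), e_div_3)` reproduces `std::cbrt(x)` up to rounding, which matches the identity `x^(1/3) = 2^(e / 3) * a^(1/3)`.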

2. **First approximation to a^(-2/3)**

First, we use a degree-7 minimax polynomial generated by Sollya to
approximate `x_r^(-2/3)` for `1 <= x_r < 2`.
```
  p = P(x_r) ~ x_r^(-2/3),
```
with relative errors bounded by:
```
  | p / x_r^(-2/3) - 1 | < 1.16 * 2^-21.
```

Then we multiply with `2^(e % 3)` from a small lookup table to get:
```
  x_0 = 2^(-2*(e % 3)/3) * p
      ~ 2^(-2*(e % 3)/3) * x_r^(-2/3)
      = a^(-2/3)
```
with relative errors:
```
  | x_0 / a^(-2/3) - 1 | < 1.16 * 2^-21.
```
This step is done in double precision.

3. **First Newton iteration**

We follow the method described in:
Sibidanov, A. and Zimmermann, P., "Correctly rounded cubic root evaluation
in double precision", https://core-math.gitlabpages.inria.fr/cbrt64.pdf
to derive multiplicative Newton iterations as below:
Let `x_n` be the nth approximation to `a^(-2/3)`. Define the n^th error as:
```
  h_n = x_n^3 * a^2 - 1
```
Then:
```
  a^(-2/3) = x_n / (1 + h_n)^(1/3)
           = x_n * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3 + ...)
```
using the Taylor series expansion of `(1 + h_n)^(-1/3)`.

Apply to `x_0` above:
```
  h_0 = x_0^3 * a^2 - 1
      = a^2 * (x_0 - a^(-2/3)) * (x_0^2 + x_0 * a^(-2/3) + a^(-4/3)),
```
it's bounded by:
```
  |h_0| < 4 * 3 * 1.16 * 2^-21 * 4 < 2^-17.
```
So in the first iteration step, we use:
```
  x_1 = x_0 * (1 - (1/3) * h_0 + (2/9) * h_0^2 - (14/81) * h_0^3)
```
Its relative error is bounded by:
```
  | x_1 / a^(-2/3) - 1 | < 35/243 * |h_0|^4 < 2^-70.
```
Then we perform Ziv's rounding test and check if the answer is exact.
This step is done in double-double precision.

4. **Second Newton iteration**

If Ziv's rounding test from the previous step fails, we define the error
term:
```
  h_1 = x_1^3 * a^2 - 1,
```
And perform another iteration:
```
  x_2 = x_1 * (1 - h_1 / 3)
```
with relative errors beyond the precision of double-double.
We then check the Ziv's accuracy test with relative errors < 2^-102 to
compensate for rounding errors.

5. **Final iteration**
 
If Ziv's accuracy test from the previous step fails, we perform another
iteration in 128-bit precision and check for exact outputs.
2024-07-17 12:23:14 -04:00