mirror of
https://github.com/llvm/llvm-project.git
synced 2025-04-27 06:26:08 +00:00
[libc][Docs] Begin improving documentation for the GPU libc

This patch updates some of the documentation for the GPU libc project. There is a lot of work still to be done, but this sets the general outline.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D149194
This commit is contained in:
parent
b56b15ed71
commit
807f058487
18
libc/docs/gpu/index.rst
Normal file
@ -0,0 +1,18 @@
|
||||
.. _libc_gpu:
|
||||
|
||||
=============
|
||||
libc for GPUs
|
||||
=============
|
||||
|
||||
.. note:: This feature is very experimental and may change in the future.
|
||||
|
||||
The *GPU* support for LLVM's libc project aims to make a subset of the standard
C library available on GPU-based accelerators. Navigate using the links below to
learn more about this project.
|
||||
|
||||
.. toctree::
|
||||
|
||||
using
|
||||
support
|
||||
testing
|
||||
rpc
|
17
libc/docs/gpu/rpc.rst
Normal file
@ -0,0 +1,17 @@
|
||||
.. _libc_gpu_rpc:
|
||||
|
||||
======================
|
||||
Remote Procedure Calls
|
||||
======================
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:depth: 4
|
||||
:local:
|
||||
|
||||
Remote Procedure Call Implementation
|
||||
====================================
|
||||
|
||||
Certain features from the standard C library, such as allocation or printing,
require support from the operating system. Because the GPU does not provide
these services itself, we instead implement a remote procedure call (RPC)
interface that allows the GPU to submit work to a host server, which forwards it
to the host system.
|
88
libc/docs/gpu/support.rst
Normal file
@ -0,0 +1,88 @@
|
||||
.. _libc_gpu_support:
|
||||
|
||||
===================
|
||||
Supported Functions
|
||||
===================
|
||||
|
||||
.. include:: ../check.rst
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:depth: 4
|
||||
:local:
|
||||
|
||||
The following functions and headers are supported at least partially on the
device. Some functions are implemented fully on the GPU, while others require a
:ref:`remote procedure call <libc_gpu_rpc>`.
|
||||
|
||||
ctype.h
|
||||
-------
|
||||
|
||||
============= ========= ============
|
||||
Function Name Available RPC Required
|
||||
============= ========= ============
|
||||
isalnum |check|
|
||||
isalpha |check|
|
||||
isascii |check|
|
||||
isblank |check|
|
||||
iscntrl |check|
|
||||
isdigit |check|
|
||||
isgraph |check|
|
||||
islower |check|
|
||||
isprint |check|
|
||||
ispunct |check|
|
||||
isspace |check|
|
||||
isupper |check|
|
||||
isxdigit |check|
|
||||
toascii |check|
|
||||
tolower |check|
|
||||
toupper |check|
|
||||
============= ========= ============
|
||||
|
||||
string.h
|
||||
--------
|
||||
|
||||
============= ========= ============
|
||||
Function Name Available RPC Required
|
||||
============= ========= ============
|
||||
bcmp |check|
|
||||
bzero |check|
|
||||
memccpy |check|
|
||||
memchr |check|
|
||||
memcmp |check|
|
||||
memcpy |check|
|
||||
memmove |check|
|
||||
mempcpy |check|
|
||||
memrchr |check|
|
||||
memset |check|
|
||||
stpcpy |check|
|
||||
stpncpy |check|
|
||||
strcat |check|
|
||||
strchr |check|
|
||||
strcmp |check|
|
||||
strcpy |check|
|
||||
strcspn |check|
|
||||
strlcat |check|
|
||||
strlcpy |check|
|
||||
strlen |check|
|
||||
strncat |check|
|
||||
strncmp |check|
|
||||
strncpy |check|
|
||||
strnlen |check|
|
||||
strpbrk |check|
|
||||
strrchr |check|
|
||||
strspn |check|
|
||||
strstr |check|
|
||||
strtok |check|
|
||||
strtok_r |check|
|
||||
strdup
|
||||
strndup
|
||||
============= ========= ============
|
||||
|
||||
stdlib.h
|
||||
--------
|
||||
|
||||
============= ========= ============
|
||||
Function Name Available RPC Required
|
||||
============= ========= ============
|
||||
atoi |check|
|
||||
============= ========= ============
|
32
libc/docs/gpu/testing.rst
Normal file
@ -0,0 +1,32 @@
|
||||
.. _libc_gpu_testing:
|
||||
|
||||
|
||||
============================
|
||||
Testing the GPU libc library
|
||||
============================
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:depth: 4
|
||||
:local:
|
||||
|
||||
Testing Infrastructure
|
||||
======================
|
||||
|
||||
The testing support in LLVM's libc implementation for GPUs is designed to mimic
the standard unit tests as much as possible. We use the :ref:`remote procedure
call <libc_gpu_rpc>` support to provide the necessary utilities, like printing
from the GPU. Execution is performed by emitting a ``_start`` kernel from the GPU
that is then called by an external loader utility. This is an example of how
this can be done manually:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto
|
||||
$> ./amdhsa_loader --threads 1 --blocks 1 a.out
|
||||
Test Passed!
|
||||
|
||||
Unlike the exported ``libcgpu.a``, the testing infrastructure can only support a
single architecture at a time. This is either detected automatically, or set
manually by the user using ``LIBC_GPU_TEST_ARCHITECTURE``. The latter is useful
in cases where the user does not build LLVM's libc on a machine with the GPU to
use for testing.
|
87
libc/docs/gpu/using.rst
Normal file
@ -0,0 +1,87 @@
|
||||
.. _libc_gpu_usage:
|
||||
|
||||
|
||||
===================
|
||||
Using libc for GPUs
|
||||
===================
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:depth: 4
|
||||
:local:
|
||||
|
||||
Building the GPU library
|
||||
========================
|
||||
|
||||
LLVM's libc GPU support *must* be built with an up-to-date ``clang`` compiler
due to heavy reliance on ``clang``'s GPU support. This can be done automatically
using the ``LLVM_ENABLE_RUNTIMES=libc`` option. To enable libc for the GPU,
enable the ``LIBC_GPU_BUILD`` option. By default, ``libcgpu.a`` will be built
using every supported GPU architecture. To restrict the set of architectures
built, either set ``LLVM_LIBC_GPU_ARCHITECTURES`` to the list of desired
architectures manually or use ``native`` to detect the GPUs on your system. A
typical ``cmake`` configuration will look like this:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
  $> cd llvm-project  # The llvm-project checkout
  $> mkdir build
  $> cd build
  # Select the build type, build in GPU mode for all supported architectures,
  # and choose where 'libcgpu.a' will live.
  $> cmake ../llvm -G Ninja \
     -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
     -DLLVM_ENABLE_RUNTIMES="libc;openmp" \
     -DCMAKE_BUILD_TYPE=<Debug|Release> \
     -DLIBC_GPU_BUILD=ON \
     -DLLVM_LIBC_GPU_ARCHITECTURES=all \
     -DCMAKE_INSTALL_PREFIX=<PATH>
  $> ninja install
|
||||
|
||||
Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our
|
||||
toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built
|
||||
using a compatible compiler and to support ``openmp`` offloading, we list them
|
||||
in ``LLVM_ENABLE_RUNTIMES`` to build them after the enabled projects using the
|
||||
newly built compiler. ``CMAKE_INSTALL_PREFIX`` specifies the installation
|
||||
directory in which to install the ``libcgpu.a`` library and headers along with
|
||||
LLVM. The generated headers will be placed in ``include/gpu-none-llvm``.
|
||||
|
||||
Usage
|
||||
=====
|
||||
|
||||
Once the ``libcgpu.a`` static archive has been built it can be linked directly
with offloading applications as a standard library. This process is described in
the `clang documentation <https://clang.llvm.org/docs/OffloadingDesign.html>`_.
This linking mode is used by the OpenMP toolchain, but is currently opt-in for
the CUDA and HIP toolchains through the ``--offload-new-driver`` and
``-fgpu-rdc`` flags. A typical usage will look like this:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu
|
||||
|
||||
The ``libcgpu.a`` static archive is a fat binary containing LLVM-IR for each
supported target device. The supported architectures can be seen using LLVM's
``llvm-objdump`` with the ``--offloading`` flag:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> llvm-objdump --offloading libcgpu.a
|
||||
libcgpu.a(strcmp.cpp.o): file format elf64-x86-64
|
||||
|
||||
OFFLOADING IMAGE [0]:
|
||||
kind llvm ir
|
||||
arch gfx90a
|
||||
triple amdgcn-amd-amdhsa
|
||||
producer none
|
||||
|
||||
Because the device code is stored inside a fat binary, it can be difficult to
|
||||
inspect the resulting code. This can be done using the following utilities:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> llvm-ar x libcgpu.a strcmp.cpp.o
|
||||
$> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc
|
||||
$> opt -S gfx90a.bc
|
||||
...
|
||||
|
||||
Please note that this fat binary format is provided for compatibility with
|
||||
existing offloading toolchains. The implementation in ``libc`` does not depend
|
||||
on any existing offloading languages and is completely freestanding.
|
@ -1,169 +0,0 @@
|
||||
.. _GPU_mode:
|
||||
|
||||
==============
|
||||
GPU Mode
|
||||
==============
|
||||
|
||||
.. include:: check.rst
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:depth: 4
|
||||
:local:
|
||||
|
||||
.. note:: This feature is very experimental and may change in the future.
|
||||
|
||||
The *GPU* mode of LLVM's libc is an experimental mode used to support calling
|
||||
libc routines during GPU execution. The goal of this project is to provide
|
||||
access to the standard C library on systems running accelerators. To begin using
|
||||
this library, build and install the ``libcgpu.a`` static archive following the
|
||||
instructions in :ref:`building_gpu_mode` and link with your offloading
|
||||
application.
|
||||
|
||||
.. _building_gpu_mode:
|
||||
|
||||
Building the GPU library
|
||||
========================
|
||||
|
||||
LLVM's libc GPU support *must* be built using the same compiler as the final
application to ensure relative LLVM bitcode compatibility. This can be done
automatically using the ``LLVM_ENABLE_RUNTIMES=libc`` option. Furthermore,
building for the GPU is only supported in :ref:`fullbuild_mode`. To enable the
GPU build, set the target OS to ``gpu`` via ``LLVM_LIBC_TARGET_OS=gpu``. By
default, ``libcgpu.a`` will be built using every supported GPU architecture. To
restrict the set of architectures built, set ``LLVM_LIBC_GPU_ARCHITECTURES``
to the list of desired architectures or use ``all``. A typical ``cmake``
configuration will look like this:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
  $> cd llvm-project  # The llvm-project checkout
  $> mkdir build
  $> cd build
  # Select the build type, request the full libc, build in GPU mode for all
  # supported architectures, and choose where 'libcgpu.a' will live.
  $> cmake ../llvm -G Ninja \
     -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
     -DLLVM_ENABLE_RUNTIMES="libc;openmp" \
     -DCMAKE_BUILD_TYPE=<Debug|Release> \
     -DLLVM_LIBC_FULL_BUILD=ON \
     -DLIBC_GPU_BUILD=ON \
     -DLLVM_LIBC_GPU_ARCHITECTURES=all \
     -DCMAKE_INSTALL_PREFIX=<PATH>
  $> ninja install
|
||||
|
||||
Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our
|
||||
toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built
|
||||
using a compatible compiler and to support ``openmp`` offloading, we list them
|
||||
in ``LLVM_ENABLE_RUNTIMES`` to build them after the enabled projects using the
|
||||
newly built compiler. ``CMAKE_INSTALL_PREFIX`` specifies the installation
|
||||
directory in which to install the ``libcgpu.a`` library along with LLVM.
|
||||
|
||||
Usage
|
||||
=====
|
||||
|
||||
Once the ``libcgpu.a`` static archive has been built in
:ref:`building_gpu_mode`, it can be linked directly with offloading applications
as a standard library. This process is described in the `clang documentation
<https://clang.llvm.org/docs/OffloadingDesign.html>`_. This linking mode is used
by the OpenMP toolchain, but is currently opt-in for the CUDA and HIP toolchains
using the ``--offload-new-driver`` and ``-fgpu-rdc`` flags. A typical usage
will look like this:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu
|
||||
|
||||
The ``libcgpu.a`` static archive is a fat-binary containing LLVM-IR for each
|
||||
supported target device. The supported architectures can be seen using LLVM's
|
||||
objdump with the ``--offloading`` flag:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> llvm-objdump --offloading libcgpu.a
|
||||
libcgpu.a(strcmp.cpp.o): file format elf64-x86-64
|
||||
|
||||
OFFLOADING IMAGE [0]:
|
||||
kind llvm ir
|
||||
arch gfx90a
|
||||
triple amdgcn-amd-amdhsa
|
||||
producer <none>
|
||||
|
||||
Because the device code is stored inside a fat binary, it can be difficult to
|
||||
inspect the resulting code. This can be done using the following utilities:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
$> llvm-ar x libcgpu.a strcmp.cpp.o
|
||||
$> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc
|
||||
$> opt -S gfx90a.bc
|
||||
...
|
||||
|
||||
Supported Functions
|
||||
===================
|
||||
|
||||
The following functions and headers are supported at least partially on the
device. Currently, only basic device functions that do not require an operating
system are supported on the device. Supporting functions like ``malloc`` using an
RPC mechanism is a work-in-progress.
|
||||
|
||||
ctype.h
|
||||
-------
|
||||
|
||||
============= =========
|
||||
Function Name Available
|
||||
============= =========
|
||||
isalnum |check|
|
||||
isalpha |check|
|
||||
isascii |check|
|
||||
isblank |check|
|
||||
iscntrl |check|
|
||||
isdigit |check|
|
||||
isgraph |check|
|
||||
islower |check|
|
||||
isprint |check|
|
||||
ispunct |check|
|
||||
isspace |check|
|
||||
isupper |check|
|
||||
isxdigit |check|
|
||||
toascii |check|
|
||||
tolower |check|
|
||||
toupper |check|
|
||||
============= =========
|
||||
|
||||
string.h
|
||||
--------
|
||||
|
||||
============= =========
|
||||
Function Name Available
|
||||
============= =========
|
||||
bcmp |check|
|
||||
bzero |check|
|
||||
memccpy |check|
|
||||
memchr |check|
|
||||
memcmp |check|
|
||||
memcpy |check|
|
||||
memmove |check|
|
||||
mempcpy |check|
|
||||
memrchr |check|
|
||||
memset |check|
|
||||
stpcpy |check|
|
||||
stpncpy |check|
|
||||
strcat |check|
|
||||
strchr |check|
|
||||
strcmp |check|
|
||||
strcpy |check|
|
||||
strcspn |check|
|
||||
strlcat |check|
|
||||
strlcpy |check|
|
||||
strlen |check|
|
||||
strncat |check|
|
||||
strncmp |check|
|
||||
strncpy |check|
|
||||
strnlen |check|
|
||||
strpbrk |check|
|
||||
strrchr |check|
|
||||
strspn |check|
|
||||
strstr |check|
|
||||
strtok |check|
|
||||
strtok_r |check|
|
||||
strdup
|
||||
strndup
|
||||
============= =========
|
@ -52,7 +52,7 @@ stages there is no ABI stability in any form.
|
||||
usage_modes
|
||||
overlay_mode
|
||||
fullbuild_mode
|
||||
gpu_mode
|
||||
gpu/index.rst
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|