====================
Clang Linker Wrapper
====================

.. contents::
   :local:

.. _clang-linker-wrapper:

Introduction
============

This tool wraps the normal host linking job. It creates linked device images
for offloading, along with the runtime calls necessary to register them. It
first scans the linker's input for embedded device offloading data stored in the
``.llvm.offloading`` section, which contains binary data created by the
:doc:`ClangOffloadPackager`. The extracted device files are then linked, and the
linked modules are wrapped into a new object file containing the code necessary
to register them with the offloading runtime.

Usage
=====

The tool accepts the following options. Any argument not intended solely for the
linker wrapper is forwarded to the wrapped linker job.

.. code-block:: console

  USAGE: clang-linker-wrapper [options] -- <options to pass to the linker>

  OPTIONS:
    --cuda-path=<dir>       Set the system CUDA path
    --device-debug          Use debugging
    --device-linker=<value> or <triple>=<value>
                            Arguments to pass to the device linker invocation
    --dry-run               Print program arguments without running
    --help-hidden           Display all available options
    --help                  Display available options (--help-hidden for more)
    --host-triple=<triple>  Triple to use for the host compilation
    --linker-path=<path>    The linker executable to invoke
    -L <dir>                Add <dir> to the library search path
    -l <libname>            Search for library <libname>
    --opt-level=<O0, O1, O2, or O3>
                            Optimization level for LTO
    --override-image=<kind=file>
                            Uses the provided file as if it were the output of the device link step
    -o <path>               Path to file to write output
    --pass-remarks-analysis=<value>
                            Pass remarks for LTO
    --pass-remarks-missed=<value>
                            Pass remarks for LTO
    --pass-remarks=<value>  Pass remarks for LTO
    --print-wrapped-module  Print the wrapped module's IR for testing
    --ptxas-arg=<value>     Argument to pass to the 'ptxas' invocation
    --relocatable           Link device code to create a relocatable offloading application
    --save-temps            Save intermediate results
    --sysroot<value>        Set the system root
    --verbose               Verbose output from tools
    --v                     Display the version number and exit
    --                      The separator for the wrapped linker arguments

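
The ``--`` separator convention from the usage line can be illustrated with a
small Python sketch. This is only a model of the convention, not the tool's own
argument parser; the helper name ``split_wrapper_args`` is hypothetical, and the
treatment of a missing separator is an assumption made for the example.

.. code-block:: python

  from typing import List, Tuple


  def split_wrapper_args(argv: List[str]) -> Tuple[List[str], List[str]]:
      """Split argv at the first ``--`` into (wrapper options, linker arguments)."""
      if "--" not in argv:
          # Assumption for this sketch: without a separator, nothing is
          # treated as a wrapped linker argument.
          return list(argv), []
      index = argv.index("--")
      # Everything before the separator belongs to the wrapper; everything
      # after it is handed to the wrapped linker job.
      return argv[:index], argv[index + 1:]


  wrapper_opts, linker_args = split_wrapper_args(
      ["--host-triple=x86_64", "--linker-path=/usr/bin/ld", "--", "-o", "a.out", "main.o"]
  )
  print(wrapper_opts)
  print(linker_args)

Only the first ``--`` is treated as the separator here, so a later ``--`` would
be passed through to the linker unchanged.
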
Relocatable Linking
===================

The ``clang-linker-wrapper`` handles linking embedded device code and then
registering it with the appropriate runtime. Normally, this is only done when
the executable is created so that other files containing device code can be
linked together. This can be problematic for users who wish to ship static
libraries that contain offloading code to users without a compatible offloading
toolchain.

When performing a relocatable link with ``-r``, the ``clang-linker-wrapper``
performs the device linking and registration eagerly. This removes the embedded
device code and registers it correctly with the runtime. Semantically, this is
similar to creating a shared library object. If standard relocatable linking is
desired, simply do not run the binaries through the ``clang-linker-wrapper``. A
standard relocatable link simply appends the embedded device code so that it can
be linked later.

Matching
========

The linker wrapper links together extracted device code that is mutually
compatible. Generally, this requires that the target triple and architecture
match. An exception is made when the architecture is listed as ``generic``,
which causes that code to be linked with any other device code that has the same
target triple.

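
The matching rule above can be written as a short predicate. The following
Python sketch is illustrative only and assumes each device image is described by
just a target triple and an architecture string; the ``DeviceImage`` type and
``is_compatible`` function are hypothetical names, not part of the tool.

.. code-block:: python

  from typing import NamedTuple


  class DeviceImage(NamedTuple):
      """Hypothetical description of one extracted piece of device code."""
      triple: str
      arch: str


  def is_compatible(a: DeviceImage, b: DeviceImage) -> bool:
      """Return True if the two images may be linked together."""
      # The target triple must always match.
      if a.triple != b.triple:
          return False
      # The architecture must match, unless either side is ``generic``.
      return a.arch == b.arch or "generic" in (a.arch, b.arch)


  gfx90a = DeviceImage("amdgcn-amd-amdhsa", "gfx90a")
  gfx908 = DeviceImage("amdgcn-amd-amdhsa", "gfx908")
  generic = DeviceImage("amdgcn-amd-amdhsa", "generic")

  print(is_compatible(gfx90a, gfx90a))
  print(is_compatible(gfx90a, gfx908))
  print(is_compatible(gfx90a, generic))

In this model, a ``generic`` image never crosses target triples: it is only
compatible with device code that shares its triple.
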
Debugging
=========

The linker wrapper performs many steps internally, such as input matching,
symbol resolution, and image registration, which can make it difficult to debug
in some scenarios. The behavior of the linker wrapper is controlled mostly
through metadata, described in the `clang documentation
<https://clang.llvm.org/docs/OffloadingDesign.html>`_. Intermediate output can
be obtained from the linker wrapper using the ``--save-temps`` flag. These files
can then be modified.

.. code-block:: sh

  $> clang openmp.c -fopenmp --offload-arch=gfx90a -c
  $> clang openmp.o -fopenmp --offload-arch=gfx90a -Wl,--save-temps
  $> # Modify temp files.
  $> llvm-objcopy --update-section=.llvm.offloading=out.bc openmp.o

Doing this allows you to override one of the input files by replacing its
embedded offloading metadata with a user-modified version. However, this becomes
more difficult when there are multiple input files. For a very large hammer, the
``--override-image=<kind>=<file>`` flag can be used.

In the following example, we use ``--save-temps`` to obtain the LLVM-IR just
before running the backend. We then modify it to test altered behavior and
compile it to a binary. This binary can then be passed to the linker wrapper,
which ignores all embedded metadata and uses the provided image as if it were
the result of the device linking phase.

.. code-block:: sh

  $> clang openmp.c -fopenmp --offload-arch=gfx90a -Wl,--save-temps
  $> # Modify temp files.
  $> clang --target=amdgcn-amd-amdhsa -mcpu=gfx90a -nogpulib out.bc -o a.out
  $> clang openmp.c -fopenmp --offload-arch=gfx90a -Wl,--override-image=openmp=a.out

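
The ``<kind>=<file>`` value given to ``--override-image`` pairs an offloading
kind with an image path. As a rough illustration of that shape, the Python
sketch below splits such a value at its first ``=``. The helper
``parse_override_image`` is hypothetical, and the error handling is an
assumption of this sketch rather than documented tool behavior.

.. code-block:: python

  from typing import Tuple


  def parse_override_image(value: str) -> Tuple[str, str]:
      """Split a ``<kind>=<file>`` string into its (kind, file) parts."""
      # Split at the first '=' only, so the file path may itself contain '='.
      kind, separator, path = value.partition("=")
      if not separator or not kind or not path:
          raise ValueError(f"expected <kind>=<file>, got {value!r}")
      return kind, path


  kind, path = parse_override_image("openmp=a.out")
  print(kind, path)
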
Example
=======

This tool links object files that have offloading images embedded within them
by the ``-fembed-offload-object`` flag in Clang. Given an input file containing
the magic ``.llvm.offloading`` section, we can pass it to this tool to extract
the data contained in that section and run a device linking job on it.

.. code-block:: console

  clang-linker-wrapper --host-triple=x86_64 --linker-path=/usr/bin/ld -- <Args>