This PR is one of the many PRs in the SYCL upstreaming effort focusing
on device code linking during the SYCL offload compilation process. RFC:
https://discourse.llvm.org/t/rfc-offloading-design-for-sycl-offload-kind-and-spir-targets/74088
Approved PRs so far:
1. [Clang][SYCL] Introduce clang-sycl-linker to link SYCL offloading
device code (Part 1 of many) -
[Link](https://github.com/llvm/llvm-project/pull/112245)
2. [clang-sycl-linker] Replace llvm-link with API calls -
[Link](https://github.com/llvm/llvm-project/pull/133797)
3. [SYCL][SPIR-V Backend][clang-sycl-linker] Add SPIR-V backend support
inside clang-sycl-linker -
[Link](https://github.com/llvm/llvm-project/pull/133967)
This PR adds SYCL device code linking support to clang-linker-wrapper.
### Summary for this PR
Device code linking happens inside clang-linker-wrapper. In the current
implementation, clang-linker-wrapper does the following:
1. Extracts device code. Input_1, Input_2,.....
5. Group device code according to target devices Inputs[triple_1] = ....
Inputs[triple_2] = ....
6. For each group, i.e. Inputs[triple_i], a. Gather all the offload
kinds found inside those inputs in ActiveOffloadKinds b. Link all images
inside Inputs[triple_i] by calling clang --target=triple_i .... c.
Create a copy of that linked image for each offload kind and add it to
Output[Kind] list.
In SYCL compilation flow, there is a deviation in Step 3b. We call
device code splitting inside the 'clang --target=triple_i ....' call and
the output is now a 'packaged' file containing multiple device images.
This deviation requires us to capture the OffloadKind during the linking
stage and pass it along to the linking function (clang), so that clang
can be called with a unique option '--sycl-link' that will help us to
call 'clang-sycl-linker' under the hood (clang-sycl-linker will do SYCL
specific linking).
Our current objective is to implement an end-to-end SYCL offloading flow
and get it working. We will eventually merge our approach with the
community flow.
Thanks
---------
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
We use the `clang-offload-packager` too bundle many files into a single
binary format containing metadata. This is used for offloading
compilation which may contain multiple device binaries of different
types and architectures in a single file. We use this special binary
format to store these files along with some necessary metadata around
them. We use this format because of the difficulty of determining the
filesize of the various binary inputs that will be passed to the
offloading toolchain rather than engineering a solution for each input.
Previously we only support packaing many files into a single binary.
This patch adds support for doing the reverse by using the same
`--image=` syntax. To unpackage a binary we now present an input file
instead of an output.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D129507
Summary:
This patch adds more in-depth documentation to the
clang-offload-packacker's binary format. This format is used to create
fat binaries and link them.
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125165