[libc] Add a loader utility for AMDHSA architectures for testing

This is the first attempt to get some testing support for GPUs in LLVM's libc. We want to be able to compile for and call generic code while on the device. This is difficult as most GPU applications also require the support of large runtimes that may contain their own bugs (e.g. CUDA / HIP / OpenMP / OpenCL / SYCL). The proposed solution is to provide a "loader" utility that allows us to execute a "main" function on the GPU.

This patch implements a simple loader utility targeting the AMDHSA runtime called `amdhsa_loader` that takes a GPU program as its first argument. It will then attempt to load a predetermined `_start` kernel inside that image and launch execution. The `_start` symbol is provided by a `start` utility function that will be linked alongside the application. Thus, this should allow us to run arbitrary code on the user's GPU with the following steps for testing.

```
clang++ Start.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -ffreestanding -nogpulib -nostdinc -nostdlib -c
clang++ Main.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -nogpulib -nostdinc -nostdlib -c
clang++ Start.o Main.o --target=amdgcn-amd-amdhsa -o image
amdhsa_loader image <args, ...>
```

We determine the `-mcpu` value using the `amdgpu-arch` utility provided either by `clang` or `rocm`. If `amdgpu-arch` isn't found or returns an error, we shouldn't run the tests, as the machine does not have a valid HSA-compatible GPU. Alternatively, we could make this utility in-source to avoid the external dependency.

This patch provides a single test for this utility that simply checks whether we can compile an application containing a simple `main` function and execute it. The proposed solution in the future is to create an alternate implementation of the LibcTest.cpp source that can be compiled and launched using this utility. This approach should allow us to use the same test sources as the other applications.

This is primarily a prototype; suggestions for how to better integrate this with the existing LibC infrastructure would be greatly appreciated. The loader code should also be cleaned up somewhat. An implementation for NVPTX will need to be written as well.

Reviewed By: sivachandra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D139839
2023-02-06 11:01:20 -06:00
//===-- Generic device loader interface -----------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_LIBC_UTILS_GPU_LOADER_LOADER_H
#define LLVM_LIBC_UTILS_GPU_LOADER_LOADER_H
#include "utils/gpu/server/Server.h"
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
/// Generic launch parameters for configuring the number of blocks / threads.
struct LaunchParameters {
  uint32_t num_threads_x;
  uint32_t num_threads_y;
  uint32_t num_threads_z;
  uint32_t num_blocks_x;
  uint32_t num_blocks_y;
  uint32_t num_blocks_z;
};
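// A minimal sketch of how the six fields combine into a total thread count,
// assuming num_threads_* is the per-block thread count and num_blocks_* is the
// number of blocks in each dimension. The `sketch` namespace and the
// `total_threads` helper are hypothetical and not part of the loader; the
// struct is repeated only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Mirrors the LaunchParameters struct declared in this header.
struct LaunchParameters {
  uint32_t num_threads_x;
  uint32_t num_threads_y;
  uint32_t num_threads_z;
  uint32_t num_blocks_x;
  uint32_t num_blocks_y;
  uint32_t num_blocks_z;
};

// Hypothetical helper: total number of threads in a launch. The products are
// widened to 64 bits before multiplying the 32-bit fields together.
inline uint64_t total_threads(const LaunchParameters &params) {
  uint64_t threads_per_block = uint64_t(params.num_threads_x) *
                               params.num_threads_y * params.num_threads_z;
  uint64_t blocks =
      uint64_t(params.num_blocks_x) * params.num_blocks_y * params.num_blocks_z;
  return threads_per_block * blocks;
}

} // namespace sketch
```

// A single-threaded launch, which is what running a plain `main` needs, would
// set all six fields to 1.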
/// The arguments to the '_begin' kernel.
struct begin_args_t {
  int argc;
  void *argv;
  void *envp;
  void *rpc_shared_buffer;
};
/// The arguments to the '_start' kernel.
struct start_args_t {
  int argc;
  void *argv;
  void *envp;
  void *ret;
};
/// The arguments to the '_end' kernel.
struct end_args_t {
  int argc;
};
/// Generic interface to load the \p image and launch execution of the _start
/// kernel on the target device. Copies \p argc, \p argv, and \p envp to the
/// device. Returns the final value of the `main` function on the device.
int load(int argc, char **argv, char **envp, void *image, size_t size,
         const LaunchParameters &params);
/// Return \p val rounded "upwards" to the next multiple of \p align.
template <typename V, typename A> inline V align_up(V val, A align) {
  return ((val + V(align) - 1) / V(align)) * V(align);
}
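// A short usage sketch for align_up. The template is repeated inside a
// hypothetical `sketch` namespace so the example compiles on its own, and
// `padded_size` is an illustrative helper that is not part of the loader.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Same rounding formula as align_up in this header: add (align - 1), then
// truncate down to a multiple of align using integer division.
template <typename V, typename A> inline V align_up(V val, A align) {
  return ((val + V(align) - 1) / V(align)) * V(align);
}

// Hypothetical helper: size of a buffer once padded to a 16-byte boundary.
inline size_t padded_size(size_t bytes) { return align_up(bytes, 16); }

} // namespace sketch
```

// Values that are already a multiple of the alignment are returned unchanged;
// anything else moves to the next multiple.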
/// Copy the system's argument vector to GPU memory allocated using \p alloc.
template <typename Allocator>
void *copy_argument_vector(int argc, char **argv, Allocator alloc) {
  size_t argv_size = sizeof(char *) * (argc + 1);
  size_t str_size = 0;
  for (int i = 0; i < argc; ++i)
    str_size += std::strlen(argv[i]) + 1;
  // We allocate enough space for a null terminated array and all the strings.
  void *dev_argv = alloc(argv_size + str_size);
  if (!dev_argv)
    return nullptr;
  // Store the strings linearly in the same memory buffer.
  void *dev_str = reinterpret_cast<uint8_t *>(dev_argv) + argv_size;
  for (int i = 0; i < argc; ++i) {
    size_t size = std::strlen(argv[i]) + 1;
    std::memcpy(dev_str, argv[i], size);
    static_cast<void **>(dev_argv)[i] = dev_str;
    dev_str = reinterpret_cast<uint8_t *>(dev_str) + size;
  }
  // Ensure the vector is null terminated. The terminator lives at index
  // `argc`; `argv_size` is a byte count and must not be used as an index.
  static_cast<void **>(dev_argv)[argc] = nullptr;
  return dev_argv;
}
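// A host-only sketch of the buffer layout produced by copy_argument_vector:
// `argc + 1` pointers followed by the string bytes, all in one allocation.
// It assumes the allocation is directly writable and readable from the host,
// which is why std::malloc can stand in for a device allocator here. The
// `sketch` namespace, `host_alloc`, and `copy_to_host_buffer` are hypothetical
// names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

namespace sketch {

// Packs the pointer array and the strings into a single allocation.
template <typename Allocator>
void *copy_argument_vector(int argc, char **argv, Allocator alloc) {
  size_t argv_size = sizeof(char *) * (argc + 1);
  size_t str_size = 0;
  for (int i = 0; i < argc; ++i)
    str_size += std::strlen(argv[i]) + 1;

  void *dev_argv = alloc(argv_size + str_size);
  if (!dev_argv)
    return nullptr;

  // Strings start right after the pointer array.
  void *dev_str = static_cast<uint8_t *>(dev_argv) + argv_size;
  for (int i = 0; i < argc; ++i) {
    size_t size = std::strlen(argv[i]) + 1;
    std::memcpy(dev_str, argv[i], size);
    static_cast<void **>(dev_argv)[i] = dev_str;
    dev_str = static_cast<uint8_t *>(dev_str) + size;
  }
  // The null terminator is written at index argc.
  static_cast<void **>(dev_argv)[argc] = nullptr;
  return dev_argv;
}

// Hypothetical stand-in for a device allocator (assumption: host memory).
inline void *host_alloc(size_t size) { return std::malloc(size); }

// Copies argv into one malloc'd buffer; the caller releases it with std::free.
inline char **copy_to_host_buffer(int argc, char **argv) {
  return static_cast<char **>(copy_argument_vector(argc, argv, host_alloc));
}

} // namespace sketch
```

// Because everything lives in one allocation, a single free releases both the
// pointer array and the strings.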
/// Copy the system's environment to GPU memory allocated using \p alloc.
/// \p envp must be a null terminated array of strings.
template <typename Allocator>
void *copy_environment(char **envp, Allocator alloc) {
  int envc = 0;
  for (char **env = envp; *env != nullptr; ++env)
    ++envc;
  return copy_argument_vector(envc, envp, alloc);
}
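/// Print \p msg to stderr and terminate the loader with a failure status.
inline void handle_error(const char *msg) {
  fprintf(stderr, "%s\n", msg);
  exit(EXIT_FAILURE);
}
/// Report a failure in the RPC server and terminate the loader.
inline void handle_error(rpc_status_t) {
  handle_error("Failure in the RPC server");
}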
#endif