llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 04:46:07 +00:00

Author	SHA1	Message	Date
Joseph Huber	21b1d55c04	[Libomptarget] Add correct relative path for the nexgen plugin Summary: I forgot that this file "borrowed" the source from the other file tree. Fix that.	2023-01-25 14:05:53 -06:00
Joseph Huber	84d0243d21	[Libomptarget] Clean up CUDA plugin CMake files Clean up this file after changing it in D142568. Depends on D142568 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D142573	2023-01-25 13:58:02 -06:00
Joseph Huber	c568622046	[Libomptarget] Remove find_package(CUDA) as it has been deprecated Since D137724 and the LLVM 17 release we have updated to CMake version 3.20. This means that `find_package(CUDA)` is officially deprecated and can be replaced with `find_package(CUDAToolkit)` instead. This patch does this and also cleans up a bit of the CMake. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D142568	2023-01-25 13:58:01 -06:00
Tom Stellard	603c286334	Bump the trunk major version to 17	2023-01-24 22:57:27 -08:00
Shilei Tian	5ba8ecb6cc	[Clang][OpenMP] Find the type `omp_allocator_handle_t` from identifier table In Clang, in order to determine the type of `omp_allocator_handle_t`, Clang checks the type of those predefined allocators. The first one it checks is `omp_null_allocator`. If the language is C, and the system is 64-bit, what Clang gets is an `int`, instead of an enum of size 8, given how we define `omp_allocator_handle_t` in `omp.h`. If the allocator is captured by a region, let's say a parallel region, the allocator will be privatized. Because Clang deems `omp_allocator_handle_t` as an `int`, it will first cast the value returned by the runtime library (for `libomp` it is a `void *`) to `int`, and then in the outlined function, it casts back to `omp_allocator_handle_t`. These two casts completely shave off the upper 32 bits of the pointer value returned from `libomp`, and when the private "new" pointer is fed to another runtime function `__kmpc_allocate()`, it causes a segmentation fault. That is the root cause of PR54082. I have no idea why `-fno-pic` could hide this bug. In this patch, we detect `omp_allocator_handle_t` using roughly the same method as `omp_event_handle_t`, by looking it up in the identifier table. Fix #54082. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D142297	2023-01-24 22:49:05 -05:00
Shilei Tian	dafebd5b5a	[OpenMP] Create a temp file in /tmp if /dev/shm is not accessible When `libomp` is initialized, it creates a temp file in `/dev/shm` to store the registration flag. Some systems, like Android, don't have `/dev/shm`, so this feature is disabled there by the macro `KMP_USE_SHM`, though most Linux distributions have it. However, some customized distributions, such as the one reported in https://github.com/llvm/llvm-project/issues/53955, don't support it either, which causes a core dump. In this patch, if that is the case, we try to create a temporary file in `/tmp`, and if that also fails, we error out. Note that this patch does not consider whether the temporary directory has been set via `TMPDIR`. If `/tmp` is not accessible, we error out. Fix #53955. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142175	2023-01-24 21:45:38 -05:00
Kevin Sala	2a539ee17d	[OpenMP][libomptarget] Implement memory lock/unlock API in NextGen plugins This patch implements the memory lock/unlock API, introduced in patch https://reviews.llvm.org/D139208, in the NextGen plugins. Locked buffers feature reference counting and we allow certain overlapping. Given an already locked buffer A, other buffers that are fully contained inside A can be locked again, even if they are smaller than A. In this case, the reference count of locked buffer A will be incremented. However, extending an existing locked buffer is not allowed. The original buffer is actually unlocked once all its users have released the locked buffer and sub-buffers (i.e., the reference counter becomes zero). Differential Revision: https://reviews.llvm.org/D141227	2023-01-25 00:11:38 +01:00
Joseph Huber	5d1dc9fa04	[OpenMP] Do not link the bitcode OpenMP runtime when targeting AMDGPU. The AMDGPU target can only emit LLVM-IR, so we can always rely on LTO to link the static version of the runtime optimally. Using the static library only has a few advantages. Namely, it avoids several known bugs and allows us to optimize out more functions. This is legal since the changes in D142486 and D142484 Depends on D142486 D142484 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142491	2023-01-24 17:01:37 -06:00
Giorgis Georgakoudis	4b88bf5c70	[OpenMP][docs] Update for record-and-replay Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142492	2023-01-24 14:36:37 -08:00
Shilei Tian	7e89420116	[OpenMP] Disable tests that are not supported by GCC if it is used for testing GCC doesn't support `-fopenmp-version`, causing test failures if the compiler used for testing is GCC. GCC's OpenMP 5.2 support is still very limited. Disable the tests requiring 5.2 features for GCC as well. We might want to take a look at all `libomp` tests and mark those tests that don't support GCC yet. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D142173	2023-01-24 17:00:15 -05:00
Johannes Doerfert	62bc222875	[OpenMP][NFC] Augment release notes	2023-01-24 13:23:15 -08:00
Kevin Sala	9dea83d4af	[OpenMP][Doc] Update release notes with NextGen plugins	2023-01-24 22:15:49 +01:00
Guilherme Valarini	7cf63ee80c	[OpenMP][Docs] Add non-blocking target nowait environment variables	2023-01-24 16:30:34 -03:00
Shilei Tian	31c95e5a4d	[OpenMP][Doc] Update release note for 16 release	2023-01-24 14:04:28 -05:00
Joseph Huber	c9c5a076b3	[OpenMP][Docs] Add some release notes for OpenMP	2023-01-24 12:35:58 -06:00
Slava Zakharin	8743e1e369	Revert "[OpenMP][Archer] Use dlsym rather than weak symbols for TSan annotations" OpenMP buildbots are failing: https://lab.llvm.org/buildbot/#/builders/193/builds/25434 https://lab.llvm.org/buildbot/#/builders/193/builds/25420 This reverts commit 7fbf12210007a66f7b62beadc0e5a52561cc0ab3.	2023-01-24 10:17:35 -08:00
Joachim Protze	7fbf122100	[OpenMP][Archer] Use dlsym rather than weak symbols for TSan annotations This patch fixes issues reported for Ubuntu and possibly other platforms: https://github.com/llvm/llvm-project/issues/45290 The latest comment on this issue points out that using dlsym rather than the weak symbol approach to call TSan annotation functions fixes the issue for Ubuntu. Differential Revision: https://reviews.llvm.org/D142378	2023-01-24 15:14:51 +01:00
Johannes Doerfert	5d9cb20f40	[OpenMP] Run the Attributor as part of the device runtime optimization This will help us propagate assumptions to call sites, among other things.	2023-01-23 22:45:47 -08:00
Joseph Huber	2a8c9d7c8a	[Libomptarget] Use the nextgen plugins by default. The next-gen plugins are complete drop-in replacements for the old versions. We should strive to replace the old ones as quickly as possible now that we have a viable alternative. The only test failing is the `prelock.cpp` test as the support has not landed in the next-gen plugins. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D142399	2023-01-23 17:30:46 -06:00
Scott Linder	25c0ea2a53	[NFC] Consolidate llvm::CodeGenOpt::Level handling Add free functions llvm::CodeGenOpt::{getLevel,getID,parseLevel} to provide common implementations for functionality that has been duplicated in many places across the codebase. Differential Revision: https://reviews.llvm.org/D141968	2023-01-23 22:50:49 +00:00
Martin Storsjö	c3737a6522	[docs] Add release notes for news in 16.x done by me, or otherwise relating to MinGW targets Differential Revision: https://reviews.llvm.org/D142346	2023-01-23 22:12:32 +02:00
Joseph Huber	b280e12a3d	[Libomptarget][NFC] Address a few warnings in libomptarget Summary: Fix a few minor warnings that show up in `libomptarget`.	2023-01-23 08:56:03 -06:00
Joseph Huber	716bae0b48	[Libomptarget] Include "hsa/hsa.h" instead Summary: Recently AMD moved the "hsa.h" include to "hsa/hsa.h". This causes several warnings. This patch checks to see if we can include that one instead. This should hopefully keep things backwards compatible while silencing the warnings.	2023-01-23 08:56:03 -06:00
Joseph Huber	11908c20cd	[Libomptarget][NFC] Silence unknown CUDA version warnings Summary: These warnings are very loud considering they get repeated at least 30 times each build. This patch just silences them.	2023-01-23 08:56:03 -06:00
Shilei Tian	693358d787	[OpenMP][DeviceRTL][NFC] Use `OMPTgtExecModeFlags` from `llvm/include/llvm/Frontend/OpenMP/OMPDeviceConstants.h` This patch makes preparation for a series that will enable per-kernel information used in both host and device runtime. Some variables/enums, such as `OMPTgtExecModeFlags`, have to be shared by both of them. A new header `OMPDeviceConstants.h` is added, containing code that will be shared by them. We will introduce more variables soon. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142320	2023-01-22 19:10:54 -05:00
Johannes Doerfert	e68313f100	[OpenMP][FIX] Use thread id not team id for masked section	2023-01-22 15:45:00 -08:00
Johannes Doerfert	c175c07d90	[OpenMP][FIX] Split test into amdgpu and nvptx specific ones This avoids running the test for the host.	2023-01-21 20:12:04 -08:00
Johannes Doerfert	40f9bf082f	[OpenMP] Introduce the `ompx_dyn_cgroup_mem(<N>)` clause Dynamic memory allows users to allocate fast shared memory when a kernel is launched. We support a single size for all kernels via the `LIBOMPTARGET_SHARED_MEMORY_SIZE` environment variable but now we can control it per kernel invocation, hence allow computed values. Note: Only the nextgen plugins will allocate memory based on the clause, the old plugins will silently miscompile. Differential Revision: https://reviews.llvm.org/D141233	2023-01-21 18:46:36 -08:00
Johannes Doerfert	3820d0eaaf	[OpenMP][FIX] Runtime args are not kernel args Clang passes `KernelArgs.NumArgs` to the runtime but not all are kernel arguments. This ensures we fallback to the old logic. In a follow up we should introduce a new `KernelArgs.NumKernelArgs` field and set it in the runtime.	2023-01-21 13:43:10 -08:00
Johannes Doerfert	16a385ba21	[OpenMP] Modernize the kernel launching interface and APIs We already created a versioned `__tgt_kernel_arguments` struct but it was only briefly used and its content was passed in isolation anyway. This makes it hard to add more information in the future. With this patch we fully embrace the struct as means to pass information from the compiler to the plugin as part of a kernel launch. The patch also extends and renames the struct, bumping the version number to 2. Version 1 entries are auto-upgraded. This is in preparation for "bare" kernel launches, per kernel dynamic shared memory, CUDA/HIP lowering, etc. The `__tgt_target_kernel_nowait` interface was deprecated as it was unused. Once we actually implement support for something like that, we can add an appropriate API. Note: Only plugins with the `launch_kernel` interface are now supported. That means that a new clang won't be able to use an old runtime. An old clang can still use the new runtime since the libomptarget interface did not change. Differential Revision: https://reviews.llvm.org/D141232	2023-01-21 11:16:21 -08:00
Jon Chesterfield	2257e3d2e5	[openmp] Workaround for HSA in issue 60119 Move plugin initialization to libomptarget initialization. Removes the call_once control, probably fractionally faster overall. Fixes issue 60119 because the plugin initialization, which might try to dlopen unrelated shared libraries, is no longer nested within a call from application code. Fixes #60119 Reviewed By: Maetveis, jhuber6 Differential Revision: https://reviews.llvm.org/D142249	2023-01-21 12:01:14 +00:00
Joseph Huber	c7af1d19f3	[OpenMP] Remove unfinished and unused 'Analyzer' tool Summary: This patch removes a tool that was never finished and has no plans of being picked up again. It does not need to live in LLVM source in an unusable state.	2023-01-20 17:34:26 -06:00
Terry Wilmarth	4c58e5a28f	[OpenMP] Fix for distributed barrier. Distributed barrier was found to cause hangs in some test cases. Found that a section updating the barrier size was improperly shifted to a different code section during patching. Restored to original location, all tests run to completion. Differential Revision: https://reviews.llvm.org/D141618	2023-01-20 13:54:25 -06:00
Shilei Tian	50d2a193a7	[OpenMP] Only test kmp_atomic_float10_max_min.c on X86 The test `openmp/runtime/test/atomic/kmp_atomic_float10_max_min.c` uses a compiler flag `-mlong-double-80` that might not be supported by all targets. Currently it requires `x86-registered-target`, but that requirement can be true when LLVM supports X86 while the actual `libomp` arch is not X86. For example, when LLVM is built on AArch64 with all targets enabled, `x86-registered-target` can be met. If `libomp` is built with native target, aka. AArch64, the test will still be enabled, causing test failure. This patch only enables the test if the actual target is X86. The actual target is determined by `LIBOMP_ARCH`. Fix #53696. Reviewed By: jlpeyton Differential Revision: https://reviews.llvm.org/D142172	2023-01-20 10:52:53 -05:00
Kevin Sala	097f42602d	[OpenMP][libomptarget] Fix deinit of NextGen AMDGPU plugin This patch fixes a segfault that was appearing when the plugin fails to initialize and then is deinitialized. Also, do not call hsa_shut_down if the hsa_init failed. Differential Revision: https://reviews.llvm.org/D142145	2023-01-20 13:17:32 +01:00
Nikita Popov	1b4fdf18bc	[libomp] Explicitly include <string> header (NFC) This is required to build against libstdc++ 13. Debug.h uses std::stoi() from <string> without explicitly including it.	2023-01-20 10:39:27 +01:00
Ye Luo	9fecd58e5e	[OpenMP] Build device runtimes for sm_89 and sm_90	2023-01-19 15:39:05 -06:00
Gilles Gouaillardet	3a362a9f38	[OpenMP][libomp] Insert correct HWLOC version guards Put needed HWLOC version guards around relevant HWLOC API. Tested OpenMP host runtime build with HWLOC 1.11.13, 2.0-2.9. Differential Revision: https://reviews.llvm.org/D142152 Fix #54951	2023-01-19 14:30:43 -06:00
Guilherme Valarini	e0b3b6cec7	[OpenMP][Fix] Track all threads that may delete an entry The entries inside a "target data end" are processed in three steps: 1. Query internal data maps for the entries and dispatch any necessary device-side operations (i.e., data retrieval); 2. Synchronize such operations; 3. Update the host-side pointers and remove any entry whose reference counter reached zero. Such steps may be executed by multiple threads which may even operate on the same entries. The current implementation (D121058) tries to synchronize these threads by tracking the "owner" for the deletion of each entry using their thread ID. Unfortunately it may fail to do so for the following reasons: 1. The owner is always assigned at the first step only if the reference count is 0 when the map is queried. This does not work when such an owner thread is faster than a previous one that is also processing the same entry on another "target data end", leading to use-after-free problems. 2. The entry is only added for post-processing (step 3) if its reference count was 0 at query time (step 1). This does not allow threads to exchange responsibility for the deletion, leading again to use-after-free problems. 3. An entry may appear multiple times in the arguments array of a "target data end", which may lead to deleting the entry prematurely, leading, again, to use-after-free problems. This patch addresses these problems by tracking all the threads that are using an entry at a "target data end" region through a counter, ensuring only the last one deletes it when needed. It also ensures that all entries that are successfully found inside the data maps in step 1 are also processed in step 3, regardless of whether their reference count was zeroed at query time. This ensures the deletion ownership may be passed to any thread that is using such an entry. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D132676	2023-01-19 12:11:52 -03:00
Shilei Tian	97ae7d83e3	[OpenMP][OMPT] Expect failure from tool_available_search.c on macOS D91464 introduced verbose tool loading, but the test check only considers Linux. On macOS, the outputs are totally different, causing the regression afterwards. This patch simply sets the test to XFAIL on macOS. Fix #56833. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142045	2023-01-18 20:09:06 -05:00
Shilei Tian	3ff1726cf8	[OpenMP][AMDGPU] Get rid of redundant macro def The next gen plugin adds the def of `DEBUG_PREFIX` in CMake, causing compiler warning that `DEBUG_PREFIX` is defined multiple times. This patch simply guards the macro def. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142064	2023-01-18 20:08:18 -05:00
Shilei Tian	a4f246a83e	[OpenMP] Fix inconsistent task state if hot team is not used This patch fixes the inconsistent task state when hot team is not used. When the primary thread executes `__kmp_join_call`, it calls `__kmp_free_team`, where worker threads will get destroyed if not using hot team. Destroying the worker threads also resets their task state. However, the primary thread's task state is not reset. When the next parallel region is encountered, in `__kmp_task_team_sync`, the task state of each thread will be flipped. Since the state of the primary thread is not reset, it is still 1, but all the worker threads will be 0. This leads to an inconsistent task state, causing those threads to use completely different task teams. Fix #59190. Reviewed By: tlwilmar Differential Revision: https://reviews.llvm.org/D141979	2023-01-18 18:21:57 -05:00
Giorgis Georgakoudis	0f4b4e8e4d	[OpenMP] RecordReplay saves bitcode when JIT-ing This patch enables storing bitcode images when JIT is enabled for the record-and-replay functionality (see https://reviews.llvm.org/D138931). Credits to @jdoerfert for refactoring the code. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D141986	2023-01-18 11:25:25 -08:00
Shilei Tian	57662cb2e3	[OpenMP] Disable building `libomptarget` on 32-bit systems There are plenty of assumptions in `libomptarget` and the device runtime about the pointer size or `size_t`, etc. 32-bit systems are not supported. There is no point in reworking everything to make it portable. This patch simply disables building on 32-bit systems. Fix #60121. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142023	2023-01-18 13:40:08 -05:00
Jonathan Peyton	2aea0a9de0	[OpenMP][libomp] Switch Intel topology type values: module, tile According to Software Developer Manual, modules should be value 3 and tile should be value 4.	2023-01-18 12:11:43 -06:00
Jonathan Peyton	f311922535	[OpenMP][libomp] Fix stats-gathering for new MSVC sections API Differential Revision: https://reviews.llvm.org/D139867	2023-01-18 11:59:12 -06:00
Joseph Huber	ea0eee80d8	[Libomptarget] Only build GPU tests if a GPU is found on the system Currently we build tests as long as the libraries are found on the machine. This doesn't necessarily mean there is a GPU to use though. This patch changes it so that we only build the tests if we find a compatible GPU via `nvptx-arch` or `amdgpu-arch`. The only downside to this I could see is if someone were to build LLVM on a home node of a cluster and then wished to run the tests after switching to a compute node. For this I think we should allow it to be overridden. I think that's better than allowing us to run tests that will fail by default. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D142018	2023-01-18 10:35:37 -06:00
Giorgis Georgakoudis	94c772dc92	[OpenMP] Support kernel record and replay This patch adds functionality for recording and replaying the execution of OpenMP offload kernels, based on an original implementation by Steve Rangel. The patch extends libomptarget to extract a json description of the kernel, the device image binary, and a device memory snapshot before and after the execution of a recorded kernel. Kernel recording/replaying in libomptarget is controlled through env vars (LIBOMPTARGET_RECORD, LIBOMPTARGET_REPLAY). It provides a tool, llvm-omp-kernel-replay, for replaying a kernel using the extracted information with the ability to verify replayed execution using the post-execution device memory snapshot, also supporting changing the number of teams/threads for replaying. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D138931	2023-01-17 16:29:03 -08:00
Joseph Huber	566ecc2231	[Libomptarget][NFC] Rename device environment variable This variable is used by the runtime. Before kernel launch we set it to indicate several configuration options from the host. This patch renames it to be more in line with the rest of the names exported from the runtime. This is better because this is the only symbol visible to the host from the runtime, so it should have a reserved name. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D141960	2023-01-17 14:28:04 -06:00
Joseph Huber	83af411ca7	[Libomptarget] Replace Nvidia arch lookup with 'nvptx-arch' This method to look up the CUDA architecture is deprecated in newer versions of CMake. We also have our own way to query this information that we control now via the `nvptx-arch` program, which should always be present in LLVM builds with clang going forward. This is currently only used for testing so I think we should be okay with the dependency. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D141933	2023-01-17 12:38:34 -06:00

2633 Commits