3457 Commits

Author SHA1 Message Date
Fangrui Song
9936ac3083
[docs] Prefer --gcc-install-dir= to deprecated GCC_INSTALL_PREFIX (#85458)
Setting GCC_INSTALL_PREFIX leads to a warning (#77537).

Link:
https://discourse.llvm.org/t/add-gcc-install-dir-deprecate-gcc-toolchain-and-remove-gcc-install-prefix/65091
Link:
https://discourse.llvm.org/t/correct-cmake-parameters-for-building-clang-and-lld-for-riscv/72833
2024-03-18 13:11:44 -07:00
nicebert
20f5bcfb1a
[OpenMP] Add OpenMP extension API to dump mapping tables (#85381)
This adds an API call ompx_dump_mapping_tables.
This allows users to debug the mapping tables and can be especially
useful for unified shared memory applications to check if the code
behaves in the way it should. The implementation reuses code already
present to dump mapping tables (in a debug setting).

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-03-18 14:09:20 -05:00
David CARLIER
6d3cec01a6
Revert "[openmp] __kmp_x86_cpuid fix for i386/PIC builds." (#85526)
Reverts llvm/llvm-project#84626
2024-03-16 13:41:12 +00:00
Joseph Huber
470040bd4d [Libomptarget][NFC] Remove warning on return value const 2024-03-15 18:50:33 -05:00
Felipe Cabarcas
0e21672d99
Fixing LIBOMPTARGET_INFO message, for Copying data from device to host (#85444)
When running an OpenMP offloading application with LIBOMPTARGET_INFO=-1,
the addresses in the Copying data from **device** to **host** message are
swapped.
As an example, the addresses are currently printed as
```
omptarget device 0 info: Mapping exists with HstPtrBegin=0x00007ffddaf0fb90, TgtPtrBegin=0x00007fb385404000, Size=8000, DynRefCount=0 (decremented, delayed deletion), HoldRefCount=0
omptarget device 0 info: Copying data from device to host, TgtPtr=0x00007ffddaf0fb90, HstPtr=0x00007fb385404000, Size=8000, Name=d
```
And it should be
```
omptarget device 0 info: Copying data from device to host, TgtPtr=0x00007fb385404000, HstPtr=0x00007ffddaf0fb90, Size=8000, Name=d
```

---------

Co-authored-by: fel-cab <fel-cab@github.com>
2024-03-15 15:25:14 -04:00
Ulrich Weigand
c9062e8f78 Reapply [libomptarget] Build plugins-nextgen for SystemZ (#83978)
The plugin was not getting built as the build_generic_elf64 macro
assumes the LLVM triple processor name matches the CMake processor name,
which is unfortunately not the case for SystemZ.

Fix this by providing two separate arguments instead.

Actually building the plugin exposed a number of other issues causing
various test failures. Specifically, I've had to add the SystemZ target
to
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
- LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt
- a check_plugin_target call in libomptarget/src/CMakeLists.txt

Finally, I've had to set a number of test cases to UNSUPPORTED on
s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED
for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on
s390x for what seems to be the same reason.

In addition, this also requires support for BE ELF files in
plugins-nextgen: https://github.com/llvm/llvm-project/pull/85246
2024-03-15 19:06:43 +01:00
Ulrich Weigand
2210c85a66 Reapply [libomptarget] Support BE ELF files in plugins-nextgen (#85246)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-15 18:28:28 +01:00
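As a rough illustration of the format distinction the commit above deals with, the sketch below classifies an in-memory image as 64-bit little-endian or 64-bit big-endian ELF by inspecting the identification bytes defined by the ELF specification. The names `ElfVariant` and `classifyElf64` are hypothetical and are not part of LLVM or plugins-nextgen; the real code works through `ObjectFile` and `ELFSymbolRef` as described in the commit message.

```cpp
#include <cstddef>

// Hypothetical result type for this sketch (not an LLVM type).
enum class ElfVariant { NotElf64, Elf64LE, Elf64BE };

// Offsets and sizes within e_ident, as defined by the ELF specification.
constexpr std::size_t kIdentSize = 16; // EI_NIDENT
constexpr std::size_t kClassIndex = 4; // EI_CLASS
constexpr std::size_t kDataIndex = 5;  // EI_DATA

// Classifies a raw image by its ELF identification bytes only.
// Hypothetical helper: it does not validate the rest of the header.
constexpr ElfVariant classifyElf64(const char *Image, std::size_t Size) {
  if (Size < kIdentSize)
    return ElfVariant::NotElf64;
  // Magic number: 0x7f 'E' 'L' 'F'.
  if (Image[0] != '\x7f' || Image[1] != 'E' || Image[2] != 'L' ||
      Image[3] != 'F')
    return ElfVariant::NotElf64;
  if (Image[kClassIndex] != 2) // 2 == ELFCLASS64
    return ElfVariant::NotElf64;
  if (Image[kDataIndex] == 1) // 1 == ELFDATA2LSB
    return ElfVariant::Elf64LE;
  if (Image[kDataIndex] == 2) // 2 == ELFDATA2MSB
    return ElfVariant::Elf64BE;
  return ElfVariant::NotElf64;
}
```

A caller could switch on the returned variant to pick the matching template instantiation, which is the general shape of "templating over multiple ELF format variants".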
Ulrich Weigand
4c8714efc5 Revert "[libomptarget] Support BE ELF files in plugins-nextgen (#85246)"
This reverts commit 611c62b30d160375b46b7afedc04965ee6f67d1a.
2024-03-14 18:38:13 +01:00
Ulrich Weigand
611c62b30d
[libomptarget] Support BE ELF files in plugins-nextgen (#85246)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-14 18:19:12 +01:00
Andrew Brown
d83660827f
[openmp][wasm] Fix microtask type mismatch (#84355)
When OpenMP is compiled for WebAssembly (see #71297), it invokes a
microtask via a `switch` statement that dispatches to the `void *`
microtask pointer with spelled-out arguments (not varargs). As #83329
points out, however, this can result in a type mismatch when the
indirect call is executed by WebAssembly; WebAssembly expects the called
pointer to have the precise type of the call site. This change fixes the
issue by bringing back the approach in [D142593] of type-casting all the
`switch` arms to the precise type. This fixes #83329.

[D142593]: https://reviews.llvm.org/D142593
2024-03-14 10:23:44 -05:00
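The dispatch pattern described in the commit above can be sketched as follows. This is a simplified illustration, not the runtime's code: `opaque_microtask_t`, `invokeMicrotask`, `storeSum`, and `demoDispatch` are hypothetical names, only zero to two arguments are handled, and a generic function-pointer type stands in for the `void *` microtask pointer mentioned in the message.

```cpp
#include <cassert>

// Opaque function-pointer type standing in for the runtime's microtask
// pointer (hypothetical name for this sketch).
using opaque_microtask_t = void (*)();

// Invokes `fn` with pointers to the two thread ids plus `argc` extra
// arguments. Every arm casts to the exact signature of the callee, so the
// type at the indirect call site matches the callee's real type.
// Returns false if `argc` is outside the range this sketch supports.
inline bool invokeMicrotask(opaque_microtask_t fn, int gtid, int tid,
                            int argc, void **argv) {
  assert(fn != nullptr);
  switch (argc) {
  case 0:
    reinterpret_cast<void (*)(int *, int *)>(fn)(&gtid, &tid);
    return true;
  case 1:
    reinterpret_cast<void (*)(int *, int *, void *)>(fn)(&gtid, &tid,
                                                         argv[0]);
    return true;
  case 2:
    reinterpret_cast<void (*)(int *, int *, void *, void *)>(fn)(
        &gtid, &tid, argv[0], argv[1]);
    return true;
  default:
    return false;
  }
}

// Example microtask taking one argument: stores gtid + tid into *out.
inline void storeSum(int *gtid, int *tid, void *out) {
  *static_cast<int *>(out) = *gtid + *tid;
}

// Usage: dispatches storeSum with gtid = 3 and tid = 4.
// Returns the stored sum, or -1 if the dispatch was rejected.
inline int demoDispatch() {
  int out = 0;
  void *args[] = {&out};
  bool ok = invokeMicrotask(reinterpret_cast<opaque_microtask_t>(&storeSum),
                            3, 4, 1, args);
  return ok ? out : -1;
}
```

On native targets a mismatched call through a loosely typed pointer often happens to work, which is why the problem only surfaced on WebAssembly, where the call-site type is checked when the indirect call executes.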
MessyHack
ea848d0a6d
[OpenMP] Sort topology after adding processor group layer. (#83943)
Various behavior around creating affinity masks and detecting uniform
topology depends on the topology being sorted.

Re-sort the topology after adding the processor group layer to ensure that the
updated topology reflects the newly added processor group info.

Observed that the topology was not sorted correctly on high core count
AMD Epyc Genoa (2 sockets, 96 cores, 2 threads) using NUMA (NPS 2+).
2024-03-13 16:22:23 -05:00
Joseph Huber
cd8843f87a
[OpenMP] Disable flaky barrier fence test (#85093)
Summary:
This test is flaky on all targets I know of. We should disable it for
now so running the test suite doesn't randomly fail 50% of the time.
2024-03-13 15:05:22 -05:00
Jonathan Peyton
6272500e0b
[OpenMP] Remove unused logical/physical CPUID information (#83298) 2024-03-12 11:37:01 -07:00
Jonathan Peyton
3303be63fc
[OpenMP] Make sure mask is set to nullptr (#83299) 2024-03-12 11:36:43 -07:00
Jonathan Peyton
f5334f5da5
[OpenMP] Add debug checks for divide by zero (#83300) 2024-03-12 11:36:19 -07:00
Joseph Huber
9f69d3cf88
[Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (#84928)
Summary:
We are currently taking the lower 5 bits of the thread ID as the lane
ID. This doesn't work in non-1D grids and is also slower than just using
the dedicated hardware register.
2024-03-12 10:39:40 -05:00
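To see why masking the thread ID breaks down outside 1D launches, the host-side arithmetic below compares two ways of deriving a lane. It is only a sketch: it assumes that "thread ID" in the commit above means the x component, that warps are formed from the linearized thread index with x varying fastest, and that the warp size is 32. `Dim3`, `laneFromX`, and `laneFromLinear` are hypothetical names, and none of this is DeviceRTL code.

```cpp
#include <cstdint>

// Hypothetical 3-component index, mirroring a GPU thread or block dimension.
struct Dim3 {
  std::uint32_t x, y, z;
};

constexpr std::uint32_t kWarpSize = 32; // assumed warp size

// Masking approach: keep the lower 5 bits of the x component only.
constexpr std::uint32_t laneFromX(Dim3 tid) {
  return tid.x & (kWarpSize - 1);
}

// Lane derived from the linearized thread index within the block,
// with x varying fastest, then y, then z.
constexpr std::uint32_t laneFromLinear(Dim3 tid, Dim3 block) {
  return (tid.x + tid.y * block.x + tid.z * block.x * block.y) % kWarpSize;
}
```

In a 1D block of 128 threads both functions agree (thread 70 maps to lane 6). In an 8x8 block, thread (3, 5, 0) has linear index 43 and therefore lane 11, while masking x alone yields 3.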
Jonathan Peyton
9b1c496898
[OpenMP] Fixup while loops to avoid bad NULL check (#83302) 2024-03-11 10:28:12 -05:00
Jonathan Peyton
de4d7015d0
[OpenMP] Remove unnecessary check of ap (#83303) 2024-03-11 10:27:53 -05:00
Jonathan Peyton
1ed463d961
[OpenMP] Make sure ptr is used after NULL check (#83304) 2024-03-11 10:27:31 -05:00
Jonathan Peyton
b4e39ad117
[OpenMP] Remove dead code of checking int > INT_MAX (#83305) 2024-03-11 10:26:53 -05:00
David CARLIER
facb89ae12
[openmp] __kmp_x86_cpuid fix for i386/PIC builds. (#84626) 2024-03-11 13:15:43 +00:00
Joseph Huber
8a79003307 [libc] Move RPC opcodes include out of the header
Summary:
This header isn't strictly necessary, and is currently broken because we
install these to separate locations.
2024-03-10 14:07:47 -05:00
David CARLIER
fa4cc39255
[openmp] adding affinity support to DragonFlyBSD. (#84672) 2024-03-10 09:56:55 +00:00
Vadim Paretsky
110141b378
[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540)
MSVC does not define __BYTE_ORDER__ making the check for BigEndian
erroneously evaluate to true and breaking the struct definitions in MSVC
compiled builds correspondingly. The fix adds an additional check for
whether __BYTE_ORDER__ is defined by the compiler to fix these.

---------

Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
2024-03-09 10:47:31 -08:00
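The failure mode in the commit above follows from a preprocessor rule: identifiers that are not defined as macros evaluate to 0 inside `#if`, so an unguarded equality between two undefined macros is true. The sketch below reproduces this with two deliberately undefined, hypothetical macro names (`HYPOTHETICAL_BYTE_ORDER` and `HYPOTHETICAL_ORDER_BIG_ENDIAN`) so that it behaves the same on any compiler; the actual headers test `__BYTE_ORDER__`.

```cpp
// Neither macro used below is defined anywhere, mimicking a compiler that
// does not provide __BYTE_ORDER__.

// Unguarded check: both sides evaluate to 0, so the comparison holds and
// the build is wrongly treated as big-endian.
#if HYPOTHETICAL_BYTE_ORDER == HYPOTHETICAL_ORDER_BIG_ENDIAN
constexpr bool kUnguardedSaysBigEndian = true;
#else
constexpr bool kUnguardedSaysBigEndian = false;
#endif

// Guarded check: the comparison is only trusted if the macro exists.
#if defined(HYPOTHETICAL_BYTE_ORDER) &&                                       \
    HYPOTHETICAL_BYTE_ORDER == HYPOTHETICAL_ORDER_BIG_ENDIAN
constexpr bool kGuardedSaysBigEndian = true;
#else
constexpr bool kGuardedSaysBigEndian = false;
#endif
```

With the `defined(...)` guard in place, a compiler that lacks the macro falls through to the little-endian branch instead of silently selecting the big-endian struct layout.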
David CARLIER
11cd2a33f1
[openmp] porting affinity feature to netbsd. (#84618)
NetBSD supports the portable hwloc layer as well. On hardware with
4 CPUs, a CPU set is 4 and maxcpus is 256.
2024-03-09 11:45:07 +00:00
David CARLIER
05280b582a
[OpenMP] Implements __kmp_is_address_mapped for Solaris/Illumos. (#82930)
Also fixing OpenMP build itself for this platform.
2024-03-08 20:34:43 +00:00
vadikp-intel
fcd2d48325
[OpenMP] runtime support for efficient partitioning of collapsed triangular loops (#83939)
This PR adds OMP runtime support for more efficient partitioning of
certain types of collapsed loops that can be used by compilers that
support loop collapsing (e.g. MSVC) to achieve better thread load
balancing.

In particular, this PR addresses double nested upper and lower isosceles
triangular loops of the following types

1. lower triangular 'less_than'
   for (int i=0; i<N; i++)
     for (int j=0; j<i; j++)
2. lower triangular 'less_than_equal'
   for (int i=0; i<N; i++)
     for (int j=0; j<=i; j++)
3. upper triangular
   for (int i=0; i<N; i++)
     for (int j=i; j<N; j++)

Includes tests for the three supported loop types.

---------

Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
2024-03-07 16:28:03 -08:00
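One way to picture balanced partitioning of a collapsed triangular loop is to number the iterations of the lower triangular 'less_than' nest consecutively, give each thread a contiguous range of those numbers, and map a number back to its (i, j) pair. The sketch below does this with exact integer arithmetic. It is an illustration under those assumptions, not the algorithm the runtime implements; `Iteration`, `iterationsBeforeRow`, `unflatten`, and `chunkBegin` are hypothetical names.

```cpp
#include <cstdint>

// One (i, j) pair of: for (i = 0; i < N; i++) for (j = 0; j < i; j++)
struct Iteration {
  std::int64_t i;
  std::int64_t j;
};

// Number of iterations executed by rows 0 .. i-1 (row r runs r iterations).
constexpr std::int64_t iterationsBeforeRow(std::int64_t i) {
  return i * (i - 1) / 2;
}

// Maps a flattened iteration number k >= 0 back to its (i, j) pair by
// searching for the largest row i with iterationsBeforeRow(i) <= k.
constexpr Iteration unflatten(std::int64_t k) {
  std::int64_t lo = 1, hi = 2;
  while (iterationsBeforeRow(hi) <= k)
    hi *= 2; // grow until row hi starts beyond k
  // Invariant: iterationsBeforeRow(lo) <= k < iterationsBeforeRow(hi).
  while (hi - lo > 1) {
    std::int64_t mid = lo + (hi - lo) / 2;
    if (iterationsBeforeRow(mid) <= k)
      lo = mid;
    else
      hi = mid;
  }
  return Iteration{lo, k - iterationsBeforeRow(lo)};
}

// First flattened iteration owned by thread `tid` when `total` iterations
// are split as evenly as possible across `nthreads` threads.
constexpr std::int64_t chunkBegin(std::int64_t total, std::int64_t nthreads,
                                  std::int64_t tid) {
  return total * tid / nthreads;
}
```

For N = 5 there are 10 iterations, and four threads start at flattened indices 0, 2, 5, and 7, so each runs two or three iterations. Splitting only the outer loop by rows would instead hand out rows with 0 to 4 iterations each.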
Gheorghe-Teodor Bercea
5c752df1e1
[libomptarget][nextgen-plugin][NFC] Clean-up InputSignal checks (#83458)
Clean-up InputSignal checks.
2024-03-07 12:01:42 -05:00
Anchu Rajendran S
c03fd37d9b
[flang] Changes to map variables in link clause of declare target (#83643)
As per the OpenMP standard, "If a variable appears in a link clause on a
declare target directive that does not have a device_type clause with
the nohost device-type-description then it is treated as if it had
appeared in a map clause with a map-type of tofrom" is an implicit
mapping rule. Before this change, such variables were mapped as to by
default.
2024-03-07 08:23:58 -08:00
Ulrich Weigand
fb7cc73975 Revert "[libomptarget] Support BE ELF files in plugins-nextgen (#83976)"
This reverts commit 15b7b3182cc28f4f0b950bd73d931caa27b833ec.
2024-03-06 21:37:45 +01:00
Ulrich Weigand
70677c81de Revert "[libomptarget] Build plugins-nextgen for SystemZ (#83978)"
This reverts commit 3ecd38c8e1d34b1e4639a1de9f0cb56c7957cbd2.
2024-03-06 21:37:43 +01:00
Ulrich Weigand
d4f4f80236 Revert "[libomptarget] Fix CUDA plugin build regression"
This reverts commit b64482e23eefaef7738fde35d0b7c4174aaa6597.
2024-03-06 21:37:35 +01:00
Ulrich Weigand
b64482e23e [libomptarget] Fix CUDA plugin build regression
After 3ecd38c8e, the Handler.getELFObjectFile routine is no
longer available.  Call ELF64LEObjectFile::create directly,
which should always be suitable for CUDA images.
2024-03-06 21:00:25 +01:00
Ulrich Weigand
3ecd38c8e1
[libomptarget] Build plugins-nextgen for SystemZ (#83978)
The plugin was not getting built as the build_generic_elf64 macro
assumes the LLVM triple processor name matches the CMake processor name,
which is unfortunately not the case for SystemZ.

Fix this by providing two separate arguments instead.

Actually building the plugin exposed a number of other issues causing
various test failures. Specifically, I've had to add the SystemZ target
to
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
- LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt
- a check_plugin_target call in libomptarget/src/CMakeLists.txt

Finally, I've had to set a number of test cases to UNSUPPORTED on
s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED
for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on
s390x for what seems to be the same reason.

In addition, this also requires support for BE ELF files in
plugins-nextgen: https://github.com/llvm/llvm-project/pull/83976
2024-03-06 20:50:01 +01:00
Ulrich Weigand
15b7b3182c
[libomptarget] Support BE ELF files in plugins-nextgen (#83976)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to push all ELF specific
types into Utils/ELF. Specifically, this patch removes both the
getSymbol and getSymbolAddress routines and replaces them with a
single findSymbolInImage, which gets a StringRef identifying the
raw object file image as input, and returns a StringRef covering
the data addressed by the symbol (address and size) if found, or
std::nullopt otherwise.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-06 20:49:12 +01:00
Jonathan Schilling
20459ddc82
[openmp] Clarify error message if TSan is missing (#70916)
For an uninformed user, the error message might refer to a missing "TSan
stopping operation", rather than indicating that TSan is missing **and
therefore** operation is stopped.
2024-03-06 16:11:56 +05:30
Ye Luo
0fa04b6e2c [libomptarget] Fix libomptarget.rtl.amdgpu.so installation
If AMD GPUs don't exist when building libomptarget, the early return skips the plugin installation.
2024-03-05 21:32:48 -06:00
Daniel Martinez
aa6ebf9be1
Replace some C headers with C++ ones (#82697)
#81434

Replaced some C headers with C++ ones

Co-authored-by: Daniel Martinez <danielmartinez@cock.li>
2024-03-04 01:21:31 -05:00
Joseph Huber
2fb764d2da
[libomptarget] Fix 'libomptarget' libraries being installed twice (#83624)
Summary:
We use `add_llvm_library` as a shorthand for setting up all the
dependencies and libraries we need for the OpenMP offloading runtime as
they depend on a lot of the LLVM utilities. However, we always
explicitly installed these manually. Behind the scenes the function
would then install it again. This was unnoticed because until now the
destinations matched. Now that we want it to optionally go into the
other directory it is duplicating them. Fix this by stating that this is
a build tree only library so we can handle it ourselves.
2024-03-01 15:52:20 -06:00
Joseph Huber
1977404d20
[OpenMP] Respect LLVM per-target install directories (#83282)
Summary:
One recurring problem we have with the OpenMP libraries is that they
potentially conflict with ones found on the system. This occurs when
there are two copies and the one used for linking is not attached to
the corresponding clang compiler. LLVM already uses target-specific
directories for this, like with libc++, which are always searched first.
This patch changes the install directory to be
`lib/x86_64-unknown-linux-gnu`, for example.

Notable changes are that users may need to change their
LD_LIBRARY_PATH settings, or use the default rt-rpath options.
This should fix problems where users are linking the wrong versions of
static libraries.
2024-02-28 15:39:27 -06:00
Jonathan Peyton
0e0bee26e7
[OpenMP] Fix distributed barrier hang for OMP_WAIT_POLICY=passive (#83058)
The resume thread logic inside __kmp_free_team() is faulty. Only
checking b_go for sleep status doesn't wake up distributed barrier.
Change to generic check for th_sleep_loc and calling
__kmp_null_resume_wrapper().

Fixes: #80664
2024-02-27 14:15:48 -06:00
Joachim
822142ffdf
[OpenMP][OMPD] libompd must not link libomp (#83119)
Fixes a regression introduced in 91ccd8248.
The code for libompd includes kmp.h for enum kmp_sched. The dependency
to hwloc is not necessary. Avoid the dependency by skipping the
definitions in kmp.h using types from hwloc.h.

Fixes #80750
2024-02-27 16:24:55 +01:00
Michael Klemm
99335a646b
[flang][OpenMP] Add missing implementation for 'omp_is_initial_device' (#83056)
Resolve issue #82047
2024-02-26 13:17:33 -08:00
Xing Xue
a4dcfbcb78
[OpenMP][AIX] XFAIL capacity tests on AIX in 32-bit (#83014)
This patch XFAILs two capacity tests on AIX in 32-bit mode because they run
out of resources with `4 x omp_get_max_threads()`.
2024-02-26 13:13:05 -05:00
Michael Halkenhäuser
e521752c04
[OpenMP][OMPT] Add OMPT callback for device data exchange 'Device-to-Device' (#81991)
Since there's no `ompt_target_data_transfer_tofrom_device` (within the
ompt_target_data_op_t enum) or anything else that conveys the meaning
of inter-device data exchange, we decided to indicate a Device-to-Device
transfer by using: optype == ompt_target_data_transfer_from_device (=3)

Hence, a device transfer may be identified e.g. by checking for:
(optype == 3) &&
(src_device_num < omp_get_num_devices()) &&
(dest_device_num < omp_get_num_devices())

Fixes: #66478
2024-02-26 11:16:25 +01:00
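The check suggested in the commit above can be written as a small predicate. This is a sketch only: `isDeviceToDevice` and `kTransferFromDevice` are hypothetical names, and the device count is passed in as a parameter (in a tool it would come from `omp_get_num_devices()`) so the example stays self-contained.

```cpp
// Numeric value of ompt_target_data_transfer_from_device, as stated in the
// commit message (=3).
constexpr int kTransferFromDevice = 3;

// Returns true if a data-op callback with these arguments describes a
// Device-to-Device transfer under the convention from the commit message:
// the "from device" optype with both endpoints being offload devices.
constexpr bool isDeviceToDevice(int optype, int srcDeviceNum,
                                int destDeviceNum, int numDevices) {
  return optype == kTransferFromDevice && srcDeviceNum < numDevices &&
         destDeviceNum < numDevices;
}
```

With two devices, a transfer from device 0 to device 1 matches, while the same optype with destination number 2 does not, since that number is not below the device count.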
David CARLIER
9e7c0b1385
[OpenMP] Implement __kmp_is_address_mapped on DragonFlyBSD. (#82895)
implement internal __kmp_is_address_mapped.
2024-02-25 14:13:04 +00:00
Joseph Huber
1a2ecbb398
[libc] Remove 'llvm-gpu-none' directory from build (#82816)
Summary:
This directory is leftover from when we handled both AMDGPU and NVPTX in
the same build and merged them into a pseudo triple. Now the only thing
it contains is the RPC server header. This gets rid of it, but now that
it's in the base install directory we should make it clear that it's an
LLVM libc header.
2024-02-23 14:11:31 -06:00
Joseph Huber
87b4108211
[Libomptarget][NFC] Remove concept of optional plugin functions (#82681)
Summary:
Ever since the introduction of the new plugins we haven't exercised the
concept of "optional" plugin functions. This is done in preparation for
making the plugins use a static interface as it will greatly simplify
the implementation if we assert that every function has the entrypoints.
Currently some unsupported functions will just return failure or some
other default value, so this shouldn't change anything.
2024-02-22 16:49:21 -06:00
Joseph Huber
47b7c91abe
[libc] Rework the GPU build to be a regular target (#81921)
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.

This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.

The new approach is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
https://github.com/llvm/llvm-project/pull/81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.

The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.

The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.

We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (The offloading toolchain can handle it) but cannot
use it for the native toolchain which is used for making tests.

This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.

Certain utilities need to be built with
`--target=${LLVM_HOST_TRIPLE}` as well. I think this is a fine
workaround as we will always assume that the GPU `libc` is a
cross-build with a functioning host.

Depends on https://github.com/llvm/llvm-project/pull/81557
2024-02-22 15:29:29 -06:00
Daniel Martinez
45fe67dd61
Fix build on musl by including stdint.h (#81434)
openmp fails to build on musl since it lacks the defines for int32_t

Co-authored-by: Daniel Martinez <danielmartinez@cock.li>
2024-02-22 13:14:27 -08:00