3472 Commits

Author SHA1 Message Date
Joseph Huber
dcbddc2525
[Libomptarget] Unify and simplify plugin CMake (#86191)
Summary:
This patch reworks the CMake handling for building plugins. All this
does is pull a lot of shared and common logic into a single helper
function.
This also simplifies the OMPT libraries from being built separately
instead of just added.
2024-03-22 16:13:58 -05:00
Xing Xue
d394f3a162
[OpenMP][AIX] Affinity implementation for AIX (#84984)
This patch implements `affinity` for AIX, which is quite different from
platforms such as Linux.
- Setting CPU affinity through masks and related functions are not
supported. System call `bindprocessor()` is used to bind a thread to one
CPU per call.
- There are no system routines to get the affinity info of a thread. The
implementation of `get_system_affinity()` for AIX gets the mask of all
available CPUs, to be used as the full mask only.
- Topology is not available from the file system. It is obtained through
system SRAD (Scheduler Resource Allocation Domain).

This patch has run through the libomp LIT tests successfully with
`affinity` enabled.
2024-03-22 15:25:08 -04:00
agozillon
8612fa0d84
[MLIR][OpenMP] Refactor bounds offsetting and fix to apply to all directives (#84349)
This PR refactors bounds offsetting by combining the two differing
implementations (one applying to initial derived type member map
implementation for descriptors and the other for regular arrays,
effectively allocatable array vs regular array in fortran) now that it's
a little simpler to do.

The PR also moves the utilization of createAlteredByCaptureMap into
genMapInfoOp, where it will be correctly applied to all MapInfoData,
appropriately offsetting and altering Pointer data set in the kernel
argument structure on the host. This primarily means bounds offsets will
now correctly apply to enter/exit/update map clauses as opposed to just
the Target directive that is currently the case. A few fortran runtime
tests have been added to verify this new behavior.

This PR depends on: https://github.com/llvm/llvm-project/pull/84328 and
is an extraction of the larger derived type member map PR stack (so a
requirement for it to land).
2024-03-22 15:32:39 +01:00
Ulrich Weigand
cb071942f8 [OpenMP] Fix SystemZ build failure
Commit a7d5f73a03c81cab8df64dbd099e8acb40f5dfe1 introduced an
error in a target_compile_definitions on the SystemZ, causing
the build to break.  Fixed by adding the missing "PRIVATE".
2024-03-21 12:02:50 +01:00
Joseph Huber
a7d5f73a03
[Libomptarget] Consolidate CPU offloading into 'host' directory (#86014)
Summary:
All of these CPU targets use the same underlying implementation. We
should consolidate them into a single target to make it easier to update
this to a static library based approach. I have decided to call this the
'host' target so it can be given a single name. We still only build
these if the system processor matches and we are on Linux.
2024-03-20 17:41:53 -05:00
Gheorghe-Teodor Bercea
c25e77436e
Revert "[libomptarget][nextgen-plugin] Use SCRELEASE/SCACQUIRE in packet headers" (#85950)
Reverts llvm/llvm-project#85678
2024-03-20 11:40:12 -04:00
Gheorghe-Teodor Bercea
927308a52b
[libomptarget][nextgen-plugin] Use SCRELEASE/SCACQUIRE in packet headers (#85678)
This patch updates the construction of packet headers to replace the
usage of ACQUIRE/RELEASE with SCACQUIRE/SCRELEASE which is now
recommended.
The patch also ensures consistency across kernel dispatches.
2024-03-20 11:22:01 -04:00
Michael Klemm
fb5fd2d82f
[flang][OpenMP] Compile proper omp_lib.mod from the openmp/src/include sources (#80874)
This PR changes the build system to use use the sources for the module
`omp_lib` and the `omp_lib.h` include file from the `openmp` runtime
project and not from a separate copy of these files. This will greatly
reduce potential for inconsistencies when adding features to the OpenMP
runtime implementation.

When the OpenMP subproject is not configured, this PR also disables the
corresponding LIT tests with a "REQUIRES" directive at the beginning of
the OpenMP test files.

---------

Co-authored-by: Valentin Clement (バレンタイン クレメン) <clementval@gmail.com>
2024-03-20 13:47:26 +01:00
Ulrich Weigand
fe1341248d [OpenMP] Disable workshare_chunk.c test case on SystemZ
The test added by https://github.com/llvm/llvm-project/pull/83261
has been consistently failing.  Mark as UNSUPPORTED just like on
x86_64 and aarch64.
2024-03-20 11:12:26 +01:00
dhruvachak
b5d02bbd0d
[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. (#85363)
A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.

If approved, this patch should be backported to release 18.
2024-03-19 16:40:22 -07:00
Joseph Huber
5cccc405e3 Revert "[OpenMP] Disable flaky barrier fence test (#85093)"
This reverts commit cd8843f87af2f04a85dda12b37738596cbf4cd5e.

Originally disabled to try to unstick the AMD build bot, didn't make a
difference after a week so it goes back in.
2024-03-19 14:46:03 -05:00
Dominik Adamski
b9a41b9e9b
[NFC][OpenMP] Add test checking clang offload chunking policy (#83261)
Verify how clang handles `dist_schedule(static, block_chunk)` and
`schedule(static, thread_chunk)` clauses for OpenMP offload loop
workshare pragmas.
2024-03-19 10:29:35 -07:00
Brad Smith
c7de4a39d5
[OpenMP] Enable the affinity tests on FreeBSD, NetBSD and DragonFly (#85500)
FreeBSD, NetBSD and DragonFly also have affinity support. So enable the tests there as well.
2024-03-19 13:29:19 -04:00
Akash Banerjee
e9da5f0083
[OpenMP] Fix target data region codegen being omitted for device pass (#85218)
This patch enables the BodyCodeGen callback to still trigger for the
TargetData nested region during the device pass. There maybe Target code
nested within the TargetData region for which this is required.

Also add tests for the same.
2024-03-19 13:04:23 +00:00
nicebert
29849d589c
[OpenMP] Fix ompx_dump_mapping_tables lit test (#85754)
Fixes ompx_dump_mapping_tables test by only using one device after
breaking built bots
2024-03-19 10:26:41 +01:00
Fangrui Song
9936ac3083
[docs] Prefer --gcc-install-dir= to deprecated GCC_INSTALL_PREFIX (#85458)
Setting GCC_INSTALL_PREFIX leads to a warning (#77537).

Link:
https://discourse.llvm.org/t/add-gcc-install-dir-deprecate-gcc-toolchain-and-remove-gcc-install-prefix/65091
Link:
https://discourse.llvm.org/t/correct-cmake-parameters-for-building-clang-and-lld-for-riscv/72833
2024-03-18 13:11:44 -07:00
nicebert
20f5bcfb1a
[OpenMP] Add OpenMP extension API to dump mapping tables (#85381)
This adds an API call ompx_dump_mapping_tables.
This allows users to debug the mapping tables and can be especially
useful for unified shared memory applications to check if the code
behaves in the way it should. The implementation reuses code already
present to dump mapping tables (in a debug setting).

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-03-18 14:09:20 -05:00
David CARLIER
6d3cec01a6
Revert "[openmp] __kmp_x86_cpuid fix for i386/PIC builds." (#85526)
Reverts llvm/llvm-project#84626
2024-03-16 13:41:12 +00:00
Joseph Huber
470040bd4d [Libomptarget][NFC] Remove warning on return value const 2024-03-15 18:50:33 -05:00
Felipe Cabarcas
0e21672d99
Fixing LIBOMPTARGET_INFO message, for Copying data from device to host (#85444)
When running OpenMP offloading application with LIBOMPTARGET_INFO=-1,
the addresses of the Copying data from **device** to **host**, the
address are swap.
As an example, Currently the address would be
```
omptarget device 0 info: Mapping exists with HstPtrBegin=0x00007ffddaf0fb90, TgtPtrBegin=0x00007fb385404000, Size=8000, DynRefCount=0 (decremented, delayed deletion), HoldRefCount=0
omptarget device 0 info: Copying data from device to host, TgtPtr=0x00007ffddaf0fb90, HstPtr=0x00007fb385404000, Size=8000, Name=d
```
And it should be
```
omptarget device 0 info: Copying data from device to host, TgtPtr=0x00007fb385404000, HstPtr=0x00007ffddaf0fb90, Size=8000, Name=d
```

---------

Co-authored-by: fel-cab <fel-cab@github.com>
2024-03-15 15:25:14 -04:00
Ulrich Weigand
c9062e8f78 Reapply [libomptarget] Build plugins-nextgen for SystemZ (#83978)
The plugin was not getting built as the build_generic_elf64 macro
assumes the LLVM triple processor name matches the CMake processor name,
which is unfortunately not the case for SystemZ.

Fix this by providing two separate arguments instead.

Actually building the plugin exposed a number of other issues causing
various test failures. Specifically, I've had to add the SystemZ target
to
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
- LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt
- a check_plugin_target call in libomptarget/src/CMakeLists.txt

Finally, I've had to set a number of test cases to UNSUPPORTED on
s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED
for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on
s390x for what seem to be the same reason.

In addition, this also requires support for BE ELF files in
plugins-nextgen: https://github.com/llvm/llvm-project/pull/85246
2024-03-15 19:06:43 +01:00
Ulrich Weigand
2210c85a66 Reapply [libomptarget] Support BE ELF files in plugins-nextgen (#85246)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-15 18:28:28 +01:00
Ulrich Weigand
4c8714efc5 Revert "[libomptarget] Support BE ELF files in plugins-nextgen (#85246)"
This reverts commit 611c62b30d160375b46b7afedc04965ee6f67d1a.
2024-03-14 18:38:13 +01:00
Ulrich Weigand
611c62b30d
[libomptarget] Support BE ELF files in plugins-nextgen (#85246)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-14 18:19:12 +01:00
Andrew Brown
d83660827f
[openmp][wasm] Fix microtask type mismatch (#84355)
When OpenMP is compiled for WebAssembly (see #71297), it invokes a
microtask via a `switch` statement that dispatches to the `void *`
microtask pointer with spelled-out arguments (not varargs). As #83329
points out, however, this can result in a type mismatch when the
indirect call is executed by WebAssembly; WebAssembly expects the called
pointer to have the precise type of the call site. This change fixes the
issue by bringing back the approach in [D142593] of type-casting all the
`switch` arms to the precise type. This fixes #83329.

[D142593]: https://reviews.llvm.org/D142593
2024-03-14 10:23:44 -05:00
MessyHack
ea848d0a6d
[OpenMP] Sort topology after adding processor group layer. (#83943)
Various behavior around creating affinity masks and detecting uniform
topology depends on the topology being sorted.

resort topology after adding processor group layer to ensure that the
updated topology reflects the newly added processor group info.

Observed that the topology was not sorted correctly on high core count
AMD Epyc Genoa (2 sockets, 96 cores, 2 threads) using NUMA (NPS 2+).
2024-03-13 16:22:23 -05:00
Joseph Huber
cd8843f87a
[OpenMP] Disable flaky barrier fence test (#85093)
Summary:
This test is flaky on all targets I know of. We should disable it for
now so running the test suite doesn't randomly fail 50% of the time.
2024-03-13 15:05:22 -05:00
Jonathan Peyton
6272500e0b
[OpenMP] Remove unused logical/physical CPUID information (#83298) 2024-03-12 11:37:01 -07:00
Jonathan Peyton
3303be63fc
[OpenMP] Make sure mask is set to nullptr (#83299) 2024-03-12 11:36:43 -07:00
Jonathan Peyton
f5334f5da5
[OpenMP] Add debug checks for divide by zero (#83300) 2024-03-12 11:36:19 -07:00
Joseph Huber
9f69d3cf88
[Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (#84928)
Summary:
We are currently taking the lower 5 bites of the thread ID as the warp
ID. This doesn't work in non-1D grids and is also slower than just using
the dedicated hardware register.
2024-03-12 10:39:40 -05:00
Jonathan Peyton
9b1c496898
[OpenMP] Fixup while loops to avoid bad NULL check (#83302) 2024-03-11 10:28:12 -05:00
Jonathan Peyton
de4d7015d0
[OpenMP] Remove unnecessary check of ap (#83303) 2024-03-11 10:27:53 -05:00
Jonathan Peyton
1ed463d961
[OpenMP] Make sure ptr is used after NULL check (#83304) 2024-03-11 10:27:31 -05:00
Jonathan Peyton
b4e39ad117
[OpenMP] Remove dead code of checking int > INT_MAX (#83305) 2024-03-11 10:26:53 -05:00
David CARLIER
facb89ae12
[openmp] __kmp_x86_cpuid fix for i386/PIC builds. (#84626) 2024-03-11 13:15:43 +00:00
Joseph Huber
8a79003307 [libc] Move RPC opcodes include out of the header
Summary:
This header isn't strictly necessary, and is currently broken because we
install these to separate locations.
2024-03-10 14:07:47 -05:00
David CARLIER
fa4cc39255
[openmp] adding affinity support to DragonFlyBSD. (#84672) 2024-03-10 09:56:55 +00:00
Vadim Paretsky
110141b378
[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540)
MSVC does not define __BYTE_ORDER__ making the check for BigEndian
erroneously evaluate to true and breaking the struct definitions in MSVC
compiled builds correspondingly. The fix adds an additional check for
whether __BYTE_ORDER__ is defined by the compiler to fix these.

---------

Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
2024-03-09 10:47:31 -08:00
David CARLIER
11cd2a33f1
[openmp] porting affinity feature to netbsd. (#84618)
netbsd supports the portable hwloc's layer as well. for a hardware with
4 cpus, a cpu set is 4 and maxcpus is 256.
2024-03-09 11:45:07 +00:00
David CARLIER
05280b582a
[OpenMP] Implements __kmp_is_address_mapped for Solaris/Illumos. (#82930)
Also fixing OpenMP build itself for this platform.
2024-03-08 20:34:43 +00:00
vadikp-intel
fcd2d48325
[OpenMP] runtime support for efficient partitioning of collapsed triangular loops (#83939)
This PR adds OMP runtime support for more efficient partitioning of
certain types of collapsed loops that can be used by compilers that
support loop collapsing (i.e. MSVC) to achieve more optimal thread load
balancing.

In particular, this PR addresses double nested upper and lower isosceles
triangular loops of the following types

1. lower triangular 'less_than'
   for (int i=0; i<N; i++)
     for (int j=0; j<i; j++)
2. lower triangular 'less_than_equal'
   for (int i=0; i<N; j++)
     for (int j=0; j<=i; j++)
3. upper triangular
   for (int i=0; i<N; i++)
     for (int j=i; j<N; j++)

Includes tests for the three supported loop types.

---------

Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
2024-03-07 16:28:03 -08:00
Gheorghe-Teodor Bercea
5c752df1e1
[libomptarget][nextgen-plugin][NFC] Clean-up InputSignal checks (#83458)
Clean-up InputSignal checks.
2024-03-07 12:01:42 -05:00
Anchu Rajendran S
c03fd37d9b
[flang] Changes to map variables in link clause of declare target (#83643)
As per the OpenMP standard, "If a variable appears in a link clause on a
declare target directive that does not have a device_type clause with
the nohost device-type-description then it is treated as if it had
appeared in a map clause with a map-type of tofrom" is an implicit
mapping rule. Before this change, such variables were mapped as to by
default.
2024-03-07 08:23:58 -08:00
Ulrich Weigand
fb7cc73975 Revert "[libomptarget] Support BE ELF files in plugins-nextgen (#83976)"
This reverts commit 15b7b3182cc28f4f0b950bd73d931caa27b833ec.
2024-03-06 21:37:45 +01:00
Ulrich Weigand
70677c81de Revert "[libomptarget] Build plugins-nextgen for SystemZ (#83978)"
This reverts commit 3ecd38c8e1d34b1e4639a1de9f0cb56c7957cbd2.
2024-03-06 21:37:43 +01:00
Ulrich Weigand
d4f4f80236 Revert "[libomptarget] Fix CUDA plugin build regression"
This reverts commit b64482e23eefaef7738fde35d0b7c4174aaa6597.
2024-03-06 21:37:35 +01:00
Ulrich Weigand
b64482e23e [libomptarget] Fix CUDA plugin build regression
After 3ecd38c8e, the Handler.getELFObjectFile routine is no
longer available.  Call ELF64LEObjectFile::create directly,
which should always be suitable for CUDA images.
2024-03-06 21:00:25 +01:00
Ulrich Weigand
3ecd38c8e1
[libomptarget] Build plugins-nextgen for SystemZ (#83978)
The plugin was not getting built as the build_generic_elf64 macro
assumes the LLVM triple processor name matches the CMake processor name,
which is unfortunately not the case for SystemZ.

Fix this by providing two separate arguments instead.

Actually building the plugin exposed a number of other issues causing
various test failures. Specifically, I've had to add the SystemZ target
to
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
- LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt
- a check_plugin_target call in libomptarget/src/CMakeLists.txt

Finally, I've had to set a number of test cases to UNSUPPORTED on
s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED
for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on
s390x for what seem to be the same reason.

In addition, this also requires support for BE ELF files in
plugins-nextgen: https://github.com/llvm/llvm-project/pull/83976
2024-03-06 20:50:01 +01:00
Ulrich Weigand
15b7b3182c
[libomptarget] Support BE ELF files in plugins-nextgen (#83976)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to push all ELF specific
types into Utils/ELF. Specifically, this patch removes both the
getSymbol and getSymbolAddress routines and replaces them with a
single findSymbolInImage, which gets a StringRef identifying the
raw object file image as input, and returns a StringRef covering
the data addressed by the symbol (address and size) if found, or
std::nullopt otherwise.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-06 20:49:12 +01:00