3402 Commits

Author SHA1 Message Date
Gheorghe-Teodor Bercea
f0567702aa
[OpenMP] Add missing export for dynamic tracking patch (#97419)
Add missing export for OpenMP non-offloading builds.
2024-07-02 10:09:27 -04:00
dhruvachak
946f5d111d
[OpenMP] [OMPT] Callback registration should not depend on the device init callback. (#96371)
Even if the device init callback is not registered, a tool should be
allowed to register other callbacks.
2024-07-01 10:07:05 -07:00
Gheorghe-Teodor Bercea
1a478a69bc
[OpenMP][offload] Fix dynamic schedule tracking (#97065)
This patch fixes the dynamic schedule tracking.
2024-07-01 10:23:11 -04:00
Terry Wilmarth
ac9f06c2a8
[OpenMP] Fix test omp_parallel_num_threads_list.c to require fewer threads. (#96916)
Original test case used too many threads for some environments. This update
reduces to a max of 36 threads.
2024-06-28 00:17:11 +03:00
Terry Wilmarth
d30b082fd4
[OpenMP] Add num_threads clause list format and strict modifier support (#85466)
Add support to the runtime for 6.0 spec features that allow num_threads
clause to take a list, and also make use of the strict modifier.
Provides new compiler interface functions for these features.
2024-06-24 15:39:18 -04:00
Jonathan Peyton
88dae3d5d0
[OpenMP][libomp] Remove Perl in favor of Python (#95307)
* Removes all Perl scripts and modules
* Adds Python3 scripts which mimic the behavior of the Perl scripts
* Removes Perl from CMake; Adds Python3 requirement to CMake
* The check-instruction-set.pl script is Knights Corner specific. The
script is removed and not replicated with a corresponding Python3
script.

Relevant Discourse:

https://discourse.llvm.org/t/error-compiling-clang-with-offloading-support/79223/4

Fixes: https://github.com/llvm/llvm-project/issues/62289
2024-06-20 12:54:49 -05:00
Tim Gymnich
597d2f7662
[OpenMP] Add Environment Variable to disable Reuse of Blocks for High Loop Trip Counts (#89239)
Sometimes it might be beneficial to spawn more thread blocks instead of
reusing existing ones for multiple loop iterations.

**Alternatives considered:**

Make `DefaultNumBlocks` settable via an environment variable.

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-06-14 07:35:23 -07:00
Jay Foad
d4a0154902
[llvm-project] Fix typo "seperate" (#95373) 2024-06-13 20:20:27 +01:00
estewart08
89c92b0bcf
[OpenMP][Offload] - Ensure OPENMP_STANDALONE_BUILD is defined (#94801)
Without a value set, conditional checks like
if(NOT ${OPENMP_STANDALONE_BUILD})
will not be able to evaluate to true.
Fixes an issue introduced by PR #93463, which did not allow the OMPT
variable to be propagated up to offload during a runtimes build.
2024-06-07 15:37:42 -05:00
Joachim
ff77f67c47
[OpenMP][NFC] Fix warning for OpenMP standalone build (#93463)
PR #75125 introduced upward propagation of some OMPT-related CMake
variables.
For stand-alone builds this results in a warning that `SCOPE_PARENT` has
no meaning in a top-level directory.
2024-06-06 16:13:50 +02:00
Joachim
2464f1cef3
[OpenMP][OMPT] Add missing callbacks for asynchronous target tasks (#93472)
- The first hidden-helper-thread did not trigger thread-begin
- The "detaching" from a target-task when waiting for completion failed
to call task-switch
- Target tasks identified themselves as explicit tasks

Co-authored-by: Kaloyan Ignatov <kaloyan.ignatov@rwth-aachen.de>
2024-06-04 17:07:15 +02:00
Shilei Tian
b448efb8ea
Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)" (#94139) 2024-06-03 11:17:36 -04:00
Joseph Huber
df9701bfee
[OpenMP] Fix multiply installing libomp.so (#93685)
Summary:
The `add_llvm_library` interface handles installing the llvm libraries,
however we want to do our own handling. Otherwise, this will install
into the `./lib` location instead of the `./lib/<target>` one.
2024-05-29 08:57:16 -05:00
Franklin Zhang
c09787b7d0
[OMPT] Set default values for tsan function pointers (#93568)
Avoid calling NULL function pointers in cases where ompt_start_tool
succeeds but those tsan functions
do not really exist.

Fix https://github.com/llvm/llvm-project/issues/93524

---------

Co-authored-by: Joachim <protze@rz.rwth-aachen.de>
2024-05-28 19:39:35 +02:00
Shilei Tian
cf9eeb67e5 Revert "Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)""
This reverts commit 7b4865582299294455bc816358fd88a9c6e5e0be.
2024-05-26 01:04:39 -04:00
Shilei Tian
7b48655822 Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)"
This reverts commit 9b31cc71d66064dfaf2afabf4a835211321bb4a0.
2024-05-26 00:57:50 -04:00
Michael Kruse
8bdc577667
[openmp] Revise IDE folder structure (#89750)
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders that targets are
organized into in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake IDE generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, but they are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
2024-05-25 17:34:28 +02:00
Joseph Huber
9b31cc71d6 Revert "[OpenMP][OMPX] Add shfl_down_sync (#93311)"
This reverts commit 098c6dfa8157681699a71fce9e3d94515e66311f.
This reverts commit 8c718a3a91df4ab68dc3f1ca3887ea730c9aed84.
This reverts commit 4fb02de9d490d0773441aa30124bb4d1272230d3.
2024-05-24 19:07:53 -05:00
Xing Xue
2669ee1174
[OpenMP][AIX] Extend LIT test timeout limit (#93319)
When buildbots are crowded, the libomp LIT tests may hit timeouts so
extend the limit from 1800 to 3000 seconds.
2024-05-24 16:00:51 -04:00
Shilei Tian
098c6dfa81 [NFC][OpenMP][OMPX] Remove ';' that is outside of a function 2024-05-24 14:21:54 -04:00
Shilei Tian
8c718a3a91 [OpenMP][OMPX] No default argument for C API 2024-05-24 14:15:50 -04:00
Shilei Tian
4fb02de9d4
[OpenMP][OMPX] Add shfl_down_sync (#93311) 2024-05-24 14:00:43 -04:00
Shilei Tian
7eeec8e6d1
[OpenMP][OMPX] Add ballot_sync (#91297)
This patch adds the support for `ballot_sync` in ompx.
2024-05-24 09:54:54 -04:00
Joseph Huber
c618ae1734
[Offload] Rework handling for loading vendor runtimes (#93073)
Summary:
We previously had multiple options for this. This patch replaces them
with `LIBOMPTARGET_DLOPEN_PLUGINS=`, a list of plugins to load
dynamically. It defaults to everything right now. This ignores the
`host` plugin because the `libffi` dependency is hopefully going to be
removed soon in https://github.com/llvm/llvm-project/pull/91264.
2024-05-22 13:04:52 -05:00
Sirraide
c44fa3e8a9
[Clang] Refactor __attribute__((assume)) (#84934)
This is a followup to #81014 and #84582: Before this patch, Clang 
would accept `__attribute__((assume))` and `[[clang::assume]]` as 
nonstandard spellings for the `[[omp::assume]]` attribute; this 
resulted in a potentially very confusing name clash with C++23’s 
`[[assume]]` attribute (and GCC’s `assume` attribute with the same
semantics).

This PR replaces every usage of `__attribute__((assume))` with
`[[omp::assume]]` and makes `__attribute__((assume))` and 
`[[clang::assume]]` alternative spellings for C++23’s `[[assume]]`; 
this shouldn’t cause any problems due to differences in appertainment
and because almost no-one was using this variant spelling to begin
with (a use in libclc has already been changed to use a different
attribute).
2024-05-22 17:58:48 +02:00
Michael Kruse
9120562dfc
[Clang][OpenMP] Enable tile/unroll on iterator- and foreach-loops (#91459)
OpenMP loop transformation did not work on a for-loop using an iterator
or range-based for-loops. The first reason is that it combined the
iterator's type for generated loops with the type of `NumIterations` as
generated for any `OMPLoopBasedDirective` which is an integer. Fixed by
basing all generated loop variables on `NumIterations`.

Second, C++11 range-based for-loops include syntactic sugar that needs
to be executed before the loop. This additional code is now added to the
construct's Pre-Init lists.

Third, C++20 added an initializer statement to range-based for-loops
which is also added to the pre-init statement. PreInits used to be a
`DeclStmt` which made it difficult to add arbitrary statements from
`CXXRangeForStmt`'s syntactic sugar, especially the for-loop's init
statement which does not need to be a declaration. Change it to be a
general `Stmt` that can be a `CompoundStmt` to hold arbitrary Stmts,
including DeclStmts. This also avoids the `PointerUnion` workaround used
by `checkTransformableLoopNest`.

End-to-end tests are added to verify the expected number and order of
loop execution and evaluations of expressions (such as iterator
dereference). The order and number of evaluations of expressions in
canonical loops is explicitly undefined by OpenMP but checked here for
clarification and for changes to be noticed.
2024-05-22 14:30:31 +02:00
Joseph Huber
f60c699d37 [OpenMP] Fix intermediate header locations for OpenMP
Summary:
A previous patch moved the code here and accidentally overwrote the
include path that the LSP interface used. This caused incorrect errors
when using clangd with the offload project. This patch removes the
unnecessary header and makes sure we include the correct folder.
2024-05-15 20:45:19 -05:00
Joseph Huber
2ec85713bd [OpenMP] Add back in `ENABLE_LIBOMPTARGET` definition
Summary:
Even though we moved `libomptarget` this is still present in `omp.h` and
can't be removed.
2024-05-15 11:58:16 -05:00
Joseph Huber
332de4b267
[Offload] Correctly reject building on unsupported architectures (#92276)
Summary:
Previously we had this `LIBOMPTARGET_ENABLED` variable which controlled
including `libomptarget`. This is now redundant since it's controlled by
`LLVM_ENABLE_RUNTIMES`. However, this had the extra effect of not
building it when given unsupported targets. This was lost during the
move to `offload`. This patch moves this logic back and makes the
`offload` target just quit without doing anything if used on an
unsupported architecture.

https://github.com/llvm/llvm-project/issues/91881
https://github.com/llvm/llvm-project/issues/91819

---------

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2024-05-15 11:38:41 -05:00
Michael Kruse
b0b6c16b47
[Clang][OpenMP][Tile] Allow non-constant tile sizes. (#91345)
Allow non-constants in the `sizes` clause such as
```
#pragma omp tile sizes(a)
for (int i = 0; i < n; ++i)
```
This is permitted since tile was introduced in [OpenMP
5.1](https://www.openmp.org/spec-html/5.1/openmpsu53.html#x78-860002.11.9).

It is possible to sneak-in negative numbers at runtime as in
```
int a = -1;
#pragma omp tile sizes(a)
```
Even though it is not well-formed, it should still result in every loop
iteration being executed exactly once, an invariant of the tile
construct that we should ensure. `ParseOpenMPExprListClause` is
extracted to be reused by the `permutation` clause of the
`interchange` construct. Some care was put into ensuring correct behavior
in template contexts.
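
The "every iteration exactly once" invariant can be illustrated with a small model. This is only a sketch in Python, not the code Clang generates: the helper name `tiled_iterations` is hypothetical, and treating a non-positive size as 1 is an assumption made here so that the outer loop always advances.

```python
def tiled_iterations(n, tile_size):
    """Return the order in which a 1-D loop of n iterations runs when tiled."""
    # Assumption for this sketch: a non-positive runtime size is treated
    # as 1, so the floor loop below always makes progress.
    size = tile_size if tile_size > 0 else 1
    order = []
    for floor in range(0, n, size):                   # loop over tiles
        for i in range(floor, min(floor + size, n)):  # loop inside one tile
            order.append(i)
    return order
```

With this model, `tiled_iterations(7, -1)` still visits each of the seven iterations once, which is the property the commit message asks the tile construct to preserve.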
2024-05-13 16:10:58 +02:00
Xing Xue
561b6ab96e
[OpenMP][AIX] Implement __kmp_get_load_balance() for AIX (#91520)
AIX has the `/proc` filesystem where `/proc/<pid>/lwp/<tid>/lwpsinfo` has
the thread state in binary, similar to Linux's
`/proc/<pid>/task/<tid>/stat` where the state is in ASCII. However, the
definition of state info `R` in `lwpsinfo` is `runnable`. In Linux,
state `R` means the thread is `running`. Therefore, `lwpsinfo` is not
ideal for our purpose of getting the current load of the system. This
patch uses `perfstat_cpu()` in AIX system library `libperfstat.a` to
obtain the number of threads currently running on logical CPUs.
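
For comparison, the Linux side described above (state stored in ASCII in `/proc/<pid>/task/<tid>/stat`) can be sketched as follows. This is a hedged Python illustration, not libomp's implementation: the helper names are hypothetical, the stat lines are passed in as strings rather than read from `/proc`, and the AIX `perfstat_cpu()` path is not modeled.

```python
def thread_state(stat_line):
    """Extract the one-letter state from a Linux task stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so look at the text after the last ')'.
    rest = stat_line[stat_line.rfind(")") + 1:]
    return rest.split()[0]


def count_running(stat_lines):
    """Count the threads whose state is 'R' (running on Linux)."""
    return sum(1 for line in stat_lines if thread_state(line) == "R")
```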
2024-05-10 09:23:02 -04:00
chandan singh
2a57657d55
[OpenMP] [Flang] Resolved Issue llvm#76121: Implemented Check for Unhandled Arguments in __kmpc_fork_call_if (#82221)
Root cause: The segmentation fault is caused by a null pointer dereference
inside the __kmpc_fork_call_if function at
https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Linux_asm.S#L1186
. __kmpc_fork_call_if is missing a case to handle argc=0.

Fix: Added a check inside the __kmp_invoke_microtask function to handle
the case when argc is 0.

---------

Co-authored-by: Singh <chasingh@amd.com>
2024-05-09 11:11:04 +05:30
Jonathan Peyton
73bb8d9d92
[OpenMP] Fix child processes to use affinity_none (#91391)
When a child process is forked with OpenMP already initialized, the
child process resets its affinity mask and sets proc-bind-var to false
so that the entire original affinity mask is used. This patch corrects
an issue with the affinity initialization code setting affinity to
compact instead of none for this special case of forked children.

The test trying to catch this only tested an explicit setting of
KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting.

Fixes: #91098
2024-05-08 09:23:50 -05:00
Jonathan Peyton
41ca9104ac
[OpenMP] Fix task state and taskteams for serial teams (#86859)
* Serial teams now use a stack (similar to dispatch buffers)
* Serial teams always use `t_task_team[0]` as the task team and the
second pointer is a next pointer for the stack

`t_task_team[1]` is interpreted as a stack of task teams where each
level is a nested level

```
 inner serial team                   outer serial team
[ t_task_team[0] ] -> (task_team)    [ t_task_team[0] ] -> (task_team)
[ next           ] ----------------> [ next           ] -> ...
```

* Remove the task state memo stack from thread structure.
* Instead of a thread-private stack, use team structure to store
th_task_state of the primary thread. When coming out of a parallel,
restore the primary thread's task state. The new field in the team
structure doesn't cause sizeof(team) to change and is in the cache line
which is only read/written by the primary thread.
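
The stack layout in the diagram above can be modeled roughly as follows. This is a Python sketch under stated assumptions, not the runtime's data structure: `SerialTeam`, `push_level`, `pop_level` and `task_teams` are hypothetical names, and only the two-slot `t_task_team` layout is taken from the description.

```python
class SerialTeam:
    """Hypothetical stand-in for one nesting level of a serial team."""

    def __init__(self, task_team, next_team=None):
        # Slot 0 holds this level's task team; slot 1 links to the
        # enclosing (outer) level, forming a stack.
        self.t_task_team = [task_team, next_team]


def push_level(top, task_team):
    """Enter a nested serial region: the new level becomes the top."""
    return SerialTeam(task_team, top)


def pop_level(top):
    """Leave a nested serial region: the outer level becomes the top."""
    return top.t_task_team[1]


def task_teams(top):
    """List the task teams from the innermost level outwards."""
    teams = []
    while top is not None:
        teams.append(top.t_task_team[0])
        top = top.t_task_team[1]
    return teams
```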

Fixes: #50602
Fixes: #69368
Fixes: #69733
Fixes: #79416
2024-05-07 08:41:51 -05:00
Shilei Tian
02ce8227ac [NFC][OpenMP][OMPX] Move declare variant up 2024-05-06 23:46:18 -04:00
David Tenty
144091b361
[OpenMP][CMake] Revert standalone build LIBOMP_HEADERS_INSTALL_PATH (#91243)
Revert the portion of https://github.com/llvm/llvm-project/pull/75125
which modified the LIBOMP_HEADERS_INSTALL_PATH in standalone build.

This change is harmful for real standalone builds (i.e. builds where we
build openmp by itself), since it tries to overwrite the `omp.h` inside
the build compiler. For all-in-one builds with clang, testing shows this
change is unnecessary as https://github.com/llvm/llvm-project/pull/88007
already set up that build configuration so that omp.h will be put into
the project build's `clang` resource directory.
2024-05-06 17:12:38 -04:00
Xing Xue
928db7e7ed
[OpenMP][AIX] Implement __kmp_is_address_mapped() for AIX (#90516)
This patch implements `__kmp_is_address_mapped()` for AIX by calling
`loadquery()` to get the load info of the process and then checking if
the address falls within the range of the data segment of one of the
loaded modules.
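
The range check described above amounts to the following. This is a minimal Python sketch, not the AIX implementation: it does not call `loadquery()`, and the `(start, size)` pairs passed in are a hypothetical stand-in for the per-module data segment info that call would provide.

```python
def is_address_mapped(addr, data_segments):
    """Return True if addr lies inside any module's data segment.

    data_segments is an iterable of (start, size) pairs, one per loaded
    module (assumed layout for this sketch).
    """
    return any(start <= addr < start + size for start, size in data_segments)
```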
2024-04-30 16:58:47 -04:00
Xing Xue
690c929b6c
[OpenMP][AIX] Use syssmt() to get the number of SMTs per physical CPU (#89985)
This patch changes to use system call `syssmt()` instead of
`lpar_get_info()` to get the number of SMTs (logical processors) per
physical processor for AIX. `lpar_get_info()` gives the max number of
SMTs that the physical processor can support while `syssmt()` returns
the number that is currently configured.
2024-04-26 13:23:33 -04:00
Johannes Doerfert
330d8983d2
[Offload] Move /openmp/libomptarget to /offload (#75125)
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -> `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>
2024-04-22 09:51:33 -07:00
Xing Xue
0a8cd1ed1f
[OpenMP] Use half of available logical processors for collapse tests (#88319)
The new collapse test cases define `MAX_THREADS` to be 256 and use all
available threads/logical processors on the system. This triples the
testing time on an AIX machine that has 128 logical processors. This
patch changes to use half of available logical processors to avoid over
subscribing because there are other libomp tests running at the same
time, including 2 other such collapse tests.
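
The thread-count choice can be sketched like this. It is an illustrative Python model only: the actual tests are C programs, and `collapse_test_threads` as well as the lower bound of one thread are assumptions made for this sketch.

```python
import os

MAX_THREADS = 256  # upper bound named in the commit message


def collapse_test_threads(logical_cpus=None):
    """Pick half of the logical processors, capped at MAX_THREADS."""
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    # Assumption: never go below one thread on very small machines.
    return max(1, min(MAX_THREADS, logical_cpus // 2))
```

On the 128-processor AIX machine mentioned above, this model yields 64 threads per test.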
2024-04-19 09:08:31 -04:00
Jan Patrick Lehr
49b209d0d1
Revert "[Libomptarget] Rework Record & Replay to be a plugin member" (#89028)
Reverts llvm/llvm-project#88928

This broke the AMDGPU buildbots:
https://lab.llvm.org/buildbot/#/builders/193/builds/50201 
https://lab.llvm.org/staging/#/builders/185/builds/5565
https://lab.llvm.org/buildbot/#/builders/259/builds/2955
2024-04-17 09:11:43 +02:00
Joseph Huber
9a0a28f838
[Libomptarget] Rework Record & Replay to be a plugin member (#88928)
Summary:
Previously, the R&R support was global state initialized by a global
constructor. This is bad because it prevents us from adequately
constraining the lifetime of the library. Additionally, we want to
minimize the amount of global state floating around.

This patch moves the R&R support into a plugin member like everything
else. This means there will be multiple copies of the R&R implementation
floating around, but this was already the case given the fact that we
currently handle everything with dynamic libraries.
2024-04-16 14:19:12 -05:00
Xing Xue
22bba85d82
[OpenMP][test][AIX] Make 64 the max number of threads for capacity tests in AIX 32-bit (#88739)
This patch makes 64 the max number of threads for 2 capacity tests in
AIX 32-bit mode rather than `XFAIL`ing them.
2024-04-16 13:14:29 -04:00
Xing Xue
454d449697
[OpenMP] Use a memory fence before incrementing the dispatch buffer index (#87995)
This patch uses a memory fence in function `__kmp_dispatch_next()` to
flush pending memory write invalidates before incrementing the
`volatile` variable `buffer_index` to fix intermittent time-outs of
OpenMP runtime LIT test cases `env/kmp_set_dispatch_buf.c` and
`worksharing/for/kmp_set_dispatch_buf.c`, noting that the same is needed
for incrementing `buffer_index` in function `__kmpc_next_section()`
(line 2600 of `kmp_dispatch.cpp`).
2024-04-16 13:13:49 -04:00
Joseph Huber
faad4e3fa8
[Libomptarget] Split PowerPC into separate triples (#88773)
Summary:
The previous patch mistakenly merged these when they indeed need to be
treated like separate triples because it's what's passed to the test
suite.
2024-04-15 15:03:37 -05:00
Jonathan Peyton
5300a6731e
[OpenMP] Fix re-locking hang found in issue 86684 (#88539)
This was initially reported here (including stacktraces):
https://stackoverflow.com/questions/78183545/does-compiling-imagick-with-openmp-enabled-in-freebsd-13-2-cause-sched-yield

If `__kmp_register_library_startup()` detects that another instance of
the library is present, `__kmp_is_address_mapped()` is eventually
called, which uses `kmpc_malloc()` to allocate memory. This function
calls `__kmp_entry_thread()` to access the thread-local memory pool,
which is a bad idea during initialization. This macro internally calls
`__kmp_get_global_thread_id_reg()` which sets the bootstrap lock at the
beginning (before calling `__kmp_register_library_startup()`).

The fix is to use `KMP_INTERNAL_MALLOC()`/`KMP_INTERNAL_FREE()` instead
of `kmpc_malloc()`/`kmpc_free()`. `KMP_INTERNAL_MALLOC` and
`KMP_INTERNAL_FREE` do not use any bootstrap locks. They just translate
to `malloc()`/`free()` and are meant to be used during library
initialization before other library-specific allocators have been
initialized.
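
The hang pattern can be reproduced in miniature with any non-reentrant lock. The following Python sketch is only an analogy, not libomp code: `bootstrap_lock`, `pool_alloc`, `plain_alloc` and `startup` are hypothetical names, and the non-blocking acquire is used so the sketch reports the problem instead of actually hanging.

```python
import threading

bootstrap_lock = threading.Lock()  # non-reentrant, like a bootstrap lock


def pool_alloc():
    """Model an allocator that needs the bootstrap lock itself."""
    # A blocking acquire here would never return if the caller already
    # holds the lock; the non-blocking form lets the sketch show that.
    if not bootstrap_lock.acquire(blocking=False):
        return None
    try:
        return bytearray(16)
    finally:
        bootstrap_lock.release()


def plain_alloc():
    """Model a malloc-style allocator that takes no runtime locks."""
    return bytearray(16)


def startup(alloc):
    """Model library startup: allocate while holding the bootstrap lock."""
    with bootstrap_lock:
        return alloc()
```

In this model `startup(pool_alloc)` returns `None` because the lock is already held, while `startup(plain_alloc)` succeeds, mirroring the switch to `KMP_INTERNAL_MALLOC()` described above.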

Fixes: #86684
2024-04-12 15:13:59 -05:00
Xing Xue
b3792ae42a
[OpenMP][AIX] Fix test config for AIX (#88272)
This patch fixes the test config so that it works for
`tasking/omp50_taskdep_depobj.c` which uses different flags to test with
compiler's `omp.h`.
* set test environment variable `OBJECT_MODE` to `64` if it is set
explicitly to `64` in the AIX environment. `OBJECT_MODE` defaults to
`32` and is recognized by AIX compilers and toolchain. In this way, we
don't need to set `-m64` for all compiler flags for 64-bit mode
* add option `-Wl,-bmaxdata` to 32-bit `test_openmp_flags` used by
`tasking/omp50_taskdep_depobj.c`
2024-04-10 16:06:31 -04:00
Joseph Huber
f81879c0f7 [Libomptarget] Add RPC-based printf implementation for OpenMP #85638
Summary:
Relanding after reverting, only applies to AMDGPU for now.

This patch adds an implementation of printf that's provided by the GPU
C library runtime. This printf is currently implemented using the same
wrapper handling that OpenMP sets up. This will be removed once we have
proper varargs support.

This printf differs from the one CUDA offers in that it is synchronous
and uses a finite size. Additionally we support pretty much every
format specifier except the %n option.

Depends on #85331
2024-04-10 13:36:25 -05:00
Joseph Huber
a8f9f85ab0 [Libomptarget][NFC] Fix unused variable warnings
Summary:
This patch fixes a few warnings that would show up while building.
2024-04-10 10:01:15 -05:00
Joseph Huber
d022f6b8ff [Libomp] Place generated OpenMP headers into build resource directory (#88007)
Summary:
These headers are a part of the compiler's resource directory once
installed. However, they are currently placed in the binary directory
temporarily. This makes it more difficult to use the compiler out of the
build directory and will cause issues when moving to `liboffload`. This
patch changes the logic to write these instead to the compiler's
resource directory inside of the build tree.

NOTE: This doesn't change the Fortran headers, I don't know enough about
those and it won't use the same directory.
2024-04-09 08:47:51 -05:00