166 Commits

Author SHA1 Message Date
Joseph Huber
b0fbddde38 [OpenMP] Only put retain for NVPTX so it can be optimized out for AMD
Summary:
This is a hack that only NVPTX needs.
2024-12-17 15:16:51 -06:00
wanglei
bdf727065b
[Offload] Add support for loongarch64 to host plugin
This adds support for the loongarch64 architecture to the offload host
plugin.

Similar to #115773

To fix some test issues, I've had to add the LoongArch64 target to:

- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)

Reviewed By: jhuber6

Pull Request: https://github.com/llvm/llvm-project/pull/120173
2024-12-17 19:06:10 +08:00
Jinsong Ji
e85a9f5540
libc: Prefix RPC Status code to avoid conflict in windows build (#119991)
Somehow conflict with define in wingdi.h.

Fix build failures:

[ 52%] Building CXX object
projects/offload/plugins-nextgen/common/CMakeFiles/PluginCommon.dir/src/RPC.cpp.obj
In file included from
...llvm\offload\plugins-nextgen\common\src\RPC.cpp:16:
...\llvm\libc\shared\rpc.h(48,3): error: expected identifier
   48 |   ERROR = 0x1000,
      |   ^
c:\Program files (x86)\Windows
Kits\10\include\10.0.22000.0\um\wingdi.h(118,29): note: expanded from
macro 'ERROR'
  118 | #define ERROR               0
      |                             ^
...\llvm\offload\plugins-nextgen\common\src\RPC.cpp(75,17): error:
expected unqualified-id
   75 |     return rpc::ERROR;
      |                 ^
c:\Program files (x86)\Windows
Kits\10\include\10.0.22000.0\um\wingdi.h(118,29): note: expanded from
macro 'ERROR'
  118 | #define ERROR               0
      |                             ^
2 errors generated.
2024-12-15 09:35:44 -05:00
Joseph Huber
f4ee5a673f
[OpenMP] Replace AMDGPU fences with generic scoped fences (#119619)
Summary:
This is simpler and more common. I would've replaced the CUDA uses and
made this the same but currently it doesn't codegen these fences fully
and just emits a full system wide barrier as a fallback.
2024-12-12 07:54:51 -06:00
Jan Patrick Lehr
74486dcf41
[Offload] Add CMake cache to be used in AMDGPU bot (#119369)
Adds initial CMake cache definition that is similar to what we use in
one of our production buidlbots. The goal is to consolidate the
configurations and make them accessible.
This cache file is a first step and to prepare for full pipeline testing
once the new bot comes online.
2024-12-10 17:50:02 +01:00
hidekisaito
f2bceb2311
[Offload][AMDGPU] accept generic target (#118919)
Enables generic ISA, e.g., "--offload-arch=gfx11-generic" device code to
run on gfx11-generic ISA capable device.

Executable may contain one ELF that has specific target ISA and another
ELF that has compatible generic ISA.
Under that circumstance, this code should say both ELFs are compatible,
leaving the rest to PluginManager to handle.
Suggestions on how best to address that is welcome.
2024-12-09 19:11:38 -05:00
Michał Górny
69227a11fe
[offload] Support LIBOMPTARGET_DEVICE_ARCHITECTURES={amdgpu|nvptx} (#119070)
Add two more special values for LIBOMPTARGET_DEVICE_ARCHITECTURES:
`amdgpu` and `nvptx`, to support building for all AMDGPU and NVPTX
targets respectively. This can be used in place of `all` when offload is
built with one of the GPU plugins only.
2024-12-07 15:37:28 +00:00
Michał Górny
4a44e4b192
[offload] Remove bogus offload-tblgen check for standalone build (#119004)
fd3907ccb583df99e9c19d2fe84e4e7c52d75de9 introduced a check for system
offload-tblgen executable when doing a standalone build. This check is
bogus, since offload-tblgen is built as part of offload and not some
other preinstalled component. The path is also overwritten below, so the
check only causes tests to be disabled unnecessarily.
2024-12-06 18:21:12 +00:00
Shilei Tian
92376c3ff5
[Offload][OMPX] Add the runtime support for multi-dim grid and block (#118042) 2024-12-06 09:07:50 -05:00
Michał Górny
b54ba5361e
[offload] Add gfx1012 (Navi 14) to AMDGPU models list (#118857)
Fixes #118824
2024-12-06 03:24:55 +00:00
Shilei Tian
eb49788bd9
[Offload][AMDGPU] Allow COV6 images (#118909) 2024-12-05 20:05:39 -05:00
Callum Fare
fd3907ccb5
Reland #118503: [Offload] Introduce offload-tblgen and initial new API implementation (#118614)
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON`
(see last commit). Otherwise the changes are identical.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-12-05 09:34:04 +01:00
hidekisaito
8c36a823c2
Fix to account for multiple ISA enumeration (#118676) 2024-12-04 12:11:16 -06:00
Michał Górny
04b26f0eb7
[offload] Standalone build fixes (#118173)
A fair number of fixes to get standalone builds of offload working —
mostly copying missing bits from openmp. It's almost ready — I still
need to figure out why some of the tsts aren't linking to the right
libraries.
2024-12-04 11:27:37 +00:00
Shilei Tian
68bcba6d7a Revert "[AMDGPU] Use COV6 by default (#118515)"
This reverts commit 410cbe3cf28913cca2fc61b3437306b841d08172 because some
buildbots are not ready yet.
2024-12-03 20:17:06 -05:00
Shilei Tian
410cbe3cf2
[AMDGPU] Use COV6 by default (#118515) 2024-12-03 19:38:35 -05:00
Jan Patrick Lehr
51553227f0
Revert "Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation" (#118541)
Reverts llvm/llvm-project#118503

Broke bot
https://lab.llvm.org/staging/#/builders/131/builds/9701/steps/5/logs/stdio
2024-12-03 21:42:38 +01:00
Callum Fare
8da490320f
Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation (#118503)
This is another attempt to reland the changes from #108413

The previous two attempts introduced regressions and were reverted. This
PR has been more thoroughly tested with various configurations so
shouldn't cause any problems this time. If anyone is aware of any likely
remaining problems then please let me know.

The changes are identical other than the fixes contained in the last 5
commits.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-12-03 16:28:35 +00:00
Jan Patrick Lehr
c7babfa6a3
[Offload] Find libc relative to DeviceRTL path (#118497)
This was discussed as a potential solution in
https://github.com/llvm/llvm-project/pull/118173
2024-12-03 16:37:57 +01:00
Joseph Huber
a6ef0debb1 [libc][NFC] Rename RPC opcodes to better reflect their usage
Summary:
RPC_ is a generic prefix here, use LIBC_ to indicate that these are
opcodes used to implement the C library
2024-12-02 15:35:08 -06:00
Joseph Huber
91f5f974cb
[OpenMP] Unconditionally provide an RPC client interface for OpenMP (#117933)
Summary:
This patch adds an RPC interface that lives directly in the OpenMP
device runtime. This allows OpenMP to implement custom opcodes.
Currently this is only providing the host call interface, which is the
raw version of reverse offloading. Previously this lived in `libc/` as
an extension which is not the correct place.

The interface here uses a weak symbol for the RPC client by the same
name that the `libc` interface uses. This means that it will defer to
the libc one if both are present so we don't need to set up multiple
instances.

The presense of this symbol is what controls whether or not we set up
the RPC server. Because this is an external symbol it normally won't be
optimized out, so there's a special pass in OpenMPOpt that deletes this
symbol if it is unused during linking. That means at `O0` the RPC server
will always be present now, but will be removed trivially if it's not
used at O1 and higher.
2024-12-02 14:31:51 -06:00
Shilei Tian
8cb44859cc FIX: Fix test failure in offload/test/mapping/power_of_two_alignment.c
A C file with `template` in it. Awesome.
2024-12-01 21:40:46 -05:00
Shilei Tian
ed7a8f1556
[NFC][Offload] Correct the compile command of two C test files (#118243)
The compile command for the two C test files are `compilexx`, which is
actually
for C++ compilation.
2024-12-01 21:34:26 -05:00
Jan Patrick Lehr
5208bc3694
Revert "Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413. #117704)" (#117995)
Reverts llvm/llvm-project#117894

Buildbot failures in OpenMP/Offload bots.
https://lab.llvm.org/buildbot/#/builders/30/builds/11193
2024-11-28 12:58:17 +01:00
Callum Fare
992b00020f
Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413. #117704) (#117894)
Relands #117704, which relanded changes from #108413 - this was reverted
due to build issues. The new offload library did not build with
`LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by
pre-merge testing.

The last commit contains the fix; everything else is otherwise identical
to the approved PR.
___

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-11-28 10:19:37 +00:00
Joseph Huber
a24aa7dfa5
[Offload] Use libc 'hand-in-hand' module to find RPC header (#117928)
Summary:
We should now use the official™ way to include the files from
`libc/shared`. This required some code to make sure that it's not
included twice if multiple people use it as well as a sanity check on
the directory.
2024-11-27 20:14:13 -06:00
Joseph Huber
1d810ece2b
[libc] Move libc server handlers to a shared header (#117908)
Summary:
We can simply include this header from the shared directory now and do
not need to have this level of indirection. Simply stash it with the
other libc opcode handlers.

If we were able to move the printf handlers to the shared directory then
this could just be a header as well, which would HEAVILY simplify the
mess associated with building the RPC server first in the projects
build, then copying it to the runtimes build.
2024-11-27 14:57:52 -06:00
Joseph Huber
89d8e70031
[libc] Export a pointer to the RPC client directly (#117913)
Summary:
We currently have an unnecessary level of indirection when initializing
the RPC client. This is a holdover from when the RPC client was not
trivially copyable and simply makes it more complicated. Here we use the
`asm` syntax to give the C++ variable a valid name so that we can just
copy to it directly.

Another advantage to this, is that if users want to piggy-back on the
same RPC interface they need only declare theirs as extern with the same
symbol name, or make it weak to optionally use it if LIBC isn't
avaialb.e
2024-11-27 14:57:38 -06:00
Fraser Cormack
0cb5846a68 Revert "Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413) (#117704)"
This reverts commit c979ec05642f292737d250c6682d85ed49bc7b6e.

This showed failures in the post-merge CI.
2024-11-27 10:49:01 +00:00
Callum Fare
c979ec0564
Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413) (#117704)
Relands changes from #108413 - this was reverted due to build issues.
The problem was just that the `offload-tblgen` tool was behind recent
changes to tablegen that ensure `const` records. This has been fixed and
the PR is otherwise identical.

___

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-11-27 10:39:07 +00:00
Joseph Huber
d047bee496 Revert "[Offload] Introduce offload-tblgen and initial new API implementation (#108413)"
This reverts commit 8a2311c4bf9993230e37dc20b57973dc917f2338.
2024-11-25 12:16:46 -06:00
Joseph Huber
d7c20a6f0c [libc][NFC] Move RPC opcodes to the 'shared/' directory as well 2024-11-25 12:04:10 -06:00
Callum Fare
8a2311c4bf
[Offload] Introduce offload-tblgen and initial new API implementation (#108413)
Introduce `offload-tblgen` and an initial implementation of a subset of
the new API. The tablegen files are intended to be the single source of
truth for the new API, with the header files, documentation, and others
bits of source all automatically generated.

**TODO** (based on review feedback so far):
- [x] Check in the generated headers
- [x] Add an `offload-generate` target to trigger the generation rather
than building them every time
- [x] Decide how error handling should work
  - [x] Finish up new error handling implementation 
- [x] Decide naming convention
- [x] Add testing for the new API
- [x] Add tablegen specific testing
- [x] clang-tidy and use llvm:: types when possible
- [x] Add optional code location arguments
- [x] Avoid multiple returns from one function

### offload-tblgen

See the included
[README](d80db06491/offload/new-api/API/README.md)
for more information on how the API definition and generation works. I'm
happy to answer any questions about it and plan to walk through it in a
future LLVM Offload call.

It should be noted that struct definitions have not been fully
implemented/tested as they aren't used by the initial API definitions,
but finishing that off in the future shouldn't be too much work.

The tablegen tooling has been designed to be easily extended with new
backends, using the classes in `RecordTypes.hpp` to abstract over the
tablegen records.

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
$ git clone -b offload_adapter https://github.com/callumfare/unified-runtime.git
$ cd unified-runtime
$ mkdir build
$ cd build
$ cmake .. -GNinja -DUR_BUILD_ADAPTER_OFFLOAD=ON \
    -DUR_OFFLOAD_INSTALL_DIR=<offload build dir containing liboffload_new.so> \
    -DUR_OFFLOAD_INCLUDE_DIR=<offload build dir containing 'offload' headers directory>
$ ninja urinfo
export LD_LIBRARY_PATH=<offload build dir containing offload plugin libraries>
$ UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
[cuda:gpu][cuda:0] CUDA, NVIDIA GeForce GT 1030  [12030]
# Demo with tracing
$ OFFLOAD_TRACE=1 UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
---> offloadPlatformGet(.NumEntries = 0, .phPlatforms = {}, .pNumPlatforms = 0x7ffd05e4d6e0 (2))-> OFFLOAD_RESULT_SUCCESS
---> offloadPlatformGet(.NumEntries = 2, .phPlatforms = {0x564bf4040220, 0x564bf4040240}, .pNumPlatforms = nullptr)-> OFFLOAD_RESULT_SUCCESS
...
```


### Open questions and future work
* The new API is implemented in a separate library
(`liboffload_new.so`). It could just as easily be part of the existing
`libomptarget` library - I have no strong feelings on which is better.
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-11-25 11:34:14 -06:00
Joseph Huber
506ca19dc9
[OpenMP] Remove use of '__AMDGCN_WAVEFRONT_SIZE' (#113156)
Summary:
This is going to be deprecated in
https://github.com/llvm/llvm-project/pull/112849. This patch ports it to
use the builtin instead. This isn't a compile constant, so it could
slightly negatively affect codegen. There really should be an IR pass to
turn it into a constant if the function has known attributes.

Using the builtin is correct when we just do it for knowing the size
like we do here. Obviously guarding w32/w64 code with this check would
be broken.
2024-11-25 07:38:28 -06:00
Joseph Huber
b4d49fb52e
[libc] Remove RPC server API and use the header directly (#117075)
Summary:
This patch removes much of the `llvmlibc_rpc_server` interface. This
pretty much deletes all of this code and just replaces it with including
`rpc.h` directly. We still maintain the file to let `libc` handle the
opcodes, since those depend on the `printf` impelmentation.

This will need to be cleaned up more, but I don't want to put too much
into a single patch.
2024-11-25 07:13:28 -06:00
Joseph Huber
676a1e6643
[AMDGPU] Remove uses of deprecreated HSA executable functions (#117241)
Summary:
These functions were deprecated in ROCR 1.3 which was released quite
some time ago. The main functionality that was lost was modifying and
inspecting the code object indepedently of the executable, however we do
all of that custom through our ELF API. This should be within the
versions of other functions we use.
2024-11-22 07:16:40 -06:00
Ivan Radanov Ivanov
0a27e4eed4 [offload] Fix copy-paste defect in error message 2024-11-19 17:12:51 +09:00
Matt Arsenault
a6fc489bb7
AMDGPU: Add gfx950 subtarget definitions (#116307)
Mostly a stub, but adds some baseline tests and
tests for removed instructions.
2024-11-18 10:41:14 -08:00
Johannes Doerfert
2edfa50e7f
[Offload][NFC] Update README.md 2024-11-17 07:32:29 -08:00
agozillon
3723449955
[OpenMP] Allocatable explicit member mapping fortran offloading tests (#113555)
This PR is one in a series of 3 that aim to add support for explicit
member mapping of allocatable components in derived types within
OpenMP+Fortran for Flang.

This PR provides all of the runtime tests that are currently
upstreamable, unfortunately some of the other tests would require
linking of the fortran runtime for offload which we currently do not do.
But regardless, this is plenty to ensure that the mapping is working in
most cases.
2024-11-16 12:22:33 +01:00
Matin Raayai
bb3f5e1fed
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.

cc @arsenm @aeubanks
2024-11-14 13:30:05 -08:00
Joseph Huber
3a20a5f510 [Offload] Move compiler-rt to runtimes in cache 2024-11-14 10:44:29 -06:00
agozillon
d84d0caf28
[Flang][OpenMP] Update MapInfoFinalization to use BlockArgs Interface and modify use_device_ptr/addr to be order independent (#113919)
This patch primarily updates the MapInfoFinalization pass to utilise the
BlockArgument interface. It also shuffles newly added arguments the
MapInfoFinalization passes to the end of the BlockArg/Relevant MapInfo
lists, instead of one prior to the owning descriptor type.

During this it was noted that the use_device_ptr/addr handling of target
data was a little bit too order dependent so I've attempted to make it
less so, as we cannot depend on argument ordering to be the same as
Fortran for any future frontends.
2024-11-14 15:47:37 +01:00
aurel32
b6bd7477a9
[Offload] Add support for riscv64 to host plugin (#115773)
This adds support for the riscv64 architecture to the offload host
plugin. The check to define FFI_DEFAULT_ABI is intentionally not guarded
by __riscv_xlen as the value is the same for riscv32 and riscv64
(support for OpenMP on riscv32 is still under review).
2024-11-13 08:15:49 -06:00
Joseph Huber
de41b137dd
[Offload] Provide a CMake cache file to easily build offloading (#115074)
Summary:
This patch adds a cache file that will automatically enable openpm,
offload, and all the fancy GPU libraries.
2024-11-07 15:35:29 -06:00
Joseph Huber
d661aea4c5
[OpenMP] Add support for custom callback in AMDGPUStream (#112785)
Summary:
We have the ability to schedule callbacks after certain events complete.
Currently we can register an arbitrary callback in CUDA, but can't in
AMDGPU. I am planning on using this support to move the RPC handling to
a separate thread, then using these callbacks to suspend / resume it
when no kernels are running. This is a preliminary patch to keep this
noise out of that one.
2024-10-29 10:18:32 -07:00
Carl Ritson
076aac59ac
[AMDGPU] Add a new target for gfx1153 (#113138) 2024-10-23 12:56:58 +09:00
Brad Richardson
06eb10dadf
[flang][driver] rename flang-new to flang (#110023)
This does a global rename from `flang-new` to `flang`. I also
removed/changed any TODOs that I found related to making this change.

---------

Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
2024-10-10 09:26:04 +01:00
Ivan Butygin
26ca8ef836
[libc] GPU RPC interface: add return value to rpc_host_call (#111288) 2024-10-06 20:22:07 +03:00
Joseph Huber
e8d2057ca4
[OpenMP] Add critical region lock for NVPTX targets (#110148)
Summary:
We define this on AMDGCN but not NVPTX, which leads to some failures
dependong on the target.
2024-09-26 11:33:52 -07:00