llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 05:26:05 +00:00

Author	SHA1	Message	Date
Brad Richardson	06eb10dadf	[flang][driver] rename flang-new to flang (#110023 ) This does a global rename from `flang-new` to `flang`. I also removed/changed any TODOs that I found related to making this change. --------- Co-authored-by: H. Vetinari <h.vetinari@gmx.com> Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>	2024-10-10 09:26:04 +01:00
Ivan Butygin	26ca8ef836	[libc] GPU RPC interface: add return value to `rpc_host_call` (#111288 )	2024-10-06 20:22:07 +03:00
Joseph Huber	e8d2057ca4	[OpenMP] Add critical region lock for NVPTX targets (#110148 ) Summary: We define this on AMDGCN but not NVPTX, which leads to some failures dependong on the target.	2024-09-26 11:33:52 -07:00
agozillon	0215579dab	[Flang][Offload][Tests] Set default OpenMP version to 5.2 (52) (#110138 ) We recently added versioning support to Flang's OpenMP, which restricts and enables certain things based on the OpenMP specification version. Currently one of the check-offload tests makes use of a feature that's at a slightly higher version than the current default causing it to fail. This PR basically applies the highest current OpenMP version number as a default argument for the lit.cfg, if we need more fine grained control in the future we can expand it to different lit commands for each relevant version than can then be added in each test. But for now, to keep it simple, just set the max level version.	2024-09-26 18:56:26 +02:00
Michael Halkenhäuser	d36f66b42d	[NFC][offload][OMPT] Cleanup of OMPT internals (#109005 ) Removed `OmptCallbacks.cpp` since relevant contents were duplicated. Because of the static linking there should be no change in functionality.	2024-09-23 11:58:40 +02:00
Joseph Huber	5f02558d82	[OpenMP] Fix not linking C libraries when enabled (#109168 ) Summary: We used to do this automatically, add it back in to do it manually.	2024-09-18 10:02:16 -07:00
Joseph Huber	c3ac3fe825	[OpenMP] Fix redefining `stdint.h` types (#108607 ) Summary: We can include `stdint.h` just fine as long as we don't allow it to find system headers, passing `-nostdlibinc` and `-nogpuinc` suppresses these extra paths so we will just use the clang resource headers for `stdint.h` and `stddef.h`.	2024-09-13 13:22:44 -05:00
Akash Banerjee	a67b6e1635	Fix typo in test.	2024-09-06 14:23:06 +01:00
Johannes Doerfert	08533a3ee8	[Offload][NFC] Reorganize `utils::` and make Device/Host/Shared clearer (#100280 ) We had three `utils::` namespaces, all with different "meaning" (host, device, hsa_utils). We should, when we can, keep "include/Shared" accessible from host and device, thus RefCountTy has been moved to a separate header. `hsa_utils` was introduced to make `utils::` less overloaded. And common functionality was de-duplicated, e.g., `utils::advance` and `utils::advanceVoidPtr` -> `utils:advancePtr`. Type punning now checks for the size of the result to make sure it matches the source type. No functional change was intended.	2024-09-05 13:36:26 -07:00
Akash Banerjee	142433684a	[OpenMP][Flang] Fix dynamic-extent array mapping (#107247 ) This patch fixes the mapping and lowering of arrays with dynamic extents and adds a new test for the same. The fix discards the incomplete the dynamic extent information and replacing it with just the base type. When lowering to llvm later, the bounds information is used instead.	2024-09-05 12:44:10 +01:00
Jan Patrick Lehr	1a0cf245ac	[Offload] Change x86_64-pc-linux to x86_64-unknown-linux (#107023 ) It appears that the RUNTIMES build prefers the x86-64-unknown-linux-gnu triple notation for the host. This fixes runtime / test breakages when compiler-rt is used as the CLANG_DEFAULT_RTLIB.	2024-09-03 14:25:33 +02:00
WÁNG Xuěruì	9adf81182e	[Offload] Fix stray libomptarget message helper calls (#106837 ) In #92581 the `LibomptargetUitls.cmake` helpers have been removed, but only uses of `libomptarget_say` were migrated. Migrate the remaining few warning and error messages so the `check-offload` target would not fail due to missing `libomptarget_warning_say`. While at it, update the `check-offload` unavailability message to say `check-offload` instead of `check-libomptarget`. Fixes #92581	2024-08-31 07:06:41 -05:00
WÁNG Xuěruì	75545b3449	[Offload] Fix disabling of cuda target on unsupported platforms (#106835 ) The target name and the message are wrong -- both should say "cuda" for the filtering to work. Fixes commit 300e5b911442 (#93186).	2024-08-31 07:06:17 -05:00
agozillon	f4cf93fb50	[Flang][OpenMP] Align map clause generation and fix issue with non-shared allocations for assumed shape/size descriptor types (#97855 ) This PR aims to unify the map argument generation behavior across both the implicit capture (captured in a target region) and the explicit capture (process map), currently the varPtr field of the MapInfo for the same variable will be different depending on how it's captured. This PR tries to align that across the generations of MapInfoOp in the OpenMP lowering. Currently, I have opted to utilise the rawInput (input memref to a HLFIR DeclareInfoOp) as opposed to the addr field which includes more information. The side affect of this is that we have to deal with BoxTypes less often, which will result in simpler maps in these cases. The negative side affect of this is that we don't have access to the bounds information through the resulting value, however, I believe the bounds information we require in our case is still appropriately stored in the map bounds, and this seems to be the case from testing so far. The other fix is for cases where we end up with a BoxType argument into a function (certain assumed shape and sizes cases do this) that has no fir.ref wrapping it. As we need the Box to be a reference type to actually utilise the operation to access the base address stored inside and create the correct mappings we currently generate an intermediate allocation in these cases, and then store into it, and utilise this as the map argument, as opposed to the original. However, as we were not sharing the same intermediate allocation across all of the maps for a variable, this resulted in errors in certain cases when detatching/attatching the data e.g. via enter and exit. This PR adjusts this for cases Currently we only maintain tracking of all intermediate allocations for the current function scope, as opposed to module. Primarily as the only case I am aware of that this is required is in cases where we pass certain types of arguments to functions (so I opted to minimize the overhead of the pass for now). It could likely be extended to module scope if required if we find other cases where it's applicable and causing issues.	2024-08-23 19:48:43 +02:00
Ethan Luis McDonough	fde2d23ee2	[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587 ) (#102691 ) This pull request is a revised version of #76587. This pull request fixes some build issues that were present in the previous version of this change. > This pull request is the first part of an ongoing effort to extends PGO instrumentation to GPU device code. This PR makes the following changes: > > - Adds blank registration functions to device RTL > - Gives PGO globals protected visibility when targeting a supported GPU > - Handles any addrspace casts for PGO calls > - Implements PGO global extraction in GPU plugins (currently only dumps info) > > These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.	2024-08-22 01:10:54 -05:00
Johannes Doerfert	3b7611594f	[Offload] Improve error reporting on memory faults (#104254 ) Since we can already track allocations, we can diagnose memory faults to some degree. If the fault happens in a prior allocation (use after free) or "close but outside" one, we can provide that information to the user. Note that the fault address might be page aligned, and not all accesses trigger a fault, especially for allocations that are backed by a MemoryManager. Still, if people disable the MemoryManager or the allocation is big enough, we can sometimes provide valueable feedback.	2024-08-21 10:01:35 -07:00
Joseph Huber	e96146cd46	[OpenMP] Temporarily disable test to keep bots green Summary: This test mysteriously fails on the bots but not locally, disable until I can figure out why.	2024-08-20 15:16:05 -05:00
Joseph Huber	e0326b668e	[OpenMP] Map `omp_default_mem_alloc` to global memory (#104790 ) Summary: Currently, we assign this to private memory. This causes failures on some SOLLVE tests. The standard isn't clear on the semantics of this allocation type, but there seems to be a consensus that it's supposed to be shared memory.	2024-08-20 12:00:41 -05:00
Fabian Mora	cfc76b6498	[llvm][offload] Move AMDGPU offload utilities to LLVM (#102487 ) This patch moves utilities from `offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h` to `llvm/Frontend/Offloading/Utility.h` to be reused by other projects. Concretely the following changes were made: - Rename `KernelMetaDataTy` to `AMDGPUKernelMetaData`. - Remove unused fields `KernelObject`, `KernelSegmentSize`, `ExplicitArgumentCount` and `ImplicitArgumentCount` from `AMDGPUKernelMetaData`. - Return the produced error if `ELFObj.sections()` failed instead of using `cantFail`. - Added `AGPRCount` field to `AMDGPUKernelMetaData`. - Added a default invalid value to all the fields in `AMDGPUKernelMetaData`.	2024-08-20 09:03:06 -04:00
estewart08	ea8bb4d633	[offload] - Fix issue with standalone debug offload build (#104647 ) Error: CommandLine Error: Option 'attributor-manifest-internal' registered more than once During the standalone debug build of offload the above error is seen at app runtime when using a prebuilt llvm with LLVM_LINK_LLVM_DYLIB=ON. This is caused by linking both libLLVM.so and various archives that are found via llvm_map_components_to_libnames for jit support.	2024-08-19 17:59:21 -05:00
Joseph Huber	161e250add	[OpenMP] Fix buildbot failing on allocator test	2024-08-14 13:56:12 -05:00
Joseph Huber	74d23f15b6	[OpenMP] Implement 'omp_alloc' on the device (#102526 ) Summary: The 'omp_alloc' function should be callable from a target region. This patch implemets it by simply calling `malloc` for every non-default trait value allocator. All the special access modifiers are unimplemented and return null. The null allocator returns null as the spec states it should not be usable from the target.	2024-08-14 13:38:55 -05:00
Johannes Doerfert	ff12c0061b	[Offload] Ensure to load images when the device is used (#103002 ) When we use the device, e.g., with an API that interacts with it, we need to ensure the image is loaded and the constructors are executed. Two tests are included to verify we 1) load images and run constructors when needed, and 2) we do so lazily only if the device is actually used. --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>	2024-08-13 14:41:26 -07:00
Joseph Huber	dcc27ea41e	[LinkerWrapper] Always pass `-flto` if the linker supports it (#102972 ) Summary; Now that we use the linker to do LTO / device linking, we need to inform the `clang` invocation to use `-flto` so it forwards arguments like `-On` correctly.	2024-08-13 11:23:55 -05:00
Johannes Doerfert	3f9c9acedd	[Offload] Add the right paths to the CUDA lit tests (#102997 )	2024-08-12 20:34:33 -07:00
Johannes Doerfert	80525dfcde	[Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549 ) Through the new `-foffload-via-llvm` flag, CUDA kernels can now be lowered to the LLVM/Offload API. On the Clang side, this is simply done by using the OpenMP offload toolchain and emitting calls to `llvm` functions to orchestrate the kernel launch rather than `cuda` functions. These `llvm` functions are implemented on top of the existing LLVM/Offload API. As we are about to redefine the Offload API, this wil help us in the design process as a second offload language. We do not support any CUDA APIs yet, however, we could: https://www.osti.gov/servlets/purl/1892137 For proper host execution we need to resurrect/rebase https://tianshilei.me/wp-content/uploads/2021/12/llpp-2021.pdf (which was designed for debugging). ``` ❯❯❯ cat test.cu extern "C" { void llvm_omp_target_alloc_shared(size_t Size, int DeviceNum); void llvm_omp_target_free_shared(void DevicePtr, int DeviceNum); } __global__ void square(int A) { A = 42; } int main(int argc, char argv) { int DevNo = 0; int Ptr = reinterpret_cast<int >(llvm_omp_target_alloc_shared(4, DevNo)); Ptr = 7; printf("Ptr %p, Ptr %i\n", Ptr, Ptr); square<<<1, 1>>>(Ptr); printf("Ptr %p, Ptr %i\n", Ptr, Ptr); llvm_omp_target_free_shared(Ptr, DevNo); } ❯❯❯ clang++ test.cu -O3 -o test123 -foffload-via-llvm --offload-arch=native ❯❯❯ llvm-objdump --offloading test123 test123: file format elf64-x86-64 OFFLOADING IMAGE [0]: kind elf arch gfx90a triple amdgcn-amd-amdhsa producer openmp ❯❯❯ LIBOMPTARGET_INFO=16 ./test123 Ptr 0x155448ac8000, Ptr 7 Ptr 0x155448ac8000, Ptr 42 ```	2024-08-12 17:44:58 -07:00
jyu2-git	a43677c172	Test faild with amd. (#101781 ) Add unspport. This is relate #101101	2024-08-02 17:53:23 -07:00
jyu2-git	d8b61dd84b	[OpenMP] Generate implicit default mapper for mapping array section. (#101101 ) This is only for struct containing nested structs with user defined mappers. Add four functions: 1>buildImplicitMap: build map for default mapper 2>buildImplicitMapper: build default mapper. 3>hasUserDefinedMapper for given mapper name and mapper type, lookup user defined map, if found one return true. 4>isImplicitMapperNeeded check if Mapper is needed During create map, in checkMappableExpressionList, call isImplicitMapperNeeded when it return true, call buildImplicitMapper to generate implicit mapper and added to map clause. https://github.com/llvm/llvm-project/pull/101101	2024-08-02 17:22:40 -07:00
Johannes Doerfert	f3bfc56327	[Offload][OpenMP] Prettify error messages by "demangling" the kernel name (#101400 ) The kernel names for OpenMP are manually mangled and not ideal when we report something to the user. We demangle them now, providing the function and line number of the target region, together with the actual kernel name.	2024-08-01 15:24:15 -07:00
Johannes Doerfert	9a1013220b	[Offload] Allow to record kernel launch stack traces (#100472 ) Similar to (de)allocation traces, we can record kernel launch stack traces and display them in case of an error. However, the AMD GPU plugin signal handler, which is invoked on memroy faults, cannot pinpoint the offending kernel. Insteade print `<NUM>`, set via `OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=<NUM>`, many traces. The recoding/record uses a ring buffer of fixed size (for now 8). For `trap` errors, we print the actual kernel name, and trace if recorded.	2024-07-31 11:49:50 -07:00
Johannes Doerfert	c95abe94ae	[Offload] Implement double free (and other allocation error) reporting (#100261 ) As a first step towards a GPU sanitizer we now can track allocations and deallocations in order to report double frees, and other problems during deallocation.	2024-07-30 10:10:57 -07:00
Joseph Huber	363c1e6e51	[OpenMP] Re-enable test after correctly forwarding `mllvm`	2024-07-29 10:06:54 -05:00
Joseph Huber	dbb8b7a0f4	Reapply "[OpenMP][libc] Remove special handling for OpenMP printf (#98940 )" This reverts commit fea5914c926e2f013a8b5e27eaa74c7047fb2c71.	2024-07-26 17:21:56 -05:00
Joseph Huber	fea5914c92	Revert "[OpenMP][libc] Remove special handling for OpenMP printf (#98940 )" This reverts commit 069e8bcd82c4420239f95c7e6a09e1f756317cfc. Summary: Some tests failing, revert this for now.	2024-07-26 16:39:12 -05:00
Joseph Huber	069e8bcd82	[OpenMP][libc] Remove special handling for OpenMP printf (#98940 ) Summary: Currently there are several layers to handle `printf`. Since we now have varargs and an implementation of `printf` this can be heavily simplified. 1. The frontend renames `printf` into `omp_vprintf` and gives it an argument buffer. Removing 1. triggered some code in the AMDGPU backend menat for HIP / OpenCL, so I hadded an exception to it. 2. Forward this to CUDA vprintf or ignore it. We no longer need special handling for it since we have varargs. So now we just forward this to CUDA vprintf if we have libc, otherwise just leave `printf` as an external function and expect that `libc` will be linked in.	2024-07-26 16:03:36 -05:00
Joseph Huber	7ebd97b852	[OpenMP] Do not define '__assert_fail' if we have the GPU libc (#100409 ) Summary: The C library is intended to provide `__assert_fail`, so in the cases that we have both we should defer to that. This means that if you build the C library for GPUs you'll get the RPC based asser, and if not you'll get the trap based one.	2024-07-26 15:18:10 -05:00
Johannes Doerfert	7102592af7	[Offload] Repair and rename `llvm-omp-device-info` (to `-offload-`) (#100309 ) The `llvm-omp-device-info` tool is very handy, but broke due to the lazy evaluation of devices. This repairs the functionality and adds a test. The tool is also renamed into `llvm-offload-device-info` as `-omp-` is going away.	2024-07-24 09:35:09 -07:00
Joseph Huber	8d8fa01a66	Reapply "[libc] Remove 'packaged' GPU build support (#100208 )" This reverts commit 550b83d658755664a7f0f93b36242e885743a91b.	2024-07-24 10:24:53 -05:00
Johannes Doerfert	3c8efd7928	[OpenMP] Ensure the actual kernel is annotated with launch bounds (#99927 ) In debug mode there is a wrapper (the kernel) around the function in which we generate the kernel code. We worked around this before to get the correct kernel name, but now we really distinguish both to attach the launch bounds to the kernel, not the inner function.	2024-07-23 09:02:47 -07:00
Joseph Huber	4854e25359	[Offload] Re-enable tests that are now passing Summary: Some recent patches made these stop failing so the XFAIL now makes the bots go red. Fixes https://github.com/llvm/llvm-project/issues/98903	2024-07-23 10:56:55 -05:00
Shilei Tian	41f6599ae1	[NFC][Offload] Move variables to where they are used (#99956 )	2024-07-22 19:52:16 -04:00
Pranav Bhandarkar	d7e185cca9	[OMPIRBuilder] - Handle dependencies in `createTarget` (#93977 ) This patch handles dependencies specified by the `depend` clause on an OpenMP target construct. It does this much the same way clang does it by materializing an OpenMP `task` that is tagged with the dependencies. The following functions are relevant to this patch - 1) `createTarget` - This function itself is largely unchanged except that it now accepts a vector of `DependData` objects that it simply forwards to `emitTargetCall` 2) `emitTargetCall` - This function has changed now to check if an outer target-task needs to be materialized (i.e if `target` construct has `nowait` or has `depend` clause). If yes, it calls `emitTargetTask` to do all the heavy lifting for creating and dispatching the task. 3) `emitTargetTask` - Bulk of the change is here. See the large comment explaining what it does at the beginning of this function	2024-07-22 10:56:45 -05:00
Jan Patrick Lehr	caaf8099ef	[Offload][OMPT] Add callbacks for (dis)associate_ptr (#99046 ) This adds the OMPT callbacks for the API functions disassociate_ptr and associate_ptr.	2024-07-17 10:15:19 +02:00
Jan Patrick Lehr	4ed0f84d38	[Offload] XFAIL four tests while working on fix (#98899 ) omp_dynamic_shared_memory_mixed_amdgpu.c omp_dynamic_shared_memory_amdgpu.c amdgcn-amd-amdhsa::bug51982.c amdgcn-amd-amdhsa::bug51781.c	2024-07-15 15:45:59 +02:00
Jinsong Ji	6556ba66b2	[Offload][test]Fix typo of requires (#98327 ) Typos in 8823448807f3b1a1362d1417e062d763734e02f5.	2024-07-10 10:51:47 -04:00
jyu2-git	32f7672acc	[Clang][OpenMP] This is addition fix for #92210 . (#94802 ) Fix another runtime problem when explicit map both pointer and pointee in target data region. In #92210, problem is only addressed in target region, but missing for target data region. The change just passing AreBothBasePtrAndPteeMapped in generateInfoForComponentList when processing target data. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2024-07-03 20:56:53 -07:00
Joseph Huber	3c50cbfda4	[DeviceRTL] Make defined 'libc' functions weak in OpenMP (#97356 ) Summary: These functions provide special-case implementations internal to the OpenMP device runtime. This can potentially conflict with the symbols pulled in from the actual GPU `libc`. This patch makes these weak, so in the case that the GPU libc functions exist they will be overridden. This should not impact performance in the average case because the old `-mlink-builtin-bitcode` version does internalization, deleting weak, and the new LTO path will resolve to the strong reference and then internalize it.	2024-07-02 13:23:53 -05:00
dhruvachak	946f5d111d	[OpenMP] [OMPT] Callback registration should not depend on the device init callback. (#96371 ) Even if the device init callback is not registered, a tool should be allowed to register other callbacks.	2024-07-01 10:07:05 -07:00
Gheorghe-Teodor Bercea	1a478a69bc	[OpenMP][offload] Fix dynamic schedule tracking (#97065 ) This patch fixes the dynamic schedule tracking.	2024-07-01 10:23:11 -04:00
Ethan Luis McDonough	8823448807	[Offload] Refactor offload test requirements (#95196 ) Many tests in the `offload` project have requirements defined by which targets are not supported rather than which platforms are supported. This patch aims to streamline the requirement definitions by adding four new feature tags: `host`, `gpu`, `amdgpu`, and `nvidiagpu`.	2024-06-29 00:56:18 -05:00

1 2 3

119 Commits