llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-02 14:36:07 +00:00

Author	SHA1	Message	Date
Abid Qadeer	9e08db796b	[OpenMPIRBuilder] Don't drop debug info for target region. (#80692 ) When an outlined function is generated for an omp target region, a corresponding DISubprogram was not being generated. This resulted in all the debug information for the target region being dropped. This commit adds a DISubprogram for the outlined function if there is one available for the parent function. It also updates the current debug location so that the right scope is used for the entries in the outlined function. There are places in the OpenMPIRBuilder which change the insertion point but don't update the debug location accordingly. They cause issues when debug info is enabled. I have fixed a few that I observed to cause issues. But there may be more and a systematic cleanup may be required. With this change in place, I can set source line breakpoints in the target region and run to them in the debugger.	2024-09-04 10:16:14 +01:00
Sergio Afonso	84b1e59580	[MLIR][OpenMP][OMPIRBuilder] Add lowering support for omp.target_triples (#100156 ) This patch modifies MLIR to LLVM IR lowering of the OpenMP dialect to take into consideration the contents of the `omp.target_triples` module attribute while generating code for `omp.target` operations. It adds the `OpenMPIRBuilderConfig::TargetTriples` field and initializes it using the `amendOperation` flow of the `OpenMPToLLVMIRTranslation` pass. Some changes are introduced into the `OpenMPIRBuilder` to allow passing the information about whether a target region is intended to be offloaded from outside. The result of this change is that offloading calls are only generated when the `--offload-arch` or `-fopenmp-targets` options are given to the compiler. Otherwise, only the host fallback code is generated. This fixes linker errors currently triggered by `flang-new` if a source file containing a `target` construct is compiled without any of the aforementioned options. Several unit tests impacted by these changes, which are intended to check host code generated for `omp.target` operations, are updated to contain the new attribute. Without it, no calls to `__tgt_target_kernel` and associated control flow operations are generated. Fixes #100209.	2024-08-02 11:58:40 +01:00
Pranav Bhandarkar	5b4e5f8ac6	[OpenMPIRBuilder][Clang][NFC] - Combine `emitOffloadingArrays` and `emitOffloadingArraysArgument` in OpenMPIRBuilder (#97088 ) This patch introduces a new interface in `OpenMPIRBuilder` that combines the creation of the so-called offloading pointer arrays and their subsequent preparation as arguments to the OpenMP runtime library. We then use this in Clang. This is intended to be used in the near future by other frontends such as Flang when lowering MLIR to LLVMIR.	2024-07-25 16:28:11 -05:00
Krzysztof Parzyszek	a0c590795e	[Frontend][OpenMP] Allow implicit clauses to fail to apply (#100460 ) The `linear(x)` clause implies `firstprivate(x)` on the compound construct if `x` is not an induction variable. With more construct combinations coming in OpenMP 6.0, the `firstprivate` clause may not be possible to apply, e.g. in "masked simd". An additional benefit from this change is that it allows treating leaf constructs as combined constructs with a single constituent. Otherwise, a `linear` clause on a lone `simd` construct could imply a `firstprivate` clause that can't be applied.	2024-07-25 09:20:18 -05:00
harishch4	b4ab52c8e7	[Flang][OpenMP] Lowering Order clause to MLIR (#96730 )	2024-06-27 11:58:12 +05:30
Akash Banerjee	6b1c51bc05	[OpenMP] Migrate GPU Reductions CodeGen from Clang to OMPIRBuilder (#80343 ) This patch migrates the CGOpenMPRuntimeGPU::emitReduction and related functions to the OpenMPIRBuilder. In future patches MLIR OpenMP translation would be making use of these functions. Co-authored-by: Jan Leyonberg <jan.leyonberg@amd.com>	2024-06-26 20:18:38 +01:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Krzysztof Parzyszek	eb88e7c1d9	[Frontend][OpenMP] Remove `reduction` from allowed clauses for `target` (#90754 ) The "reduction" clause is not allowed on the "target" construct.	2024-05-30 07:56:58 -05:00
Tom Eccles	74a87548e5	[flang][MLIR][OpenMP] make reduction by-ref toggled per variable (#92244 ) Fixes #88935 Toggling reduction by-ref broke when multiple reduction clauses were used. Decisions made for the by-ref status for later clauses could then invalidate decisions for earlier clauses. For example, ``` reduction(+:scalar,scalar2) reduction(+:array) ``` The first clause would choose by value reduction and generate by-value reduction regions, but then after this the second clause would force by-ref to support the array argument. But by the time the second clause is processed, the first clause has already had the wrong kind of reduction regions generated. This is solved by toggling whether a variable should be reduced by reference per variable. In the above example, this allows only `array` to be reduced by ref.	2024-05-16 15:27:59 +01:00
Krzysztof Parzyszek	4ec4a8e7fe	[Frontend][OpenMP] Privatizing clauses in construct decomposition (#92176 ) Add remaining clauses with the "privatizing" property to construct decomposition, specifically to the part handling the `allocate` clause. --------- Co-authored-by: Tom Eccles <t@freedommail.info>	2024-05-15 09:04:07 -05:00
Krzysztof Parzyszek	be7c9e3957	[flang][OpenMP] Decompose compound constructs, do recursive lowering (#90098 ) A compound construct with a list of clauses is broken up into individual leaf/composite constructs. Each such construct has the list of clauses that apply to it based on the OpenMP spec. Each lowering function (i.e. a function that generates MLIR ops) is now responsible for generating its body as described below. Functions that receive AST nodes extract the construct, and the clauses from the node. They then create a work queue consisting of individual constructs, and invoke a common dispatch function to process (lower) the queue. The dispatch function examines the current position in the queue, and invokes the appropriate lowering function. Each lowering function receives the queue as well, and once it needs to generate its body, it either invokes the dispatch function on the rest of the queue (if any), or processes nested evaluations if the work queue is at the end. Re-application of ca1bd5995f6ed934f9187305190a5abfac049173 with fixes for compilation errors.	2024-05-13 10:32:16 -05:00
Krzysztof Parzyszek	25a3ba3315	Revert "[flang][OpenMP] Decompose compound constructs, do recursive lowering (#90098 )" It breaks some builds, e.g. https://lab.llvm.org/buildbot/#/builders/268/builds/13909 This reverts commit ca1bd5995f6ed934f9187305190a5abfac049173.	2024-05-13 08:43:45 -05:00
Krzysztof Parzyszek	ca1bd5995f	[flang][OpenMP] Decompose compound constructs, do recursive lowering (#90098 ) A compound construct with a list of clauses is broken up into individual leaf/composite constructs. Each such construct has the list of clauses that apply to it based on the OpenMP spec. Each lowering function (i.e. a function that generates MLIR ops) is now responsible for generating its body as described below. Functions that receive AST nodes extract the construct, and the clauses from the node. They then create a work queue consisting of individual constructs, and invoke a common dispatch function to process (lower) the queue. The dispatch function examines the current position in the queue, and invokes the appropriate lowering function. Each lowering function receives the queue as well, and once it needs to generate its body, it either invokes the dispatch function on the rest of the queue (if any), or processes nested evaluations if the work queue is at the end.	2024-05-13 08:09:24 -05:00
Krzysztof Parzyszek	4631e7bad6	[Frontend][OpenMP] Add unit tests for getLeafConstructsOrSelf, NFC (#90110 )	2024-04-30 11:46:30 -05:00
Krzysztof Parzyszek	d577518d98	[Frontend][OpenMP] Implement getLeafOrCompositeConstructs (#89104 ) This function will break up a construct into constituent leaf and composite constructs, e.g. if OMPD_c_d_e and OMPD_d_e are composite constructs, then OMPD_a_b_c_d_e will be broken up into the list {OMPD_a, OMPD_b, OMPD_c_d_e}.	2024-04-24 08:03:36 -05:00
Krzysztof Parzyszek	70d3ddb280	[Frontend][OpenMP] Add functions for checking construct type (#87258 ) Implement helper functions to identify leaf, composite, and combined constructs.	2024-04-23 08:10:40 -05:00
Krzysztof Parzyszek	40137ff0d8	[Frontend][OpenMP] Refactor getLeafConstructs, add getCompoundConstruct (#87247 ) Emit a special leaf construct table in DirectiveEmitter.cpp, which will allow both decomposition of a construct into leafs, and composition of constituent constructs into a single compound construct (if possible). The function `getLeafConstructs` is no longer auto-generated, but implemented in OMP.cpp. The table contains a row for each directive, and each row has the following format `dir_id, num_leafs, leaf1, leaf2, ..., leafN, -1, ...` The rows are sorted lexicographically with respect to the leaf constructs. This allows a binary search for the row corresponding to the given list of leafs. There is an auxiliary table that for each directive contains the index of the row corresponding to that directive. Looking up leaf constructs for a directive `dir_id` is of constant time, and consists of two lookups: `LeafTable[Auxiliary[dir_id]]`. Finding a compound directive given the set of leafs is of time O(logn), and is roughly represented by `row = binary_search(LeafTable); return row[0]`. The functions `getLeafConstructs` and `getCompoundConstruct` use these lookup methods internally.	2024-04-22 14:41:11 -05:00
Sergio Afonso	3eb0ba34b0	[MLIR][Flang][OpenMP] Make omp.simdloop into a loop wrapper (#87365 ) This patch updates the definition of `omp.simdloop` to enforce the restrictions of a wrapper operation. It has been renamed to `omp.simd`, to better reflect the naming used in the spec. All uses of "simdloop" in function names have been updated accordingly. Some changes to Flang lowering and OpenMP to LLVM IR translation are introduced to prevent the introduction of compilation/test failures. The eventual long term solution might be different.	2024-04-17 11:28:30 +01:00
Joseph Huber	470aefb240	[Offload][NFC] Remove `omp_` prefix from offloading entries (#88071 ) Summary: These entries are generic for offloading with the new driver now. Having the `omp` prefix was a historical artifact and is confusing when used for CUDA. This patch just renames them for now, future patches will rework the binary format to make it more common.	2024-04-09 15:50:15 -05:00
Akash Banerjee	e9da5f0083	[OpenMP] Fix target data region codegen being omitted for device pass (#85218 ) This patch enables the BodyCodeGen callback to still trigger for the TargetData nested region during the device pass. There may be Target code nested within the TargetData region for which this is required. Also add tests for the same.	2024-03-19 13:04:23 +00:00
Leandro Lupori	64422cf826	[llvm][mlir][OMPIRBuilder] Translate omp.single's copyprivate (#80488 ) Use the new copyprivate list from omp.single to emit calls to __kmpc_copyprivate, during the creation of the single operation in OMPIRBuilder. This is patch 4 of 4, to add support for COPYPRIVATE in Flang. Original PR: https://github.com/llvm/llvm-project/pull/73128	2024-02-28 13:33:42 -03:00
agozillon	dcf4ca558c	[OpenMP][MLIR][OMPIRBuilder] Add a small optional constant alloca raise function pass to finalize, utilised in convertTarget (#78818 ) This patch seeks to add a mechanism to raise constant (not ConstantExpr or runtime/dynamic) sized allocations into the entry block for select functions that have been inserted into a list for processing. This processing occurs during the finalize call, after OutlinedInfo regions have completed. This currently has only been utilised for createOutlinedFunction, which is triggered for TargetOp generation in the OpenMP MLIR dialect lowering to LLVM-IR. This currently is required for Target kernels generated by createOutlinedFunction to avoid subsequent optimization passes doing some unintentional malformed optimizations for AMD kernels (unsure if it occurs for other vendors). If the allocas are generated inside of the kernel and are not in the entry block and are subsequently passed to a function this can lead to required instructions being erased or manipulated in a way that causes the kernel to run into a HSA access error. This fix is related to a series of problems found in: https://github.com/llvm/llvm-project/issues/74603 This problem primarily presents itself for Flang's HLFIR AssignOp currently, when utilised with a scalar temporary constant on the RHS and a descriptor type on the LHS. It will generate a call to a runtime function, wrap the RHS temporary in a newly allocated descriptor (an llvm struct), and pass both the LHS and RHS descriptor into the runtime function call. This will currently be embedded into the middle of the target region in the user entry block, which means the allocas are also embedded in the middle, which seems to pose issues when later passes are executed. This issue may present itself in other HLFIR operations or unrelated operations that generate allocas as a by product, but for the moment, this one test case is the only scenario I've found this problem. 
Perhaps this is not the appropriate fix, I am very open to other suggestions, I've tried a few others (at varying levels of the flang/mlir compiler flow), but this one is the smallest and least intrusive change set. The other two, that come to mind (but I've not fully looked into, the former I tried a little with blocks but it had a few issues I'd need to think through): - Having a proper alloca only block (or region) generated for TargetOps that we could merge into the entry block that's generated by convertTarget's createOutlinedFunction. - Or diverging a little from Clang's current target generation and using the CodeExtractor to generate the user code as an outlined function region invoked from the kernel we make, with our kernel arguments passed into it. Similar to the current parallel generation. I am not sure how well this would intermingle with the existing parallel generation though that's layered in. Both of these methods seem like quite a divergence from the current status quo, which I am not entirely sure is merited for the small test this change aims to fix.	2024-02-23 22:59:41 +01:00
Joseph Huber	cc374d8056	[OpenMP] Remove `register_requires` global constructor (#80460 ) Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used. This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entries that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything. This function removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a defined init and shutdown order cascades into a lot of other issues so I have a strong incentive to be rid of it. It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well.	2024-02-21 11:33:32 -06:00
Kazu Hirata	5c9d82de6b	[llvm] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 22:46:02 -08:00
Dominik Adamski	bb4484d41e	[OpenMPIRBuilder] Add support for target workshare loops (#73360 ) The workshare loop for target region uses the new OpenMP device runtime. The code generation scheme for the new device runtime is presented below: Input code: ``` workshare-loop { loop-body } ``` Output code: helper function which represents loop body: ``` function-loop-body(counter, loop-body-args) { loop-body } ``` workshare-loop is replaced by the proper device runtime call: ``` call __kmpc_new_worksharing_rtl(function-loop-body, loop-body-args, loop-tripcount, ...) ``` This PR uses the new device runtime functions which were added in PR: https://github.com/llvm/llvm-project/pull/73225	2023-12-06 09:47:09 +01:00
Fangrui Song	dd3184c30f	[unittest,examples] Replace uses of IRBuilder::getInt8PtrTy with getPtrTy. NFC	2023-11-27 08:29:13 -08:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029 ) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
Dominik Adamski	2cce0f6c57	[OpenMP][OMPIRBuilder] Add support to omp target parallel (#67000 ) Added support for LLVM IR code generation which is used for handling omp target parallel code. The call for __kmpc_parallel_51 is generated and the parallel region is outlined to separate function. The proper setup of kmpc_target_init mode is not included in the commit. It is assumed that the SPMD mode for target initialization is properly set by other codegen functions.	2023-11-06 11:44:00 +01:00
Johannes Doerfert	b8cbc5c02c	[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401 ) The KernelEnvironment is for compile time information about a kernel. It allows the compiler to feed information to the runtime. The KernelLaunchEnvironment is for dynamic information per kernel launch. It allows the runtime to feed information to the kernel that is not shared with other invocations of the kernel. The first use case is to replace the globals that synchronize teams reductions with per-launch versions. This allows concurrent teams reductions. More use cases will follow, e.g., per launch memory pools. Fixes: https://github.com/llvm/llvm-project/issues/70249	2023-10-31 19:38:43 -07:00
Shraiysh	9922aadf9e	[OpenMPIRBuilder] Added `if` clause for `teams` (#69139 ) This patch adds support for the `if` clause on `teams` construct. The value of the argument must be an integer value. If the value evaluates to true (non-zero) integer, then the number of threads is determined by `num_threads` clause (or default and ICV if `num_threads` is absent). When the condition evaluates to false (zero), then the bounds are set to 1. ([OpenMP 5.2 Section 10.2](https://www.openmp.org/spec-html/5.2/openmpse58.html)) This essentially means that ``` upperbound = ifexpr ? upperbound : 1 lowerbound = ifexpr ? lowerbound : 1 ```	2023-10-17 15:00:39 -05:00
Shraiysh	e41eaf4896	[OpenMPIRBuilder] Add ThreadLimit and NumTeams clauses to teams construct (#68364 ) This patch adds support for `thread_limit` and bounds on `num_teams` clause for the teams construct in OpenMP. Added testcases for the same.	2023-10-11 10:36:03 -05:00
Kazu Hirata	303e020126	[FrontEnd] Fix a warning This patch fixes: third-party/unittest/googletest/include/gtest/gtest.h:1379:11: error: comparison of integers of different signs: 'const unsigned int' and 'const int' [-Werror,-Wsign-compare]	2023-10-09 10:25:43 -07:00
Shraiysh	9050b27bd5	[OpenMPIRBuilder] Remove wrapper function in `createTask`, `createTeams` (#67723 ) This patch removes the wrapper function in `OpenMPIRBuilder::createTask` and `OpenMPIRBuilder.createTeams`. The outlined function is directly of the form that is expected by the runtime library calls. This patch also adds a utility function to help add fake values and their uses, which will be deleted in finalization callbacks. Why we needed wrappers earlier? Before the post outline callbacks are executed, the IR has the following structure: ``` define @func() { ;... call void @outlined_fn(ptr %data) ;... } define void @outlined_fn(ptr %data) ``` OpenMP offloading expects a specific signature for the outlined function in a runtime call. For example, `__kmpc_fork_teams` expects the following signature: ``` define @outlined_fn(ptr %global.tid, ptr %data) ``` As there is no way to change a function's arguments after it has been created, a wrapper function with the expected signature is created that calls the outlined function inside it. How we are handling it now? To handle this in the current patch, we create a "fake" global tid and add a "fake" use for it in the to-be-outlined region. We need to create these fake values so the outliner sees it as something it needs to pass to the outlined function. We also tell the outliner to exclude this global tid value from the aggregate `data` argument, so it comes as a separate argument in the beginning. This way, we are able to directly get the outlined function in the expected format. This is inspired by the way `createParallel` handles outlining (using fake values and then deleting them later). Tasks are handled with a similar approach. This simplifies the generated code and the code to do this itself also becomes simpler (because we no longer have to construct a new function).	2023-10-09 09:20:31 -04:00
agozillon	2a1f1b5fde	[OpenMP][OpenMPIRBuilder] Move copyInput to a passed in lambda function and re-order kernel argument load/stores (#68124 ) This patch moves the existing copyInput function into a lambda argument that can be defined by a caller to the function. This allows more flexibility in how the function is defined, allowing Clang and MLIR to utilise their own respective functions and types inside of the lambda without affecting the OMPIRBuilder itself. The idea is to eventually replace/build on the existing copyInput function that's used and moved into OpenMPToLLVMIRTranslation.cpp to a slightly more complex implementation that uses MLIR's map information (primarily ByRef and ByCapture information at the moment). The patch also moves kernel load stores to the top of the kernel, prior to the first openmp runtime invocation. Just makes the IR a little closer to Clang.	2023-10-06 16:47:27 +02:00
JOE1994	81ee059073	[llvm] Replace uses of Type::getPointerTo (NFC) opaque pointer clean-up effort (NFC)	2023-10-05 10:08:38 -04:00
Shraiysh	8d17875acb	[OMPIRBuilder] Added `createTeams` (#66807 ) This patch adds basic support for `omp teams` to the OpenMPIRBuilder. The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls. This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken.	2023-09-24 16:23:43 -05:00
Prabhdeep Singh Soni	9b57b167bb	[OMPIRBuilder] Fix shared clause for task construct This patch fixes the shared clause for the task construct with multiple shared variables. The shareds field in the kmp_task_t is not an inline array in the struct, rather it is a pointer to an array. With an inline array, the pointer dereference to the outlined function body of the task would segmentation fault when accessed by the runtime. Reviewed By: kiranchandramohan, jdoerfert Differential Revision: https://reviews.llvm.org/D158462	2023-09-15 12:19:47 -04:00
Sergio Afonso	094a63a20b	[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive This patch updates the `OpenMPIRBuilderConfig` structure to hold all available 'requires' clauses, and it replicates part of the code generation for the 'requires' registration function from clang in the `OMPIRBuilder`, to be used with flang. Porting the rest of features of the clang implementation to the IRBuilder and sharing it between clang and flang remains for a future patch, due to the complexity of the logic selecting the attributes of the generated registration function. Differential Revision: https://reviews.llvm.org/D147217	2023-09-14 10:33:54 +01:00
Jan Sjodin	b7fcf51515	[OpenMP][OpenMPIRBuilder] Add kernel launch codegen to emitTargetCall This patch adds code emission in emitTargetCall to call the OpenMP runtime to launch a kernel, and to call the fallback host implementation if the launch fails. Reviewed By: TIFitis, kiranchandramohan, jdoerfert Differential Revision: https://reviews.llvm.org/D155633	2023-08-15 10:03:06 -04:00
Hao Jin	c384e79675	[OpenMP][IR] Set correct alignment for internal variables OpenMP runtime functions assume the pointers are aligned to sizeof(pointer), but it is being aligned incorrectly. Fix with the proper alignment in the IR builder. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D157040	2023-08-11 15:20:15 -04:00
Akash Banerjee	1e92e25cb4	[MLIR][OpenMP] Added MLIR translation support for use_device clauses Added MLIR support for translating use_device_ptr and use_device_addr clauses for LLVMIR lowering. - use_device_ptr: The mapped variables marked with use_device_ptr are accessed through a copy of the base pointer mappers. The mapper is copied onto a new temporary pointer variable. - use_device_addr: The mapped variables marked with use_device_addr are accessed directly through the base pointer mappers. - If mapping information is not provided explicitly then default map_type of alloc/release is assumed and the map_size is set to 0. Depends on D152554 Reviewed By: kiranchandramohan, raghavendhra Differential Revision: https://reviews.llvm.org/D146648	2023-08-04 15:38:50 +01:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Shilei Tian	10068cd654	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-26 13:35:14 -04:00
Jan Sjodin	caa35a1ad9	[OpenMP][OpenMPIRBuilder] Make outlined function parameters i64 or ptr This patch ensures that all outlined functions parameters are i64 or ptr when compiling for a target device, which is what the OpenMP runtime expects. The values are then cast to the correct type inside the kernel. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D155628	2023-07-25 13:01:40 -04:00
Shilei Tian	6bd74fd65f	Revert commits for kernel environment This reverts commits for kernel environments as they cause issues in AMD BB.	2023-07-23 23:32:31 -04:00
Shilei Tian	c979e79813	[LLVM] Remove the module dump introduced mistakenly	2023-07-23 18:55:41 -04:00
Shilei Tian	c5c8040390	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-23 18:36:01 -04:00
Fangrui Song	14c55e6e2f	[unittest] Improve OpenMPIRBuilderTest after D149162 Make it less sensitive to omp_offload.info operands order and improve the failure diagnostic. Caught by D155789	2023-07-20 14:37:54 -07:00
Sergio Afonso	40340cf91a	[MLIR][OpenMP][OMPIRBuilder] Use target triple to initialize `IsGPU` flag This patch modifies the construction of the `OpenMPIRBuilder` in MLIR to initialize the `IsGPU` flag using target triple information passed down from the Flang frontend. If not present, it will default to `false`. This replicates the behavior currently implemented in Clang, where the `CodeGenModule::createOpenMPRuntime()` method creates a different `CGOpenMPRuntime` instance depending on the target triple, which in turn has an effect on the `IsGPU` flag of the `OpenMPIRBuilderConfig` object. Differential Revision: https://reviews.llvm.org/D151903	2023-07-20 15:07:50 +01:00
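
The lookup scheme described in commit 40137ff0d8 above (a leaf table sorted by leaf list, plus an auxiliary index per directive) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in OMP.cpp: the `Dir` enum, `Row`, `LeafTables`, `buildTables`, `makeExampleTables`, `lookupLeafs`, and `lookupCompound` are hypothetical names, and each row stores a `std::vector` instead of the flat `-1`-padded layout the commit message describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical directive ids, dense from 0 so they can index Auxiliary.
enum Dir : int { Parallel, For, Simd, ParallelFor, ForSimd, ParallelForSimd };

struct Row {
  int DirId;              // corresponds to row[0] in the commit's description
  std::vector<int> Leafs; // leaf constructs that make up DirId
};

struct LeafTables {
  std::vector<Row> LeafTable; // sorted lexicographically by Leafs
  std::vector<int> Auxiliary; // Auxiliary[dir_id] == row index in LeafTable
};

// Sort the rows by leaf list and record where each directive ended up.
LeafTables buildTables(std::vector<Row> Rows) {
  std::sort(Rows.begin(), Rows.end(),
            [](const Row &A, const Row &B) { return A.Leafs < B.Leafs; });
  LeafTables T;
  T.Auxiliary.assign(Rows.size(), -1);
  for (std::size_t I = 0; I < Rows.size(); ++I) {
    assert(Rows[I].DirId >= 0 &&
           static_cast<std::size_t>(Rows[I].DirId) < Rows.size());
    T.Auxiliary[Rows[I].DirId] = static_cast<int>(I);
  }
  T.LeafTable = std::move(Rows);
  return T;
}

// Constant time: two indexed lookups, LeafTable[Auxiliary[dir_id]].
const std::vector<int> &lookupLeafs(const LeafTables &T, int DirId) {
  return T.LeafTable[T.Auxiliary[DirId]].Leafs;
}

// O(log n): binary search for the row whose leaf list equals Leafs.
// Returns -1 when no compound directive has exactly these leafs.
int lookupCompound(const LeafTables &T, const std::vector<int> &Leafs) {
  auto It = std::lower_bound(
      T.LeafTable.begin(), T.LeafTable.end(), Leafs,
      [](const Row &R, const std::vector<int> &Key) { return R.Leafs < Key; });
  if (It != T.LeafTable.end() && It->Leafs == Leafs)
    return It->DirId;
  return -1;
}

// A small made-up directive set for experimenting with the two lookups.
LeafTables makeExampleTables() {
  return buildTables({{Parallel, {Parallel}},
                      {For, {For}},
                      {Simd, {Simd}},
                      {ParallelFor, {Parallel, For}},
                      {ForSimd, {For, Simd}},
                      {ParallelForSimd, {Parallel, For, Simd}}});
}
```

With these tables, `lookupLeafs(T, ParallelForSimd)` yields the three leaf ids in order, while `lookupCompound` only succeeds when the leaf list matches a row exactly, so a reordered list such as `{Simd, For}` is reported as having no compound directive.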


189 Commits
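
The decomposition example in commit d577518d98 (OMPD_a_b_c_d_e becoming {OMPD_a, OMPD_b, OMPD_c_d_e} when OMPD_c_d_e and OMPD_d_e are composite) can be reproduced with a small sketch. The commit message does not spell out the algorithm, so the greedy left-to-right, longest-match strategy below is an assumption that merely agrees with that one example; `Construct` and `splitIntoLeafOrComposite` are hypothetical names, and constructs are modelled as lists of leaf-name strings rather than real `OMPD_*` enumerators.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// A construct is modelled as the ordered list of its leaf names.
using Construct = std::vector<std::string>;

// Assumed strategy: walk the leafs left to right and, at each position,
// take the longest run that is a known composite; otherwise emit one leaf.
std::vector<Construct>
splitIntoLeafOrComposite(const Construct &Leafs,
                         const std::set<Construct> &Composites) {
  std::vector<Construct> Out;
  std::size_t I = 0;
  while (I < Leafs.size()) {
    std::size_t Len = 1; // fall back to a single leaf
    for (std::size_t N = Leafs.size() - I; N > 1; --N) {
      Construct Candidate(Leafs.begin() + I, Leafs.begin() + I + N);
      if (Composites.count(Candidate)) {
        Len = N;
        break;
      }
    }
    Out.emplace_back(Leafs.begin() + I, Leafs.begin() + I + Len);
    I += Len;
  }
  return Out;
}
```

Called with leafs `{a, b, c, d, e}` and composites `{c, d, e}` and `{d, e}`, the sketch emits `a` and `b` as single leafs and then consumes `c_d_e` as one composite, so `d_e` is never considered on its own.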