llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-02 21:46:06 +00:00

Author	SHA1	Message	Date
Akira Hatanaka	d9a685a9dd	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.	2024-03-27 12:24:49 -07:00
Akira Hatanaka	b311756450	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)" (#86674) This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6. It appears that the commit broke msan bots.	2024-03-26 07:37:57 -07:00
Akira Hatanaka	8bd1f9116a	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects.	2024-03-25 18:05:42 -07:00
Artem Belevich	631c6e834c	[CUDA] Add support for CUDA-12.3 and sm_90a (#74895)	2023-12-11 12:18:28 -08:00
Youngsuk Kim	d43c081aef	[clang][CGOpenMPRuntimeGPU] Merge consecutive AddrSpaceCasts (NFC) (#74279) Merge consecutive AddrSpaceCasts into a single AddrSpaceCast.	2023-12-04 07:03:09 -05:00
Johannes Doerfert	fae233c63f	[OpenMP] Avoid initializing the KernelLaunchEnvironment if possible (#73864) If we don't have a team reduction we don't need a kernel launch environment (for now). In that case we can avoid the cost.	2023-11-29 14:49:13 -08:00
Youngsuk Kim	bc6b632723	[CGOpenMPRuntimeGPU] Remove no-op ptr-to-ptr bitcasts (NFC) Opaque ptr cleanup effort	2023-11-25 11:28:18 -06:00
Jay Foad	cf1e0c0b07	[AMDGPU] Define new targets gfx1200 and gfx1201 (#73133) Define target names and ELF numbers for new GFX12 targets gfx1200 and gfx1201. For now they behave identically to GFX11.	2023-11-23 16:44:05 +00:00
Youngsuk Kim	b4db24e330	[CGOpenMPRuntimeGPU] Replace unneeded use of CreatePointerBitCastOrAddrSpaceCast (NFC) Opaque ptr cleanup effort (NFC)	2023-11-18 04:17:46 -06:00
Johannes Doerfert	7318fe6334	[OpenMP][FIX] Ensure device reduction geps work for multi-var reductions If we have more than one reduction variable we need to be consistent wrt. indexing. In 3de645efe30b83ba1b6d7e500486c4f441a17a61 we broke this as the buffer type was reduced to a singleton but the index computation was not adjusted to account for that offset. This fixes it by interleaving the reduction variables properly in an array-of-struct style. We can revert it back to struct-of-array in a follow-up if it turns out to be a problem. I doubt it since half the accesses should benefit from the locality this layout offers and only the other half were consecutive before.	2023-11-10 14:34:46 -08:00
Johannes Doerfert	3de645efe3	[OpenMP][NFC] Split the reduction buffer size into two components Before, we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the number into two parts, the size of the reduction data (=all reduction variables) and the (maximal) length of the buffer. This will allow us to allocate less if we need less, e.g., if we have fewer teams than the maximal length. It also allows us to move code from clang's codegen into the runtime as we now know how large the reduction data is.	2023-11-06 11:50:41 -08:00
Johannes Doerfert	921bd29913	[OpenMP] Remove alignment for global <-> local reduction functions The alignment likely did not help much but increased the memory requirement. Note that half of the affected accesses are all performed by a single thread in each block. The reads are by consecutive threads in a single block.	2023-11-06 11:50:41 -08:00
Johannes Doerfert	abe71b77f9	[OpenMP][NFC] Delete dead code	2023-11-06 11:50:41 -08:00
Vlad Serebrennikov	dda8e3de35	[clang][NFC] Refactor `ImplicitParamDecl::ImplicitParamKind` This patch converts `ImplicitParamDecl::ImplicitParamKind` into a scoped enum at namespace scope, making it eligible for forward declaring. This is useful for `preferred_type` annotations on bit-fields.	2023-11-06 12:01:09 +03:00
Vlad Serebrennikov	edd690b02e	[clang][NFC] Refactor `TagTypeKind` (#71160) This patch converts TagTypeKind into scoped enum. Among other benefits, this allows us to forward-declare it where necessary.	2023-11-03 21:45:39 +04:00
Johannes Doerfert	d3e7a48cbd	[OpenMP][NFC] Remove a no-op function	2023-11-03 10:28:36 -07:00
Johannes Doerfert	f9a89e6b9c	[OpenMP][FIX] Allocate per launch memory for GPU team reductions (#70752) We used to perform team reduction on global memory allocated in the runtime and by clang. This was racy as multiple instances of a kernel, or different kernels with team reductions, would use the same locations. Since we now have the kernel launch environment, we can allocate dynamic memory per-launch, allowing us to move all the state into a non-racy place. Fixes: https://github.com/llvm/llvm-project/issues/70249	2023-11-01 11:11:48 -07:00
Vlad Serebrennikov	49fd28d960	[clang][NFC] Refactor `ArrayType::ArraySizeModifier` This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.	2023-10-31 18:06:34 +03:00
Johannes Doerfert	31b91213bd	[OpenMP] Unify the min/max thread/teams pathways We used to pass the min/max threads/teams values through different paths from the frontend to the middle end. This simplifies the situation by passing the values once, only when we will create the KernelEnvironment, which contains the values. At that point we also manifest the metadata, as appropriate. Some footguns have also been removed, e.g., our target check is now triple-based, not calling convention-based, as the latter is dependent on the ordering of operations. The types of the values have been unified to int32_t.	2023-10-29 10:53:20 -07:00
Johannes Doerfert	ab34d71087	[OpenMP][NFC] Remove untested code emitting no-op call	2023-10-26 14:38:24 -07:00
Johannes Doerfert	289a0f255d	[OpenMP] Remove SPMD specific handling during globalization Globalization and SPMD are different things that used to be conflated. Some leftover crossover interactions remain, trying to remove them now.	2023-10-26 14:38:23 -07:00
Shilei Tian	d6254e1b2e	Introduce the initial support for OpenMP kernel language (#66844) This patch starts the support for OpenMP kernel language, basically to write OpenMP target regions in SIMT style, similar to kernel languages such as CUDA. What is included in this first patch is the `ompx_bare` clause for the `target teams` directive. When `ompx_bare` exists, globalization is disabled such that local variables will not be globalized. The runtime init/deinit function calls will not be emitted. That being said, almost all OpenMP executable directives are not supported in the region, such as parallel, task. This patch doesn't include the Sema checks for that, so the use of them is UB. Simple directives, such as atomic, can be used. We provide a set of APIs (for C, they are prefixed with `ompx_`; for C++, they are in the `ompx` namespace) to get thread id, block id, etc. Please refer to https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf for more details.	2023-10-05 17:38:06 -04:00
JP Lehr	1bff5f6d0b	Revert "[OpenMP] Introduce the initial support for OpenMP kernel language (#66844)" This reverts commit e997dca3333823ffe2ea3aea288299f551532dcd.	2023-09-29 15:35:10 -05:00
Shilei Tian	e997dca333	[OpenMP] Introduce the initial support for OpenMP kernel language (#66844) This patch starts the support for OpenMP kernel language, basically to write OpenMP target regions in SIMT style, similar to kernel languages such as CUDA. What is included in this first patch is the `ompx_bare` clause for the `target teams` directive. When `ompx_bare` exists, globalization is disabled such that local variables will not be globalized. The runtime init/deinit function calls will not be emitted. That being said, almost all OpenMP executable directives are not supported in the region, such as parallel, task. This patch doesn't include the Sema checks for that, so the use of them is UB. Simple directives, such as atomic, can be used. We provide a set of APIs (for C, they are prefixed with `ompx_`; for C++, they are in the `ompx` namespace) to get thread id, block id, etc. For more details, you can refer to https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf.	2023-09-29 13:11:09 -04:00
Sergio Afonso	094a63a20b	[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive This patch updates the `OpenMPIRBuilderConfig` structure to hold all available 'requires' clauses, and it replicates part of the code generation for the 'requires' registration function from clang in the `OMPIRBuilder`, to be used with flang. Porting the rest of features of the clang implementation to the IRBuilder and sharing it between clang and flang remains for a future patch, due to the complexity of the logic selecting the attributes of the generated registration function. Differential Revision: https://reviews.llvm.org/D147217	2023-09-14 10:33:54 +01:00
Aaron Jarmusch	131ba0ae01	Revert "[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483)" This reverts commit e831a32c93c1ab404785773cc7c08c01730d61e5.	2023-09-12 22:46:09 +00:00
Aaron Jarmusch	e3298bb275	fixup! [Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483)	2023-09-12 20:52:33 +00:00
Aaron Jarmusch	e831a32c93	[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483) Fix for an issue where clang was not adding the address space according to the data layout and was instead using the default, which sometimes resulted in a crash. The fix includes changes to the cases of LargeCapMemAlloc and CGroupMemAlloc where we are setting the AddrSpace according to the DataLayout.	2023-09-12 15:44:39 -04:00
Shilei Tian	10068cd654	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-26 13:35:14 -04:00
Shilei Tian	6bd74fd65f	Revert commits for kernel environment This reverts commits for kernel environments as they cause issues in AMD BB.	2023-07-23 23:32:31 -04:00
Shilei Tian	c5c8040390	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-23 18:36:01 -04:00
Jay Foad	92542f2a40	[AMDGPU] Add targets gfx1150 and gfx1151 This is the target definition only. Currently they are treated the same as GFX 11.0.x. Differential Revision: https://reviews.llvm.org/D155429	2023-07-17 13:06:12 +01:00
Sergio Afonso	63ca93c7d1	[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change. `IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs. Differential Revision: https://reviews.llvm.org/D154591	2023-07-10 14:14:16 +01:00
Doru Bercea	13888870e5	Enable dynamic-sized VLAs for data sharing in OpenMP offloaded target regions. Review: https://reviews.llvm.org/D153883	2023-07-06 10:57:10 -04:00
Dave Pagan	eb61bde829	[OpenMP][CodeGen] Add codegen for combined 'loop' directives. The loop directive is a descriptive construct which allows the compiler flexibility in how it generates code for the directive's associated loop(s). See OpenMP specification 5.2 [257:8-9]. Codegen added in this patch for the combined 'loop' directives are: 'target teams loop' -> 'target teams distribute parallel for'; 'teams loop' -> 'teams distribute parallel for'; 'target parallel loop' -> 'target parallel for'; 'parallel loop' -> 'parallel for'. NOTE: The implementation of the 'loop' directive itself is unchanged. Differential Revision: https://reviews.llvm.org/D145823	2023-07-05 12:31:59 -05:00
Youngsuk Kim	5f32baf17d	[clang] Replace uses of CreateElementBitCast (NFC) Partial progress towards replacing uses of CreateElementBitCast, as it no longer does what its name suggests. Reviewed By: barannikov88 Differential Revision: https://reviews.llvm.org/D154229	2023-06-30 17:35:36 -04:00
Sergei Barannikov	2348902268	[clang][CodeGen] Remove no-op EmitCastToVoidPtr (NFC) Reviewed By: JOE1994 Differential Revision: https://reviews.llvm.org/D153694	2023-06-29 20:29:38 +03:00
Manna, Soumi	213709e7be	[CLANG] Fix Static Code Analyzer Concerns with bad bit right shift operation in getNVPTXLaneID() In getNVPTXLaneID(CodeGenFunction &), the value of LaneIDBits might be 4294967295, since the function call llvm::Log2_32(CGF->getTarget()->getGridValue().GV_Warp_Size) might return 4294967295. `unsigned LaneIDBits = llvm::Log2_32(CGF.getTarget().getGridValue().GV_Warp_Size);` `unsigned LaneIDMask = ~0u >> (32u - LaneIDBits);` The shift amount (32U - LaneIDBits) might then be 33, so the right shift has undefined behavior because it shifts by more than 31 bits. This patch adds an assert to guard the LaneIDBits overflow issue with LaneIDMask value. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D151606	2023-06-22 13:29:28 -07:00
Nikita Popov	8a19af513d	[Clang] Remove uses of PointerType::getWithSamePointeeType (NFC) No longer relevant with opaque pointers.	2023-06-12 12:18:28 +02:00
Kazu Hirata	706c442e72	[CodeGen] Use DenseMapBase::lookup (NFC)	2023-06-11 13:19:26 -07:00
Konstantin Zhuravlyov	9d05727972	AMDGPU: Add basic gfx942 target Differential Revision: https://reviews.llvm.org/D149983	2023-05-10 11:51:06 -04:00
Konstantin Zhuravlyov	1fc70210a6	AMDGPU: Add basic gfx941 target Differential Revision: https://reviews.llvm.org/D149982	2023-05-10 11:51:06 -04:00
Manna, Soumi	07996804a0	[NFC][CLANG] Fix static code analyzer concerns Reported by Coverity: 1. Inside "ASTReader.cpp" file, in clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const *, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl *> &): Using the auto keyword without an & causes a copy. auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type pair. 2. Inside "ASTReader.cpp" file, in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, llvm::SmallVectorImpl<clang::ASTReader::ImportedSubmodule> *): Using the auto keyword without an & causes a copy. auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type DenseMapPair. 3. Inside "CGOpenMPRuntimeGPU.cpp" file, in clang::CodeGen::CGOpenMPRuntimeGPU::emitGenericVarsEpilog(clang::CodeGen::CodeGenFunction &, bool): Using the auto keyword without an & causes a copy. auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type pair. 4. Inside "ASTWriter.cpp" file, in clang::ASTWriter::WriteHeaderSearch(clang::HeaderSearch const &): Using the auto keyword without an & causes a copy. auto_causes_copy: Using the auto keyword without an & causes the copy of an object of type UnresolvedHeaderDirective. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D149461	2023-05-05 14:34:36 -07:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Doru Bercea	01910787d3	Fix failure with team-wide allocated variable Review: https://reviews.llvm.org/D147572	2023-04-20 14:40:35 -04:00
Itay Bookstein	782c59a4ee	[OpenMP] Prefix outlined and reduction func names with original func's name This patch prefixes omp outlined helpers and reduction funcs with the original function's name. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140722	2023-04-19 23:00:26 +03:00
Itay Bookstein	6fdd13e0ec	Revert "[OpenMP] Prefix outlined and reduction func names with original func's name" This reverts commit 029bfc311d4d7d3cd90be81bb08c046848796d02.	2023-04-19 19:08:49 +03:00
Itay Bookstein	029bfc311d	[OpenMP] Prefix outlined and reduction func names with original func's name This patch attempts to prefix omp outlined helpers and reduction funcs with the original function's name. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140722	2023-04-19 19:05:21 +03:00
Richard Sandiford	b6d4d51f8f	[clang] Specify attribute syntax & spelling with a single argument When constructing an attribute, the syntactic form was specified using two arguments: an attribute-independent syntax type and an attribute-specific spelling index. This patch replaces them with a single argument. In most cases, that's done using a new Form class that combines the syntax and spelling into a single object. This has the minor benefit of removing a couple of constructors. But the main purpose is to allow additional information to be stored as well, beyond just the syntax and spelling enums. In the case of the attribute-specific Create and CreateImplicit functions, the patch instead uses the attribute-specific spelling enum. This helps to ensure that the syntax and spelling are consistent with each other and with the Attr.td definition. If a Create or CreateImplicit caller specified a syntax and a spelling, the patch drops the syntax argument and keeps the spelling. If the caller instead specified only a syntax (so that the spelling was SpellingNotCalculated), the patch simply drops the syntax argument. There were two cases of the latter: TargetVersion and Weak. TargetVersionAttrs were created with GNU syntax, which matches their definition in Attr.td, but which is also the default. WeakAttrs were created with Pragma syntax, which does not match their definition in Attr.td. Dropping the argument switches them to AS_GNU too (to match [GCC<"weak">]). Differential Revision: https://reviews.llvm.org/D148102	2023-04-13 10:14:49 +01:00

150 Commits