# What
This PR renames the newly-introduced LLVM attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking`. There are no other functional changes.
# Why?
- There are a number of problems that can cause a function to be
real-time "unsafe".
- We wish to communicate which problems rtsan detects and *why* they
are unsafe.
- A generic "unsafe" attribute is, in our opinion, too broad a net; it
may lead to future implementations needing extra contextual
information passed through them in order to communicate meaningful
reasons to users.
- We want to avoid this situation and keep the runtime library boundary
API/ABI as simple as possible.
- We believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.

We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer; see
the sketch below.
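For illustration, a minimal sketch of the renamed attribute as it
appears in IR after this change (function name hypothetical):

```llvm
; A function the programmer marks [[clang::blocking]] is lowered to an
; LLVM function carrying the renamed attribute:
define void @acquire_lock() #0 {
  ret void
}
attributes #0 = { sanitize_realtime_blocking }
```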
# Concerns
- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this to not
be a breaking change?
The two folding operations are causing a cycle for the following case
with scalable vector types:

```llvm
define <vscale x 2 x double> @test_fneg_select_abs(<vscale x 2 x i1> %cond, <vscale x 2 x double> %b) {
  %1 = select <vscale x 2 x i1> %cond, <vscale x 2 x double> zeroinitializer, <vscale x 2 x double> %b
  %2 = fneg fast <vscale x 2 x double> %1
  ret <vscale x 2 x double> %2
}
```
1) fold fneg: -(Cond ? C : Y) -> Cond ? -C : -Y
2) fold select: (Cond ? -X : -Y) -> -(Cond ? X : Y)
1) results in the following, since `<vscale x 2 x double>
zeroinitializer` passes the check for the immediate constant:

```llvm
%.neg = fneg fast <vscale x 2 x double> zeroinitializer
%b.neg = fneg fast <vscale x 2 x double> %b
%1 = select fast <vscale x 2 x i1> %cond, <vscale x 2 x double> %.neg, <vscale x 2 x double> %b.neg
```
and so we end up going back and forth between 1) and 2).
Attempt to fold scalable vector constants, so that we end up with a
splat instead:

```llvm
define <vscale x 2 x double> @test_fneg_select_abs(<vscale x 2 x i1> %cond, <vscale x 2 x double> %b) {
  %b.neg = fneg fast <vscale x 2 x double> %b
  %1 = select fast <vscale x 2 x i1> %cond, <vscale x 2 x double> shufflevector (<vscale x 2 x double> insertelement (<vscale x 2 x double> poison, double -0.000000e+00, i64 0), <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x double> %b.neg
  ret <vscale x 2 x double> %1
}
```
Instead of storing this as a separate array of non-integral pointers,
add it to the PointerSpec class. This will allow for future
simplifications such as splitting the non-integral property into
multiple distinct ones: relocatable (i.e. non-stable representation) and
non-integral representation (i.e. pointers with metadata).
Reviewed By: arsenm
Pull Request: https://github.com/llvm/llvm-project/pull/105734
Type::isScalableTy and StructType::containsScalableVectorType failed to
detect some cases of structs containing scalable vectors because
containsScalableVectorType did not call back into isScalableTy to check
the element types. Fix this, which requires sharing the same Visited set
in both functions. Also change the external API so that callers are
never required to pass in a Visited set, and normalize the naming to
isScalableTy.
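For illustration, both queries should report true for shapes like the
following (types hypothetical; not necessarily the exact case the
patch fixes):

```llvm
; Directly contains a scalable vector:
%inner = type { <vscale x 4 x i32> }
; Contains one only through a nested struct, which is where recursing
; through the element types matters:
%outer = type { i64, %inner }
```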
This is part of the effort to support enabling plugins on Windows by
adding better support for building LLVM as a DLL. The export macros used
here were added in #96630.

Since shared library symbols aren't deduplicated across multiple
libraries on Windows as they are on Linux, we have to manually and
explicitly import and export `Any::TypeId` template instantiations for
the uses of `llvm::Any` in the LLVM codebase to support LLVM Windows
shared library builds.

This change ensures that external code, including LLVM's own tests, can
use PassManager callbacks when LLVM is built as a DLL.

I also removed the only use of llvm::Any for LoopNest, which existed
only in debug code; there also doesn't seem to be any code creating
`Any<LoopNest>`.
With this change and appropriate linker changes
(https://r.android.com/3236256), AOSP boots with memtag-global
throughout the platform.
Without this change, we would sometimes generate PC-relative references
to tagged globals, which then do not have the proper tag.
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.
The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr associated with it, this
causes an error.
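A minimal sketch of the problematic shape (names and range values
hypothetical):

```llvm
declare ptr @__memset_chk(ptr, i32, i64, i64)

define void @sketch(ptr %p, i32 %v, i64 %n) {
  ; the i32 fill value carries a range attribute at the call site...
  %ret = call ptr @__memset_chk(ptr %p, i32 range(i32 0, 256) %v, i64 %n, i64 -1)
  ; ...and when this is simplified to the i8-based memset form, a
  ; range(i32 ...) attribute kept on the now-i8 value is malformed
  ret void
}
```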
Fixes #112633
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Based on example PR #96222 and fix PR #101268, with some differences due
to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).
- Add llvm.experimental.constrained.atan2 - Intrinsics.td,
ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in
BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp,
and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC
Part 4 of implementing the atan2 HLSL function (#70096).
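For illustration, a sketch of the new intrinsic in its usual
constrained form (f64 overload shown; operand order assumed to follow
libm's atan2(y, x)):

```llvm
declare double @llvm.experimental.constrained.atan2.f64(double, double, metadata, metadata)

define double @use(double %y, double %x) strictfp {
  ; rounding-mode and exception-behavior arguments as in the other
  ; constrained math intrinsics
  %r = call double @llvm.experimental.constrained.atan2.f64(double %y, double %x, metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
  ret double %r
}
```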
- **[Inliner] Add tests for propagating more parameter attributes; NFC**
- **[Inliner] Propagate more attributes to params when inlining**
Add support for propagating:
- `dereferenceable`
- `dereferenceable_or_null`
- `align`
- `nonnull`
- `range`
These are only propagated if the argument at the to-be-inlined callsite
matches the exact parameter used in the to-be-inlined function; a
sketch follows.
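As a sketch of the idea (function names hypothetical, and only one
plausible shape of the rule described above):

```llvm
; The callsite being inlined passes %p with extra attributes:
;   call void @callee(ptr nonnull align 16 dereferenceable(10) %p)
define void @callee(ptr %p) {
  ; @callee forwards the exact same parameter, so after inlining the
  ; attributes can be propagated onto this inner call's argument:
  call void @use(ptr %p)
  ret void
}
declare void @use(ptr)
```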
MSVC has a set of qualifiers to allow using 32-bit signed/unsigned
pointers when building 64-bit targets. This is useful for WoW code
(i.e., the part of Windows that handles running 32-bit applications on a
64-bit OS). Currently this is supported on x64 using the 270, 271 and
272 address spaces, but does not work for AArch64 at all.
This change adds the same 270, 271 and 272 address spaces to AArch64 and
adjusts the data layout string accordingly. Clang will generate the
correct address space casts, but these will currently be ignored until
the AArch64 backend is updated to handle them.
Partially fixes #62536

This is a resurrected version of <https://reviews.llvm.org/D158857>
(originally created by @a_vorobev). I've cleaned it up a little, fixed
the rest of the tests, and added auto-upgrade support for the data
layout.
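For illustration, a sketch of the casts Clang emits (the common x64
convention maps 270 to 32-bit sign-extended, 271 to 32-bit
zero-extended, and 272 to 64-bit pointers; treat that mapping as an
assumption here):

```llvm
; Widening a 32-bit zero-extended pointer to a default 64-bit pointer
; on AArch64:
define ptr @widen(ptr addrspace(271) %p32) {
  %p = addrspacecast ptr addrspace(271) %p32 to ptr
  ret ptr %p
}
```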
This patch enables support for cloning in indirect callsites.
This is done by synthesizing callsite records for each virtual call
target from the profile metadata. In the thin link all the synthesized
records for a particular indirect callsite initially share the same
context node, but support is added to partition the callsites and
outgoing edges based on the callee function, creating a separate node
for each target.
In the LTO backend, when cloning is needed we first perform indirect
call promotion, then change the target of the new direct call to the
desired clone.
Note this is ThinLTO-specific, since for regular LTO indirect call
promotion should have already occurred.
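As a sketch of that backend sequence (names hypothetical):

```llvm
define void @caller(ptr %fptr) {
entry:
  ; indirect call promotion guards on the profiled target...
  %is.foo = icmp eq ptr %fptr, @foo
  br i1 %is.foo, label %direct, label %indirect
direct:
  ; ...and the promoted direct call is then retargeted at the clone
  call void @foo.memprof.1()
  br label %merge
indirect:
  call void %fptr()
  br label %merge
merge:
  ret void
}

declare void @foo()
declare void @foo.memprof.1()
```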
Rename `Intrinsic::getDeclaration` to `Intrinsic::getOrInsertDeclaration`
to reflect its correct behavior and to be consistent with
`Module::getOrInsertFunction`. This is also in preparation for adding a
new `Intrinsic::getDeclaration` that will have behavior similar to
`Module::getFunction` (i.e., just lookup, no creation).
Note: The current implementation doesn't return the optimal result for
`fcmp one/une x, +/-inf` since we don't handle this case in
https://github.com/llvm/llvm-project/pull/110891. Maybe we can make it
optimal after seeing some real-world cases.
This patch adds support for `ConstantFPRange::makeSatisfyingFCmpRegion`.
We only check the optimality for cases where the result can be
represented by a ConstantFPRange.
This patch also adds some tests for `ConstantFPRange::fcmp` because it
depends on `makeSatisfyingFCmpRegion`. Unfortunately we cannot
exhaustively test this function due to time limits, so I just picked
some interesting ranges instead.
This is intended to solve a problem with lowering atomics in
OpenMP and C++ that is common to AMDGPU and NVPTX.

In OpenCL and CUDA, it is undefined behavior for an atomic instruction
to modify an object in thread-private memory. In OpenMP, it is defined.
Correspondingly, the hardware does not handle this correctly. For
AMDGPU, 32-bit atomics work and 64-bit atomics are silently dropped. We
therefore need to codegen this by inserting a runtime address space
check, performing the private case without atomics, and falling back to
issuing the real atomic otherwise.

Handle this by introducing metadata intended to be applied to atomicrmw,
indicating the operation cannot access the forbidden address space. This
metadata allows us to avoid the extra check and branch.
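A sketch of the shape this takes (rendered here with the
`!noalias.addrspace` spelling this work is associated with upstream;
treat that name and the AMDGPU private address space number, 5, as
assumptions of the sketch):

```llvm
define i64 @no_private_rmw(ptr %p, i64 %v) {
  ; the pointer is known not to be in the excluded range, so the
  ; expansion can skip the private-memory check and branch
  %old = atomicrmw add ptr %p, i64 %v seq_cst, !noalias.addrspace !0
  ret i64 %old
}

!0 = !{i32 5, i32 6} ; excluded address-space range [5, 6)
```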
Adds hidden kernel arguments to the function signature and marks them
inreg if they should be preloaded into user SGPRs. The normal kernarg
preloading logic then takes over with some additional checks for the
correct implicitarg_ptr alignment.
Special care is needed so that metadata for the hidden arguments is not
added twice when generating the code object.
Affected intrinsics:
llvm.aarch64.sve.fcvt.bf16f32
llvm.aarch64.sve.fcvtnt.bf16f32
The named intrinsics took a predicate based on the smallest element type
when it should be based on the largest. The intrinsics have been
replaced by v2 equivalents and affected code ported to use them; see the
sketch below.
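For illustration (signatures paraphrased from the description; the
predicate element count is the point, so treat the exact shapes as
assumptions):

```llvm
; Old: predicate based on the smaller bf16 element type
declare <vscale x 8 x bfloat> @llvm.aarch64.sve.fcvt.bf16f32(<vscale x 8 x bfloat>, <vscale x 8 x i1>, <vscale x 4 x float>)
; New v2: predicate based on the larger f32 element type
declare <vscale x 8 x bfloat> @llvm.aarch64.sve.fcvt.bf16f32.v2(<vscale x 8 x bfloat>, <vscale x 4 x i1>, <vscale x 4 x float>)
```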
Patch includes changes to getSVEPredicateBitCast() that ensure the
generated code for the auto-upgraded old intrinsics is unchanged.
It turns out that we cannot rely on the presence of `-i64:64` as a
position reference when adding the `-i128:128` datalayout string, due to
some custom datalayout strings lacking it (e.g. ones used by bugpoint,
among other things).
Do not add the `-i128:128` string in that case.
This fixes the regression introduced in
https://github.com/llvm/llvm-project/pull/106951.
The prior implementation would fail if the number of attribute sets on
the two calls wasn't the same, which is unnecessary as long as we aren't
throwing away any must-preserve attrs.

Closes #110896
This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.
The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
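For illustration, a sketch using the `llvm.sincos` shape mentioned
above:

```llvm
declare { double, double } @llvm.sincos.f64(double)

define { double, double } @both(double %x) {
  ; fast-math flags are now valid here: the call returns a literal
  ; struct of homogeneous floating-point type
  %res = call fast { double, double } @llvm.sincos.f64(double %x)
  ret { double, double } %res
}
```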
Some (many) attributes can safely be dropped to enable sinking. For
example, removing `nonnull` on a return/param can't affect correctness.

Closes #109472
Add support for taking the intersection of two AttributeLists s.t. the
result list contains attributes that are valid in the context of both
inputs. I.e. if we have `nonnull align(32) noundef` intersected with
`nonnull align(16) dereferenceable(10)`, the result is `nonnull
align(16)`.

Further, it handles attributes that are not droppable. For example,
dropping `byval` can change the nature of a callsite/function, so it's
impossible to compute a correct intersection if it's dropped from the
result. I.e. `nonnull byval(i64)` intersected with `nonnull` is
invalid.
The motivation for the infrastructure is to enable sinking/hoisting
callsites with differing attributes.
This replaces some of the most frequent offenders: uses of a DenseMap
that cause a malloc, where the typical element count is small enough to
fit in an initial stack allocation.
Most of these are fairly obvious, one to highlight is the collectOffset
method of GEP instructions: if there's a GEP, of course it's going to have
at least one offset, but every time we've called collectOffset we end up
calling malloc as well for the DenseMap in the MapVector.
Remove the following intrinsics, which can be trivially replaced with an
`addrspacecast`:
* llvm.nvvm.ptr.gen.to.global
* llvm.nvvm.ptr.gen.to.shared
* llvm.nvvm.ptr.gen.to.constant
* llvm.nvvm.ptr.gen.to.local
* llvm.nvvm.ptr.global.to.gen
* llvm.nvvm.ptr.shared.to.gen
* llvm.nvvm.ptr.constant.to.gen
* llvm.nvvm.ptr.local.to.gen
Also, clean up the NVPTX lowering of `addrspacecast`, making it more
concise.
This was reverted to avoid conflicts while reverting #107655. Re-landing
unchanged.
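For illustration, each removed intrinsic corresponds to a plain cast
(sketch; overload suffixes elided):

```llvm
; Before: %g = call ptr addrspace(1) @llvm.nvvm.ptr.gen.to.global(ptr %p)
; After, the equivalent addrspacecast (1 = NVPTX global):
define ptr addrspace(1) @to_global(ptr %p) {
  %g = addrspacecast ptr %p to ptr addrspace(1)
  ret ptr addrspace(1) %g
}
```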
This change deprecates the following intrinsics, which can be trivially
converted to LLVM funnel-shift intrinsics:
- @llvm.nvvm.rotate.b32
- @llvm.nvvm.rotate.right.b64
- @llvm.nvvm.rotate.b64
This fixes a bug in the previous version (#107655) which flipped the
order of the operands to the PTX funnel shift instruction. In LLVM IR
the high bits are the first arg and the low bits are the second arg,
while in PTX this is reversed.
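As a sketch of the conversion for the 32-bit rotate (assuming it
upgrades to a left funnel shift with both value operands equal to the
input):

```llvm
; Deprecated form:
;   %r = call i32 @llvm.nvvm.rotate.b32(i32 %x, i32 %n)
; Funnel-shift equivalent:
define i32 @rot32(i32 %x, i32 %n) {
  %r = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 %n)
  ret i32 %r
}
declare i32 @llvm.fshl.i32(i32, i32, i32)
```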
When searching for an intrinsic name in a target-specific slice of the
intrinsic name table, skip over the target prefix. Currently, the first
loop iteration in `lookupLLVMIntrinsicByName` does nothing for such
cases (i.e., `Low` and `High` stay unchanged and it does not shrink the
search window), so we can skip this useless first iteration by skipping
over the target prefix.