6659 Commits

Author SHA1 Message Date
Kyungwoo Lee
b667d161f0
[StructuralHash] Refactor (#112621)
This is largely NFC, and it prepares for #112638.
 - Use stable_hash instead of uint64_t
- Rename update* to hash* functions. They compute stable_hash locally and return it.
 
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-10-26 09:20:26 -07:00
davidtrevelyan
4102625380
[rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155)
# What

This PR renames the newly-introduced llvm attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking` respectively. There are no other functional
changes.


# Why?

- There are a number of problems that can cause a function to be
real-time "unsafe",
- we wish to communicate what problems rtsan detects and *why* they're
unsafe, and
- a generic "unsafe" attribute is, in our opinion, too broad a net -
which may lead to future implementations that need extra contextual
information passed through them in order to communicate meaningful
reasons to users.
- We want to avoid this situation and make the runtime library boundary
API/ABI as simple as possible, and
- we believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.

We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer.

# Concerns

- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this to not
be a breaking change?
2024-10-26 13:06:11 +01:00
Gang Chen
4ac0e7e400
[AMDGPU] Add a type for the named barrier (#113614) 2024-10-25 11:24:47 -07:00
ssijaric-nv
14db069468
[InstCombine] Fix a cycle when folding fneg(select) with scalable vector types (#112465)
The two folding operations are causing a cycle for the following case
with
scalable vector types:

define <vscale x 2 x double> @test_fneg_select_abs(<vscale x 2 x i1>
%cond, <vscale x 2 x double> %b) {
%1 = select <vscale x 2 x i1> %cond, <vscale x 2 x double>
zeroinitializer, <vscale x 2 x double> %b
  %2 = fneg fast <vscale x 2 x double> %1
  ret <vscale x 2 x double> %2
}

1) fold fneg:  -(Cond ? C : Y) -> Cond ? -C : -Y

2) fold select: (Cond ? -X : -Y) -> -(Cond ? X : Y)

1) results in the following since '<vscale x 2 x double>
zeroinitializer' passes
the check for the immediate constant:

%.neg = fneg fast <vscale x 2 x double> zeroinitializer
%b.neg = fneg fast <vscale x 2 x double> %b
%1 = select fast <vscale x 2 x i1> %cond, <vscale x 2 x double> %.neg,
<vscale x 2 x double> %b.neg

and so we end up going back and forth between 1) and 2).

Attempt to fold scalable vector constants, so that we end up with a
splat instead:

define <vscale x 2 x double> @test_fneg_select_abs(<vscale x 2 x i1>
%cond, <vscale x 2 x double> %b) {
  %b.neg = fneg fast <vscale x 2 x double> %b
%1 = select fast <vscale x 2 x i1> %cond, <vscale x 2 x double>
shufflevector (<vscale x 2 x double> insertelement (<vscale x 2 x
double> poison, double -0.000000e+00, i64 0), <vscale x 2 x double>
poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x double>
%b.neg
  ret <vscale x 2 x double> %1
}
2024-10-25 10:47:39 -07:00
Alexander Richardson
305a1ceae3
[DataLayout] Refactor storage of non-integral address spaces
Instead of storing this as a separate array of non-integral pointers,
add it to the PointerSpec class instead. This will allow for future
simplifications such as splitting the non-integral property into
multiple distinct ones: relocatable (i.e. non-stable representation) and
non-integral representation (i.e. pointers with metadata).

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/105734
2024-10-25 10:02:40 -07:00
Jay Foad
90cdc03e7f
[IR] Fix undiagnosed cases of structs containing scalable vectors (#113455)
Type::isScalableTy and StructType::containsScalableVectorType failed to
detect some cases of structs containing scalable vectors because
containsScalableVectorType did not call back into isScalableTy to check
the element types. Fix this, which requires sharing the same Visited set
in both functions. Also change the external API so that callers are
never required to pass in a Visited set, and normalize the naming to
isScalableTy.
2024-10-25 12:56:10 +01:00
Thomas Fransham
b8fddca7bd
[llvm] Support llvm::Any across shared libraries on windows (#108051)
This is part of the effort to support for enabling plugins on windows by
adding better support for building llvm as a DLL. The export macros used
here were added in #96630

Since shared library symbols aren't deduplicated across multiple
libraries on windows like Linux we have to manually explicitly import
and export `Any::TypeId` template instantiations for the uses of
`llvm::Any` in the LLVM codebase to support LLVM Windows shared library
builds.
This change ensures that external code, including LLVM's own tests, can
use PassManager callbacks when LLVM is built as a DLL.

I also removed the only use of llvm::Any for LoopNest that only existed
in debug code and there also doesn't seem to be any code creating
`Any<LoopNest>`
2024-10-24 08:07:13 +03:00
Florian Mayer
23b18fa01e
[MTE] Do not allow local aliases to MTE globals (#106280)
With this change and appropriate linker changes
(https://r.android.com/3236256)
AOSP boots with memtag-global throughout the platform.

Without this change, we would sometimes generate PC-relative references
to tagged globals, which then do not have the proper tag.
2024-10-21 17:00:41 -07:00
goldsteinn
69a798a996
Reapply "[Inliner] Propagate more attributes to params when inlining (#91101)" (2nd Attempt) (#112749)
Root cause of the bug was code hanging onto `range` attr after
changing BitWidth. This was fixed in PR #112633.
2024-10-17 20:28:47 -05:00
goldsteinn
c85611e858
[SimplifyLibCall][Attribute] Fix bug where we may keep range attr with incompatible type (#112649)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.

The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr assosiated it will cause an
error.

Fixes #112633
2024-10-17 10:32:55 -05:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Nikita Popov
255a99c29f
[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309)
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.

The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.

Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
2024-10-17 08:48:08 +02:00
Arthur Eubanks
9e6d24f61f Revert "[Inliner] Propagate more attributes to params when inlining (#91101)"
This reverts commit ae778ae7ce72219270c30d5c8b3d88c9a4803f81.

Creates broken IR, see comments in #91101.
2024-10-16 21:21:34 +00:00
Tex Riddell
875afa939d
[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Based on example PR #96222 and fix PR #101268, with some differences due
to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).

- Add llvm.experimental.constrained.atan2 - Intrinsics.td,
ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in
BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp,
and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC

Part 4 for Implement the atan2 HLSL Function #70096.
2024-10-16 11:43:17 -07:00
goldsteinn
ae778ae7ce
[Inliner] Propagate more attributes to params when inlining (#91101)
- **[Inliner] Add tests for propagating more parameter attributes; NFC**
- **[Inliner] Propagate more attributes to params when inlining**

Add support for propagating:
        - `derefereancable`
        - `derefereancable_or_null`
        - `align`
        - `nonnull`
        - `range`
    
These are only propagated if the parameter to the to-be-inlined callsite
match the exact parameter used in the to-be-inlined function.
2024-10-16 11:53:21 -05:00
Jay Foad
d9c95efb6c
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546)
Convert almost every instance of:
  CreateCall(Intrinsic::getOrInsertDeclaration(...), ...)
to the equivalent CreateIntrinsic call.
2024-10-16 15:43:30 +01:00
Rahul Joshi
6924fc0326
[LLVM] Add Intrinsic::getDeclarationIfExists (#112428)
Add `Intrinsic::getDeclarationIfExists` to lookup an existing
declaration of an intrinsic in a `Module`.
2024-10-16 07:21:10 -07:00
Kazu Hirata
3a56b03ef3
[IR] Avoid repeated hash lookups (NFC) (#112469) 2024-10-16 06:40:10 -07:00
Daniel Paoliello
c9f27275c1
[clang][aarch64] Add support for the MSVC qualifiers __ptr32, __ptr64, __sptr, __uptr for AArch64 (#111879)
MSVC has a set of qualifiers to allow using 32-bit signed/unsigned
pointers when building 64-bit targets. This is useful for WoW code
(i.e., the part of Windows that handles running 32-bit application on a
64-bit OS). Currently this is supported on x64 using the 270, 271 and
272 address spaces, but does not work for AArch64 at all.

This change adds the same 270, 271 and 272 address spaces to AArch64 and
adjusts the data layout string accordingly. Clang will generate the
correct address space casts, but these will currently be ignored until
the AArch64 backend is updated to handle them.

Partially fixes #62536

This is a resurrected version of <https://reviews.llvm.org/D158857>
(originally created by @a_vorobev) - I've cleaned it up a little, fixed
the rest of the tests and added to auto-upgrade for the data layout.
2024-10-15 10:37:36 -07:00
Yingwei Zheng
9b7491e866
[IR] Add support for samesign in Operator::hasPoisonGeneratingFlags (#112358)
Fix https://github.com/llvm/llvm-project/issues/112356.
2024-10-15 23:07:16 +08:00
Yingwei Zheng
8d8bb4032b
[Verifier] Verify attribute denormal-fp-math[-f32] (#112310)
Some typos are also fixed. Address
https://github.com/llvm/llvm-project/pull/112067#pullrequestreview-2363722447.
2024-10-15 17:32:16 +08:00
elhewaty
9efb07f261
[IR] Add samesign flag to icmp instruction (#111419)
Inspired by
https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423
2024-10-15 17:11:25 +08:00
Teresa Johnson
1de71652fd
[MemProf] Support cloning for indirect calls with ThinLTO (#110625)
This patch enables support for cloning in indirect callsites.

This is done by synthesizing callsite records for each virtual call
target from the profile metadata. In the thin link all the synthesized
records for a particular indirect callsite initially share the same
context node, but support is added to partition the callsites and
outgoing edges based on the callee function, creating a separate node
for each target.

In the LTO backend, when cloning is needed we first perform indirect
call promotion, then change the target of the new direct call to the
desired clone.

Note this is ThinLTO-specific, since for regular LTO indirect call
promotion should have already occurred.
2024-10-11 13:53:35 -07:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Matt Arsenault
c198f775cd
AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642)
These have been replaced with atomicrmw
2024-10-09 09:27:28 +04:00
Paul Walker
87cdc8328d
[LLVM][ConstFolds] Verify a scalar src before attempting scalar->vector bitcast transformation. (#111149)
It was previously safe to assume isa<Constant{Int,FP}> meant a scalar
value. This is not true when use-constant-##-for-###-splat are enabled.
2024-10-08 13:28:44 +01:00
Yingwei Zheng
a3a253d3c7
[ConstantFPRange] Implement ConstantFPRange::makeExactFCmpRegion (#111490)
Note: The current implementation doesn't return optimal result for `fcmp
one/une x, +/-inf` since we don't handle this case in
https://github.com/llvm/llvm-project/pull/110891. Maybe we can make it
optimal after seeing some real-world cases.
2024-10-08 15:55:10 +08:00
Yingwei Zheng
4647a4666c
[ConstantFPRange] Implement ConstantFPRange::makeSatisfyingFCmpRegion (#110891)
This patch adds support for `ConstantFPRange::makeSatisfyingFCmpRegion`.
We only check the optimality for cases where the result can be
represented by a ConstantFPRange.

This patch also adds some tests for `ConstantFPRange::fcmp` because it
depends on `makeSatisfyingFCmpRegion`. Unfortunately we cannot
exhaustively test this function due to time limit. I just pick some
interesting ranges instead.
2024-10-08 13:41:24 +08:00
Matt Arsenault
9dca83f2e1
AMDGPU: Add noalias.addrspace metadata when autoupgrading atomic intrinsics (#102599)
This will be needed to continue generating the raw instruction in the flat case.
2024-10-08 00:13:28 +04:00
Matt Arsenault
a8e1311a1c
[RFC] IR: Define noalias.addrspace metadata (#102461)
This is intended to solve a problem with lowering atomics in
OpenMP and C++ common to AMDGPU and NVPTX.

In OpenCL and CUDA, it is undefined behavior for an atomic instruction
to modify an object in thread private memory. In OpenMP, it is defined.
Correspondingly, the hardware does not handle this correctly. For
AMDGPU,
32-bit atomics work and 64-bit atomics are silently dropped. We
therefore
need to codegen this by inserting a runtime address space check,
performing
the private case without atomics, and fallback to issuing the real
atomic
otherwise. This metadata allows us to avoid this extra check and branch.

Handle this by introducing metadata intended to be applied to atomicrmw,
indicating they cannot access the forbidden address space.
2024-10-07 23:21:42 +04:00
Austin Kerbow
c4d89203f3
[AMDGPU] Support preloading hidden kernel arguments (#98861)
Adds hidden kernel arguments to the function signature and marks them
inreg if they should be preloaded into user SGPRs. The normal kernarg
preloading logic then takes over with some additional checks for the
correct implicitarg_ptr alignment.

Special care is needed so that metadata for the hidden arguments is not
added twice when generating the code object.
2024-10-06 17:44:33 -07:00
Michael Liao
765d7e7a47 [IR] Fix '-Wparentheses' warnings. NFC 2024-10-04 20:19:39 -04:00
Paul Walker
d283705829
[AArch64][SVE] Fix definition of bfloat fcvt intrinsics. (#110281)
Affected intrinsics:
  llvm.aarch64.sve.fcvt.bf16f32
  llvm.aarch64.sve.fcvtnt.bf16f32
    
The named intrinsics took a predicate based on the smallest element type
when it should be based on the largest. The intrinsics have been replace
by v2 equivalents and affected code ported to use them.
    
Patch includes changes to getSVEPredicateBitCast() that ensure the
generated code for the auto-upgraded old intrinsics is unchanged.
2024-10-03 12:36:01 +01:00
Koakuma
076392b0aa
[SPARC] Fix regression from UpgradeDataLayoutString change (#110608)
It turns out that we cannot rely on the presence of `-i64:64` as a
position reference when adding the `-i128:128` datalayout string due to
some custom datalayout strings lacking it (e.g ones used by bugpoint,
among other things).
Do not add the `-i128:128` string in that case.

This fixes the regression introduced in
https://github.com/llvm/llvm-project/pull/106951.
2024-10-03 05:20:56 +07:00
Noah Goldstein
e343af777e [SimplifyCFG][Attributes] Enabling sinking calls with differing number of attrsets
Prior impl would fail if the number of attribute sets on the two calls
wasn't the same which is unnecessary as long as we aren't throwing
away and must-preserve attrs.

Closes #110896
2024-10-02 15:15:07 -05:00
Yingwei Zheng
586736226e
[ConstantFPRange] Implement ConstantFPRange::makeAllowedFCmpRegion (#110082)
This patch adds `makeAllowedFCmpRegion` support for `ConstantFPRange`.
2024-10-02 20:44:15 +08:00
Benjamin Maxwell
95f00a63ce
[IR] Allow fast math flags on calls with homogeneous FP struct types (#110506)
This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
2024-10-02 10:05:09 +01:00
Yingwei Zheng
4db10561cc
[ConstantFPRange] Address review comments. NFC. (#110793)
1. Address post-commit review comments
https://github.com/llvm/llvm-project/pull/86483#discussion_r1782305961.
2. Move some NFC changes from
https://github.com/llvm/llvm-project/pull/110082 to this patch.
2024-10-02 14:22:15 +08:00
Noah Goldstein
4d4beeb43c [SimplifyCFG] Supporting hoisting/sinking callbases with differing attrs
Some (many) attributes can safely be dropped to enable sinking. For
example removing `nonnull` on a return/param can't affect correctness.

Closes #109472
2024-10-01 18:27:08 -05:00
goldsteinn
afc0557a04
[IR][Attribute] Add support for intersecting AttributeLists; NFC (#109719)
Add support for taking the intersection of two AttributeLists s.t the
result list contains attributes that are valid in the context of both
inputs.

i.e if we have `nonnull align(32) noundef` intersected with `nonnull
align(16) dereferenceable(10)`, the result is `nonnull align(16)`.

Further it handles attributes that are not-droppable. For example
dropping `byval` can change the nature of a callsite/function so its
impossible to correct a correct intersection if its dropped from the
result. i.e `nonnull byval(i64)` intersected with `nonnull` is
invalid.

The motivation for the infrastructure is to enable sinking/hoisting
callsites with differing attributes.
2024-10-01 11:45:32 -05:00
Rahul Joshi
2469d7e361
[NFC] Add a new Intrinsics.cpp file for intrinsic code (#110078)
Add new file Intrinsics.cpp and move all functions in the `Intrinsic`
namespace to it.
2024-10-01 06:55:35 -07:00
eric-xtang1008
a59e5d8115
[ConstantFold][RFC] Add AllowLHSConstant parameter in getBinOpAbsorber (#109736)
Add a AllowLHSConstant parameter in getBinOpAbsorber function for
supporting more binary operators.
2024-10-01 14:51:01 +02:00
Jeremy Morse
96f37ae453
[NFC] Use initial-stack-allocations for more data structures (#110544)
This replaces some of the most frequent offenders of using a DenseMap that
cause a malloc, where the typical element-count is small enough to fit in
an initial stack allocation.

Most of these are fairly obvious, one to highlight is the collectOffset
method of GEP instructions: if there's a GEP, of course it's going to have
at least one offset, but every time we've called collectOffset we end up
calling malloc as well for the DenseMap in the MapVector.
2024-09-30 23:15:18 +01:00
Rahul Joshi
1b7b3b8d35
[NFC] Move intrinsic related functions to Intrinsic namespace (#110125)
Move static functions `Function::lookupIntrinsicID` and
`Function::isTargetIntrinsic` to Intrinsic namespace.
2024-09-30 07:42:53 -07:00
Kazu Hirata
619688f3d3
[IR] Avoid repeated hash lookups (NFC) (#110450) 2024-09-30 06:47:39 -07:00
Koakuma
dbad963a69
[SPARC] Align i128 to 16 bytes in SPARC datalayouts (#106951)
Align i128s to 16 bytes, following the example at
https://reviews.llvm.org/D86310.

clang already does this implicitly, but do it in backend code too for
the benefit of other frontends (see e.g
https://github.com/llvm/llvm-project/issues/102783 &
https://github.com/rust-lang/rust/issues/128950).
2024-09-30 08:32:33 +07:00
Alex MacLean
e7621f4199
Reland "[NVVM] Upgrade nvvm.ptr.* intrinics to addrspace cast" (#110262)
Remove the following intrinsics which can be trivially replaced with an
`addrspacecast`

  * llvm.nvvm.ptr.gen.to.global
  * llvm.nvvm.ptr.gen.to.shared
  * llvm.nvvm.ptr.gen.to.constant
  * llvm.nvvm.ptr.gen.to.local
  * llvm.nvvm.ptr.global.to.gen
  * llvm.nvvm.ptr.shared.to.gen
  * llvm.nvvm.ptr.constant.to.gen
  * llvm.nvvm.ptr.local.to.gen

Also, cleanup the NVPTX lowering of `addrspacecast` making it more
concise.

This was reverted to avoid conflicts while reverting #107655. Re-landing
unchanged.
2024-09-28 14:13:17 -07:00
Alex MacLean
a131fbf168
Reland "[NVPTX] deprecate nvvm.rotate.* intrinsics, cleanup funnel-shift handling" (#110025)
This change deprecates the following intrinsics which can be trivially
converted to llvm funnel-shift intrinsics:

- @llvm.nvvm.rotate.b32
- @llvm.nvvm.rotate.right.b64
- @llvm.nvvm.rotate.b64

This fixes a bug in the previous version (#107655) which flipped the
order of the operands to the PTX funnel shift instruction. In LLVM IR
the high bits are the first arg and the low bits are the second arg,
while in PTX this is reversed.
2024-09-27 05:23:08 -07:00
Rahul Joshi
6786928c4f
[Core] Skip over target name in intrinsic name lookup (#109971)
When searching for an intrinsic name in a target specific slice of the
intrinsic name table, skip over the target prefix. For such cases,
currently the first loop iteration in `lookupLLVMIntrinsicByName` does
nothing (i.e., `Low` and `High` stay unchanged and it does not shrink
down the search window), so we can skip this useless first iteration by
skipping over the target prefix.
2024-09-25 12:01:43 -07:00
gonzalobg
0f521931b8
LLVMContext: add getSyncScopeName() to lookup individual scope name (#109484)
This PR adds a `getSyncScopeString(Id)` API to `LLVMContext` that
returns the `StringRef` for that ID, if any.
2024-09-25 11:13:56 -07:00