2408 Commits

Author SHA1 Message Date
Nikita Popov
e506bfa7ae [SDAG] Fix incorrect use of undef for boolean contents (PR63055)
FoldSetCC() returns UNDEF in a number of cases. However, the SetCC
result must follow BooleanContents. Unless the type is a
pre-legalization i1 or we have UndefinedBooleanContents, the use of
UNDEF will not uphold the requirement that the top bits are either
zero or match the low bit. In such cases, return zero instead.

Fixes https://github.com/llvm/llvm-project/issues/63055.

Differential Revision: https://reviews.llvm.org/D151883
2023-06-01 15:19:22 +02:00
David Green
7740216f2e [DAG] Combine insert(shuffle(load), load, 0) into a single load
Given an insert of a scalar load into a vector shuffle with mask
u,0,1,2,3,4,5,6 or 1,2,3,4,5,6,7,u (depending on the insert index),
it can be more profitable to convert to a single load and avoid the
shuffles. This adds a DAG combine for it, providing the new load is
still fast.

Differential Revision: https://reviews.llvm.org/D151029
2023-05-31 19:48:57 +01:00
Dhruv Chawla
3b3912e9b8 Reapply [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()
This exposed a miscompile due to incorrect flag preservation in
integer type legalization, which has been fixed in D151472.

-----

This patch is a continuation of D150110. It separates the cases for
ADD and SUB into their own cases so that computeForAddSub can be
directly called and the NSW flag passed. This allows better
optimization when the NSW flag is enabled, and allows fixing up the
TODO that was there previously in SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D150769
2023-05-31 12:25:41 +02:00
Craig Topper
a4f437f012 SelectionDAG: Teach ComputeKnownBits about VSCALE
This reverts commit 9b92f70d4758f75903ce93feaba5098130820d40.  The issue
with the re-applied change was an implicit truncation due to the
multiplication.  Although the operations were converted to `APInt`, the
values were implicitly converted to `long` due to the typing rules.

Fixes: #59594

Differential Revision: https://reviews.llvm.org/D140347
2023-05-26 10:48:49 -07:00
Craig Topper
c5e6c886aa [VP][SelectionDAG][RISCV] Add get_vector_length intrinsics and generic SelectionDAG support.
The generic implementation is umin(TC, VF * vscale).

Lowering to vsetvli for RISC-V will come in a future patch.

This patch is a pre-requisite to be able to CodeGen vectorized code from
D99750.

Reviewed By: reames, frasercrmck

Differential Revision: https://reviews.llvm.org/D149916
2023-05-26 09:06:38 -07:00
Nikita Popov
2ba14283cd Revert "[SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()"
This reverts commit b66551370fdfc6f357ae0d77237119d2b1077b62.

This has exposed a pre-existing miscompile, reported in
https://reviews.llvm.org/D150769#4370467.
2023-05-25 11:13:51 +02:00
Dhruv Chawla
b66551370f [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()
This patch is a continuation of D150110. It separates the cases for
ADD and SUB into their own cases so that computeForAddSub can be
directly called and the NSW flag passed. This allows better
optimization when the NSW flag is enabled, and allows fixing up the
TODO that was there previously in SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D150769
2023-05-17 15:15:05 +02:00
Noah Goldstein
d294e3cb76 [SelectionDAG] Improve computeKnownBits implementations of sdiv and udiv
Add `exact` flag handling for `udiv` and add entire `sdiv` case.

Differential Revision: https://reviews.llvm.org/D150098
2023-05-16 18:58:13 -05:00
Jay Foad
d8229e2f14 [KnownBits] Define and use intersectWith and unionWith
Define intersectWith and unionWith as two complementary ways of
combining KnownBits. The names are chosen for consistency with
ConstantRange.

Deprecate commonBits as a synonym for intersectWith.

Differential Revision: https://reviews.llvm.org/D150443
2023-05-16 09:23:51 +01:00
Noah Goldstein
e36caaeeb2 [SelectionDAG] Use computeKnownBits if Op is not recognized by isKnownNeverZero
The current logic is pretty limitted unless the `Op` is a
constant. This at least covers more obvious cases.

Reviewed By: craig.topper, foad

Differential Revision: https://reviews.llvm.org/D149196
2023-05-13 14:36:04 -05:00
Noah Goldstein
da9f306739 [SelectionDAG] Limit max recursion in isKnownNeverZero and isKnownToBeAPowerOfTwo
Both of these functions recursively call themselves so it makes sense
to limit that upper bound.

Differential Revision: https://reviews.llvm.org/D149195
2023-05-13 14:35:57 -05:00
Craig Topper
4a9e6c422f [SelectionDAG] Correct AddNodeIDCustom for MemIntrinsicSDNodes.
We were missing any support for ISD::INTRINSIC_W_CHAIN/INTRINSIC_VOID
used for memory operations.

For ISD::PREFETCH and target memory nodes we didn't add the subclass
data.

This patch handles all MemIntrinsicSDNode in one place and adds the
missing subclass data.

Note. Unlike load/stores we don't add the memory VT in AddNodeIDCustom or getMemIntrinsicNode. Not sure why.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D150387
2023-05-11 16:28:47 -07:00
David Green
b774f14841 [DAG] Calculate the number of sign bits for constant BUILD_VECTOR directly.
For constant BUILD_VECTORs the operands need to be legal types. This can mean
that when the number of sign bits is calculated it may look that the entire
constant and inefficiently produce less sign bits than it could. For example i8
vectors could use i32 elements, for which 0x000000ff would be incorrectly
limited to 1 sign bit as the original value has 24 sign bits. This makes it
look at the constant directly, truncated to the correct type for the element so
that it can correctly return 8.

Differential Revision: https://reviews.llvm.org/D149956
2023-05-07 22:31:10 +01:00
Simon Pilgrim
b7116ba8b0 [DAG] computeOverflowForUnsignedAdd - use ConstantRange::unsignedAddMayOverflow as fallback
Replaces the more specific uadd_ov case
2023-05-06 22:03:38 +01:00
Simon Pilgrim
b83aa8bc75 [DAG] computeOverflowForUnsignedAdd - use getMaxValue().ult(2) to detect 0/1 values. NFCI. 2023-05-06 19:46:34 +01:00
Simon Pilgrim
08c1150d4c [DAG] Add computeOverflowForSignedSub/computeOverflowForUnsignedSub/computeOverflowForSub
Match the addition variants (although computeOverflowForUnsignedSub is really just a placeholder), and use this in DAGCombiner::visitSUBO
2023-05-06 15:55:04 +01:00
Simon Pilgrim
489e728672 [DAG] computeOverflowForSignedAdd - fix typo in comment. NFC. 2023-05-06 14:18:22 +01:00
Simon Pilgrim
7395f6ae78 [DAG] Add computeOverflowForSignedAdd and computeOverflowForAdd wrapper
Add basic computeOverflowForSignedAdd helper to recognise that sadd overflow can't occur if both operands have more that one sign bit.

Add computeOverflowForAdd wrapper that calls computeOverflowForSignedAdd/computeOverflowForUnsignedAdd depending on the IsSigned argument, and use this in DAGCombiner::visitADDO
2023-05-06 13:33:14 +01:00
Simon Pilgrim
c7fce3f98b [DAG] Rename computeOverflowKind -> computeOverflowForUnsignedAdd. NFC.
Matches the naming convention for the equivalent ValueTracking helpers - further SelectionDAG computeOverflowFor*() helpers will be added soon.
2023-05-05 19:38:54 +01:00
Simon Pilgrim
3928589314 [DAG] computeKnownBits - remove old ashr TODO comment
KnownBits::ashr now uses the minimum shift amount to try and extend the sign bit
2023-05-04 10:26:30 +01:00
NAKAMURA Takumi
c1221251fb Restore CodeGen/MachineValueType.h from Support
This is rework of;

  - rG13e77db2df94 (r328395; MVT)

Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h`
can be restored as well.

Depends on D148767

Differential Revision: https://reviews.llvm.org/D149024
2023-05-03 00:13:20 +09:00
Sergei Barannikov
e744e51b12 [SelectionDAG] Rename ADDCARRY/SUBCARRY to UADDO_CARRY/USUBO_CARRY (NFC)
This will make them consistent with other overflow-aware nodes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D148196
2023-04-29 21:59:58 +03:00
Jay Foad
fdc0d5f399 [DAG] Do not call computeKnownBits from isKnownToBeAPowerOfTwo
The only way known bits could help identify a known power of two is if
it knows exactly which power of two it is, i.e. if it is a known
constant. But in that case the value should have been simplified to a
constant already. So save some compile time by not calling
computeKnownBits.

Differential Revision: https://reviews.llvm.org/D149325
2023-04-27 11:05:56 +01:00
Wang, Xin10
cf8d6844de remove useless condition in assert in SelectionDAG.cpp
The condition Opcode <= std::numeric_limits<int>::max() will
always be true here.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D148306
2023-04-17 21:53:19 -04:00
Shraiysh Vaishay
7021182d6b [nfc][llvm] Replace pointer cast functions in PointerUnion by llvm casting functions.
This patch replaces the uses of PointerUnion.is function by llvm::isa,
PointerUnion.get function by llvm::cast, and PointerUnion.dyn_cast by
llvm::dyn_cast_if_present. This is according to the FIXME in
the definition of the class PointerUnion.

This patch does not remove them as they are being used in other
subprojects.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D148449
2023-04-17 13:40:51 -05:00
Amaury Séchet
9041e1fa29 [DAG] Peek through zext/trunc in haveNoCommonBitsSet.
This limitation was discovered thanks to some regression in D127115 .

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D147821
2023-04-11 11:44:15 +00:00
Craig Topper
65f3794111 [SelectionDAG] Use MemVT for FoldingSetNodeID in SelectionDAG::getLoadVP.
Return types and operands are put in the ID by AddNodeIDNode. I'm
pretty sure this was supposed to be the memory VT.
2023-04-03 15:15:48 -07:00
Simon Pilgrim
2434c8fcf9 [DAG] canCreateUndefOrPoison - add ISD::INSERT_VECTOR_ELT handling
If the inserted element index is guaranteed to be inbounds then a ISD::INSERT_VECTOR_ELT will not create poison/undef.
2023-04-02 16:28:26 +01:00
Simon Pilgrim
8153b92d9b [DAG] Add SelectionDAG::SplitScalar helper
Similar to the existing SelectionDAG::SplitVector helper, this helper creates the EXTRACT_ELEMENT nodes for the LO/HI halves of the scalar source.

Differential Revision: https://reviews.llvm.org/D147264
2023-03-31 18:35:40 +01:00
Yeting Kuo
84c8c2b4b4 [DAG][RISCV] Allow scalable vector ISD::STRICT_FP_ROUND and support vector ISD::STRICT_FP_ROUND for RISC-V.
The patch customized lower vector type ISD::STRICT_FP_ROUND to RISCVISD::STRICT_FP_ROUND.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147113
2023-03-30 08:20:02 +08:00
Kazu Hirata
9bb96fd874 [llvm] Use llvm::isNullConstant (NFC) 2023-03-22 00:31:48 -07:00
Simon Pilgrim
c1f81e7604 [DAG] mergeStore - peek through truncates when finding dead store(trunc(load())) patterns
Extend the existing store(load()) removal code to account for intermediate truncates that some targets won't remove with canCombineTruncStore - we only care about the load/store MemoryVT.

Fixes regression from D146121
2023-03-15 11:54:13 +00:00
pvanhout
1f1fea6c38 Reland: [DAG/AMDGPU] Use UniformityAnalysis in DAGISel
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D145918
2023-03-14 14:38:45 +01:00
pvanhout
0e79106fc9 Revert "[DAG/AMDGPU] Use UniformityAnalysis in DAGISel"
This reverts commit 0022b5803fd4f5a4e9fcf233267c0ffa1b88f763.
2023-03-14 11:48:58 +01:00
pvanhout
0022b5803f [DAG/AMDGPU] Use UniformityAnalysis in DAGISel
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D145918
2023-03-14 11:18:28 +01:00
Yeting Kuo
b2c48559c8 [IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND.
The patch mainly does two things. The first is allowing scalable vector
ISD::STRICT_FP_EXTEND. The second is making RISC-V customized lower
strict_fpextend to riscv_strict_fpextend_vl, the strict version of
riscv_fpextend_vl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145548
2023-03-09 17:37:59 +08:00
Marco Elver
bdb4353ae0 [SelectionDAG] Optimize copyExtraInfo deep copy
It turns out that there are relatively trivial, albeit rare, cases that
require a MaxDepth of more than 16 (see added test). However, we want to
avoid having to rely on a large fixed MaxDepth.

Since these cases are relatively rare, apply the following strategy:

  1. Start with a low MaxDepth of 16 - if the entry node was not
     reached, we can return (the common case).

  2. If the entry node was reached, exponentially increase MaxDepth up
     to some large limit that should cover all cases and guard against
     stack exhaustion.

This retains the better performance with a low MaxDepth in the common
case, and in complex cases backs off and retries. On a whole, this is
preferable vs. starting with a large MaxDepth which would unnecessarily
penalize the common case where a low MaxDepth is sufficient.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D145386
2023-03-06 17:29:53 +01:00
Marco Elver
7ecd2a23f5 [SelectionDAG] Fix missing lambda capture
Move MaxDepth into the lambda, since it is not needed outside. This
fixes some compilers that complain about missing capture:

  error C3493: 'MaxDepth' cannot be implicitly captured because no
  default capture mode has been specified

Fixes: f693932fbea7 ("[SelectionDAG] Transitively copy NodeExtraInfo on RAUW")
2023-03-02 23:47:36 +01:00
Marco Elver
f693932fbe [SelectionDAG] Transitively copy NodeExtraInfo on RAUW
During legalization of the SelectionDAG, some nodes are replaced with
arch-specific nodes. These may be complex nodes, where the root node no
longer corresponds to the node that should carry the extra info.

Fix the issue by copying extra info to the new node and all its new
transitive operands during RAUW. See code comments for more details.

This fixes the remaining pcsections-atomics.ll tests on X86.

v2: Optimize copyExtraInfo() deep copy. For now we assume that only
NodeExtraInfo that have PCSections set require deep copy. Furthermore,
limit the depth of graph search while pre-populating the visited set,
assuming the to-be-replaced subgraph 'From' has limited complexity. An
assertion catches if the maximum depth needs to be increased.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D144677
2023-03-02 23:07:19 +01:00
Craig Topper
06c6b787b2 [SelectionDAG][AArch64] Constant fold in SelectionDAG::getVScale if VScaleMin==VScaleMax.
Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D145113
2023-03-02 12:02:38 -08:00
Marco Elver
e0bc779000 Revert "[SelectionDAG] Transitively copy NodeExtraInfo on RAUW"
This reverts commit 7f635b90e7bdf1378fd9a65fc62b99e8e07d4aaf.

The current implementation causes pathological slowdowns in certain
cases: https://github.com/llvm/llvm-project/issues/61108
2023-03-02 09:39:44 +01:00
Marco Elver
7f635b90e7 [SelectionDAG] Transitively copy NodeExtraInfo on RAUW
During legalization of the SelectionDAG, some nodes are replaced with
arch-specific nodes. These may be complex nodes, where the root node no
longer corresponds to the node that should carry the extra info.

Fix the issue by copying extra info to the new node and all its new
transitive operands during RAUW. See code comments for more details.

This fixes the remaining pcsections-atomics.ll tests on X86.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D144677
2023-02-27 12:16:14 +01:00
Fangrui Song
e4f4f34e7a [SelectionDAG] Migrate away from soft-deprecated functions. NFC 2023-02-21 11:01:34 -08:00
Kazu Hirata
4a05edd410 [llvm] Use APInt::getZero instead of APInt::getNullValue (NFC)
Note that APInt::getNullValue has been soft-deprecated in favor of
APInt::getZero.
2023-02-19 22:42:01 -08:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Kazu Hirata
cbde2124f1 Use APInt::popcount instead of APInt::countPopulation (NFC)
This is for consistency with the C++20-style bit manipulation
functions in <bit>.
2023-02-19 11:29:12 -08:00
Simon Pilgrim
ce63cd3bf1 [DAG] Fold freeze(concat_vectors(x,y,...)) -> concat_vectors(freeze(x),freeze(y),...)
Another of the cleanups necessary for D136529
2023-02-08 20:26:43 +00:00
Simon Pilgrim
b7deb71ef5 [DAG] Fold freeze(build_pair(x,y)) -> build_pair(freeze(x),freeze(y))
One of the cleanups necessary for D136529 - another being how we're going to handle moving freeze through multiple result nodes (like uaddo and subcarry)
2023-02-08 17:54:03 +00:00
Yeting Kuo
7bc2cd614e [VP][DAGCombiner] Introduce generalized pattern match for vp sdnodes.
The patch tries to solve duplicated combine work for vp sdnodes. The idea is to
introduce MatchConext that verifies specific patterns and generate specific node
infromation. There is two MatchConext in DAGCombiner. EmptyMatcher is for
normal nodes and VPMatcher is for vp nodes.

The idea of this patch is come form Simon Moll's proposal [0]. I only fixed some
minor issues and added few new features in this patch.

[0]: c38a14484a

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141891
2023-02-08 13:45:35 +08:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00