2591 Commits

Author SHA1 Message Date
Daniel Paoliello
0c5c7b52f0 Emit the CodeView S_ARMSWITCHTABLE debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table. It contains the following information:

* The address of the branch instruction that uses the jump table.
* The address of the jump table.
* The "base" address that the values in the jump table are relative to.
* The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted).

Together, this information can be used by debuggers and binary analysis tools to understand what a jump-table indirect branch is doing and where it might jump to.
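
As a rough illustration of how a tool might consume the four pieces of information listed above, the following sketch resolves one raw table entry to a branch target. The type and field names (`EntryKind`, `JumpTableInfo`, `ShiftAmount`) are hypothetical and are not the CodeView record layout or the LLVM API; the shift amount in particular is an assumption supplied by the caller.

```cpp
#include <cstdint>

// Hypothetical entry kinds mirroring the three categories in the commit
// message; these names are illustrative, not the CodeView enumerators.
enum class EntryKind { AbsolutePointer, RelativeInt, RelativeIntShifted };

// Hypothetical description of one jump table.
struct JumpTableInfo {
  std::uint64_t BranchAddr; // address of the branch that uses the table
  std::uint64_t TableAddr;  // address of the table itself
  std::uint64_t BaseAddr;   // address that relative entries are offsets from
  EntryKind Kind;           // how each entry is encoded
  unsigned ShiftAmount;     // assumed shift for RelativeIntShifted entries
};

// Compute the branch target encoded by one raw table entry.
inline std::uint64_t resolveTarget(const JumpTableInfo &Info,
                                   std::int64_t RawEntry) {
  const std::uint64_t Raw = static_cast<std::uint64_t>(RawEntry);
  switch (Info.Kind) {
  case EntryKind::AbsolutePointer:
    return Raw; // the entry is the target address itself
  case EntryKind::RelativeInt:
    return Info.BaseAddr + Raw; // wraps correctly for negative offsets
  case EntryKind::RelativeIntShifted:
    return Info.BaseAddr + (Raw << Info.ShiftAmount);
  }
  return 0; // not reached for valid kinds
}
```

With a base of `0x1004`, a shifted entry of `8` and a shift of `1` would resolve to `0x1014` under these assumptions.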

Documentation for the symbol can be found in the Microsoft PDB library dumper: 0fe89a942f/cvdump/dumpsym7.cpp (L5518)

This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D149367
2023-08-31 12:06:50 -07:00
Simon Pilgrim
376050db9f [DAG] Move some unary constant folds from getNode() to FoldConstantArithmetic()
We need to clean up some type handling before the remainder (int<->fp and bitcasts) can be moved over.
2023-08-30 13:59:28 +01:00
Craig Topper
299b1b4071 [SelectionDAG][RISCV] Teach getConstant to use SPLAT_VECTOR_PARTS if vXi64 SPLAT_VECTOR is legal but i64 scalars are not.
That matches how such a SPLAT_VECTOR would have been type legalized
so assume it is ok to use for creating constants after type legalization.

Still need some improvements to SPLAT_VECTOR lowering.

This overlaps with some of what D158742 was trying to fix.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D158870
2023-08-29 09:22:17 -07:00
Luke Lau
8f1d1e2b61 [SDAG] Add computeKnownBits support for ISD::SPLAT_VECTOR_PARTS
We can work out the known bits for a given lane by concatenating the known bits of each scalar operand.
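
A minimal sketch of the concatenation idea, using plain integers instead of LLVM's `KnownBits`/`APInt`. The structs `Known32` and `Known64` are hypothetical stand-ins, and the sketch assumes two 32-bit parts with the first operand holding the low half.

```cpp
#include <cstdint>

// Minimal stand-ins for a known-bits record: a set bit in Zero means the bit
// is known to be 0, a set bit in One means it is known to be 1.
struct Known32 { std::uint32_t Zero, One; };
struct Known64 { std::uint64_t Zero, One; };

// Concatenate the known bits of a low and a high 32-bit part into the known
// bits of one 64-bit lane. Assumes Lo supplies bits 0..31 and Hi bits 32..63.
inline Known64 concatKnownBits(Known32 Lo, Known32 Hi) {
  Known64 Result;
  Result.Zero = (static_cast<std::uint64_t>(Hi.Zero) << 32) | Lo.Zero;
  Result.One = (static_cast<std::uint64_t>(Hi.One) << 32) | Lo.One;
  return Result;
}
```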

In the description of ISD::SPLAT_VECTOR_PARTS in ISDOpcodes.h it says that the
total size of the scalar operands must cover the output element size, but I've
added a stricter assertion here that the total width of the scalar operands
must be exactly equal to the element size. It doesn't seem to trigger, and I'm
not sure if there are any targets that use SPLAT_VECTOR_PARTS for anything other
than v4i32 -> v2i64 splats.

We also need to include it in isTargetCanonicalConstantNode, otherwise
returning the known bits introduces an infinite combine loop.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D158852
2023-08-28 10:35:58 +01:00
Arthur Eubanks
0a4fc4ac1c Revert "Emit the CodeView S_ARMSWITCHTABLE debug symbol for jump tables"
This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3.

Causes crashes, see comments in https://reviews.llvm.org/D149367.

Some follow-up fixes are also reverted:

This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085.
This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6.
This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.
2023-08-25 18:34:15 -07:00
Daniel Paoliello
8d0c3db388 Emit the CodeView S_ARMSWITCHTABLE debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table. It contains the following information:

* The address of the branch instruction that uses the jump table.
* The address of the jump table.
* The "base" address that the values in the jump table are relative to.
* The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted).

Together, this information can be used by debuggers and binary analysis tools to understand what a jump-table indirect branch is doing and where it might jump to.

Documentation for the symbol can be found in the Microsoft PDB library dumper: 0fe89a942f/cvdump/dumpsym7.cpp (L5518)

This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D149367
2023-08-25 10:19:17 -07:00
Kazu Hirata
134115618a [CodeGen] Use isAllOnesConstant and isNullConstant (NFC) 2023-08-20 22:56:40 -07:00
Jeffrey Byrnes
d26a06728d [DAG] NFC: Add getBitcastedExtOrTrunc
Simple function which scalarizes Ops then ExtOrTruncs them according to function parameters

Differential Revision: https://reviews.llvm.org/D157733

Change-Id: Ie5215069228f7bf530cd2dbb4bd17cbf409e046a
2023-08-17 14:29:17 -07:00
Paul Walker
566065207b [SelectionDAG] Use TypeSize variant of ComputeValueVTs to compute correct offsets for scalable aggregate types.
Differential Revision: https://reviews.llvm.org/D157872
2023-08-16 11:56:31 +00:00
Noah Goldstein
2549ec1866 [SelectionDAG] Improve isKnownToBeAPowerOfTwo
Add additional cases for:
select, vselect, {u,s}{min,max}, and, casts, rotl, rotr

And improve handling of constants and shifts.
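
The intuition behind a few of the listed cases can be sketched on plain integers: if both inputs are powers of two, then selecting either one (which is what select, umin and umax do) yields a power of two, and rotating a power of two only moves its single set bit. This is an illustration only; the helper names are hypothetical and the code does not reproduce the SelectionDAG implementation.

```cpp
#include <algorithm>
#include <cstdint>

// True when exactly one bit of X is set.
inline bool isPowerOfTwo(std::uint32_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Rotate left by Amt bits (Amt is reduced modulo 32).
inline std::uint32_t rotl32(std::uint32_t X, unsigned Amt) {
  Amt &= 31u;
  return Amt == 0 ? X : (X << Amt) | (X >> (32u - Amt));
}

// umin/umax of two powers of two returns one of the inputs unchanged,
// so the result is again a power of two.
inline bool minMaxPreservePow2(std::uint32_t A, std::uint32_t B) {
  return isPowerOfTwo(std::min(A, B)) && isPowerOfTwo(std::max(A, B));
}
```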

Differential Revision: https://reviews.llvm.org/D156778
2023-08-16 02:00:15 -05:00
Noah Goldstein
ac485e4072 [SelectionDAG] Add/Improve cases in isKnownNeverZero
1) Handle casts a bit more cleanly just with a loop rather than with
   recursion.

2) Add additional cases for smin/smax

3) For shifts we can also deduce non-zero if the maximum shift amount
    on the known 1s is non-zero.
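
Point 3 can be sketched for a left shift on 32-bit values: if shifting the known-one bits by the largest possible amount still leaves a set bit, then every smaller shift amount also leaves that bit set, so the result cannot be zero. The function name and the plain-integer representation are assumptions for illustration, not the code from the patch.

```cpp
#include <cstdint>

// Returns true when (X << Amt) is provably non-zero for every Amt in
// [0, MaxAmt], given KnownOne: a mask of the bits of X known to be 1.
inline bool shlKnownNonZero(std::uint32_t KnownOne, unsigned MaxAmt) {
  if (MaxAmt >= 32)
    return false; // over-wide shift amounts are possible: give up
  // If a known-one bit survives the largest shift, it survives all of them.
  return (KnownOne << MaxAmt) != 0;
}
```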

Differential Revision: https://reviews.llvm.org/D156777
2023-08-16 02:00:15 -05:00
David Green
de775f264d [DAG] Add constant SPLAT handling in getNodes SIGN_EXTEND_INREG
This helps simplify constant splats a little. Without this the code in
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L14072 always returns the
existing node.

Differential Revision: https://reviews.llvm.org/D157259
2023-08-08 10:27:55 +01:00
Matt Arsenault
0efdf3baf5 DAG: Remove getTargetIndex as it's unused
Fixes #29973
2023-08-05 09:20:09 -04:00
Bjorn Pettersson
4ce7c4a92a [llvm] Drop some typed pointer handling/bitcasts
Differential Revision: https://reviews.llvm.org/D157016
2023-08-03 22:54:33 +02:00
Simon Pilgrim
076bee1020 [DAG] getNode() - fold (zext (trunc (assertzext x))) -> (assertzext x)
If the pre-truncated value was the same width as the extension, and the assertzext guarantees that the extended bits are already zero, then skip the zext/trunc 'zero_extend_inreg' pattern.
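
A small model of why the fold is safe, assuming a 32-bit value whose bits above bit 7 are asserted to be zero. The helper names are hypothetical; the point is only that truncating to 8 bits and zero-extending back is the identity whenever the assertion holds.

```cpp
#include <cstdint>

// Model of (zext (trunc x)) for a 32-bit value truncated to 8 bits.
inline std::uint32_t zextOfTrunc8(std::uint32_t X) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(X));
}

// Model of the assertzext guarantee: bits 8 and above are already zero.
inline bool upperBitsZero8(std::uint32_t X) { return (X >> 8) == 0; }
```

When `upperBitsZero8(X)` holds, `zextOfTrunc8(X) == X`, so the zext/trunc pair can be dropped. When it does not hold, the pair changes the value and must stay.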

Addresses several regressions noticed in D155472
2023-07-31 10:43:11 +01:00
Zhongyunde
05aae0839f Reland [AArch64][NFC] Call the API getVScaleRange directly
Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range.

The previous change caused a buildbot failure because it contained the self-assignment MinSVEVectorSize = MinSVEVectorSize:
    error: explicitly assigning value of variable of type 'unsigned int' to itself [-Werror,-Wself-assign]

Reviewed By: sdesmalen, nikic, dmgreen
Differential Revision: https://reviews.llvm.org/D155708
2023-07-26 18:55:31 +08:00
Zhongyunde
ebaac2b2d6 Revert "[AArch64][NFC] Call the API getVScaleRange directly"
This reverts commit 67005c8e6fa9464f8bc436305a422071013ae499.
2023-07-26 16:44:14 +08:00
Zhongyunde
67005c8e6f [AArch64][NFC] Call the API getVScaleRange directly
Use the maximum 64 for BitWidth of getVScaleRange to
avoid returning an empty range.

Reviewed By: sdesmalen, nikic, dmgreen
Differential Revision: https://reviews.llvm.org/D155708
2023-07-26 15:54:04 +08:00
David Green
0c41c59dee [DAG][AArch64] Fix truncated vscale constant types
It appears that vscale values truncated to i1 cause mismatches in the constant
types when created in getNode. https://godbolt.org/z/TaaTo86ne.

Differential Revision: https://reviews.llvm.org/D155626
2023-07-20 09:12:05 +01:00
Matt Arsenault
296e24cd2e DAG: Constant fold frexp nodes
Special casing the nonfinite exponent value everywhere is kind of
annoying.
2023-07-17 17:34:29 -04:00
Simon Pilgrim
4f95821f58 [DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI.
This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!
2023-07-17 17:17:40 +01:00
Noah Goldstein
a4c461c063 [SelectionDAG] Fill in some more cases in isKnownNeverZero
This mostly copies cases that already exist in ValueTracking, although
it skips the more complex ones. Those can be filled in as needed.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D149199
2023-07-12 17:17:53 -05:00
Matt Arsenault
003b58f65b IR: Add llvm.frexp intrinsic
Add an intrinsic which returns the two pieces as multiple return
values. Alternatively could introduce a pair of intrinsics to
separately return the fractional and exponent parts.
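
For readers unfamiliar with the operation, the C++ standard library's `std::frexp` computes the same decomposition, returning the fraction and writing the exponent through a pointer. The wrapper below (the name `frexpPair` is made up for this sketch) returns both pieces as a pair, loosely mirroring the "multiple return values" shape described above. It says nothing about how the intrinsic is lowered.

```cpp
#include <cmath>
#include <utility>

// Return both pieces of frexp together: X == Frac * 2^Exp, with the
// magnitude of Frac in [0.5, 1) for finite non-zero X.
inline std::pair<double, int> frexpPair(double X) {
  int Exp = 0;
  const double Frac = std::frexp(X, &Exp);
  return {Frac, Exp};
}
```

For example, `frexpPair(8.0)` yields a fraction of `0.5` and an exponent of `4`, and `std::ldexp(0.5, 4)` recombines them into `8.0`.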

AMDGPU has native instructions to return the two halves, but could use
some generic legalization and optimization handling. For example, we
should be able to handle legalization of f16 on older targets, and for
bf16. Additionally antique targets need a hardware workaround which
would be better handled in the backend rather than in library code
where it is now.
2023-06-28 14:50:16 -04:00
Simon Pilgrim
64d01432d2 Fix "for for" duplicate typo in comment. NFC. 2023-06-27 11:43:09 +01:00
Alex MacLean
17aa37dd30 [SelectionDAG] Add memory size for CSEMap ID calculation
In NVPTX `ReplaceVectorLoad()`, i1 and i8 types are promoted to i16,
followed by a truncate operation. Thus, v2i8 (or v2i1) and v2i16 will
have the same VTList, which causes a collision in CSEMap.

To differentiate the original VTList, let's add the size in generating
an ID. Otherwise the compiler crashes in refineAlignment:
`MMO->getSize() == getSize() && "Size mismatch!"`

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153712
2023-06-26 16:12:48 -07:00
Craig Topper
eea865bd4a Recommit "[SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND."
I have fixed an existing DAGCombiner bug that caused the previous assertion failure.
See 7163539466d7e8930416e55dd9fd29891f8239f2.

Original message

We don't have VP_ANY_EXTEND or VP_SIGN_EXTEND_INREG yet so I've
deviated a little from the non-VP lowering.

My goal was to fix the crashes that occur on these test cases without this patch.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D152854
2023-06-15 12:03:25 -07:00
Alan Zhao
222d73ff7a Revert "[SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND."
This reverts commit 6bf79fb09416b02b3f8589a4998610d70c185dae.

Reason: causes Clang to crash during Chrome debug builds: https://crbug.com/1455144
2023-06-15 10:20:03 -07:00
Craig Topper
6bf79fb094 [SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND.
We don't have VP_ANY_EXTEND or VP_SIGN_EXTEND_INREG yet so I've
deviated a little from the non-VP lowering.

My goal was to fix the crashes that occur on these test cases without this patch.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D152854
2023-06-14 08:52:56 -07:00
Craig Topper
a5cd198181 [SelectionDAG] Don't allow type legalization to create noop VP_TRUNCATE.
Type legalization may need to promote the result to the same type
as the input. Instead of forming a vp_truncate with the same
source and dest type, don't create any vp_truncate.

Handle this in getNode, as is done for ISD::TRUNCATE.
2023-06-13 12:51:24 -07:00
Anna Thomas
26bfbec5d2 [Intrinsic] Introduce reduction intrinsics for minimum/maximum
This patch introduces the reduction intrinsic for floating point minimum
and maximum which has the same semantics (for NaN and signed zero) as
llvm.minimum and llvm.maximum.

Reviewed-By: nikic

Differential Revision: https://reviews.llvm.org/D152370
2023-06-13 12:29:58 -04:00
Phoebe Wang
1c6fd98ffb [SelectionDAG] Do not salvage with vector node
rG2eb7cbf987f21 added this code, which results in crash for vector
nodes. This patch solves it by skipping for the vector nodes.

Thanks Steve for helping reduce the test case.

Co-authored-by: Steve Merritt <steve.merritt@intel.com>

Reviewed By: goldstein.w.n

Differential Revision: https://reviews.llvm.org/D152492
2023-06-09 14:55:16 +08:00
Matt Arsenault
eece6ba283 IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics
AMDGPU has native instructions and target intrinsics for this, but
these really should be subject to legalization and generic
optimizations. This will enable legalization of f16->f32 on targets
without f16 support.

Implement a somewhat horrible inline expansion for targets without
libcall support. This could be better if we could introduce control
flow (GlobalISel version not yet implemented). Support for strictfp
legalization is less complete but works for the simple cases.
2023-06-06 17:07:18 -04:00
Serge Pavlov
eecaeb6f10 [FPEnv] Intrinsics for access to FP environment
The change implements intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'.
They are used to read the floating-point environment, set it, or reset it to
some default state. They perform the same actions as the C library functions
'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls
to these functions.
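
The library-level behaviour that the intrinsics mirror can be sketched with `<cfenv>`: save the environment, change something in it, and restore it. The function name `roundTripFPEnv` is hypothetical, and the sketch assumes the platform defines `FE_UPWARD`, which the C++ standard leaves optional.

```cpp
#include <cfenv>

// Save the current FP environment, switch the rounding mode, then restore.
// Returns true if the original rounding mode is back after the restore.
inline bool roundTripFPEnv() {
  std::fenv_t Saved;
  if (std::fegetenv(&Saved) != 0) // read the environment
    return false;
  const int Before = std::fegetround();
  std::fesetround(FE_UPWARD);     // perturb the environment
  if (std::fesetenv(&Saved) != 0) // write the saved environment back
    return false;
  return std::fegetround() == Before;
}
```

Resetting to the default state corresponds to calling `std::fesetenv(FE_DFL_ENV)`.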

The new intrinsics specify the FP environment as a value of integer type, which
is convenient for most targets, where the FP state is the content of some
register. Some targets, however, use long representations. On X86 the size
of the FP environment is 256 bits, and even half of this size is not a legal
integer type. To facilitate legalization in such cases, two sets of DAG
nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP
environment may be represented by a legal integer type. Nodes
GET_FPENV_MEM and SET_FPENV_MEM consider the FP environment as a region in
memory, much like `fesetenv` and `fegetenv` do. They are used when the
target has a long representation for the floating-point state.

Differential Revision: https://reviews.llvm.org/D71742
2023-06-05 13:10:01 +07:00
Nikita Popov
e506bfa7ae [SDAG] Fix incorrect use of undef for boolean contents (PR63055)
FoldSetCC() returns UNDEF in a number of cases. However, the SetCC
result must follow BooleanContents. Unless the type is a
pre-legalization i1 or we have UndefinedBooleanContents, the use of
UNDEF will not uphold the requirement that the top bits are either
zero or match the low bit. In such cases, return zero instead.
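
The requirement can be sketched as a predicate on a 32-bit SetCC result. The enum below is a hypothetical stand-in for the target's boolean-contents setting, not the LLVM declaration. Zero satisfies every variant, which is why it is a safe replacement for UNDEF here.

```cpp
#include <cstdint>

// Hypothetical stand-in for the three boolean-contents conventions.
enum class BoolContents { Undefined, ZeroOrOne, ZeroOrNegativeOne };

// Check whether a 32-bit setcc result is valid under the given convention.
inline bool upholdsBooleanContents(std::uint32_t V, BoolContents BC) {
  switch (BC) {
  case BoolContents::Undefined:
    return true; // top bits are unconstrained
  case BoolContents::ZeroOrOne:
    return V == 0 || V == 1; // top bits must be zero
  case BoolContents::ZeroOrNegativeOne:
    return V == 0 || V == 0xFFFFFFFFu; // top bits must match the low bit
  }
  return false; // not reached for valid conventions
}
```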

Fixes https://github.com/llvm/llvm-project/issues/63055.

Differential Revision: https://reviews.llvm.org/D151883
2023-06-01 15:19:22 +02:00
David Green
7740216f2e [DAG] Combine insert(shuffle(load), load, 0) into a single load
Given an insert of a scalar load into a vector shuffle with mask
u,0,1,2,3,4,5,6 or 1,2,3,4,5,6,7,u (depending on the insert index),
it can be more profitable to convert to a single load and avoid the
shuffles. This adds a DAG combine for it, providing the new load is
still fast.

Differential Revision: https://reviews.llvm.org/D151029
2023-05-31 19:48:57 +01:00
Dhruv Chawla
3b3912e9b8 Reapply [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()
This exposed a miscompile due to incorrect flag preservation in
integer type legalization, which has been fixed in D151472.

-----

This patch is a continuation of D150110. It separates the cases for
ADD and SUB into their own cases so that computeForAddSub can be
directly called and the NSW flag passed. This allows better
optimization when the NSW flag is enabled, and allows fixing up the
TODO that was there previously in SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D150769
2023-05-31 12:25:41 +02:00
Craig Topper
a4f437f012 SelectionDAG: Teach ComputeKnownBits about VSCALE
This reverts commit 9b92f70d4758f75903ce93feaba5098130820d40.  The issue
with the re-applied change was an implicit truncation due to the
multiplication.  Although the operations were converted to `APInt`, the
values were implicitly converted to `long` due to the typing rules.

Fixes: #59594

Differential Revision: https://reviews.llvm.org/D140347
2023-05-26 10:48:49 -07:00
Craig Topper
c5e6c886aa [VP][SelectionDAG][RISCV] Add get_vector_length intrinsics and generic SelectionDAG support.
The generic implementation is umin(TC, VF * vscale).
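
Written out on plain integers, the formula looks like the sketch below. The function and parameter names are assumptions for illustration: `TC` is the remaining trip count, `VF` the vectorization factor, and `VScale` the runtime vscale value. Overflow of the product is ignored here.

```cpp
#include <algorithm>
#include <cstdint>

// umin(TC, VF * vscale): process at most one full vector of elements,
// and fewer when the remaining trip count is smaller.
inline std::uint64_t getVectorLength(std::uint64_t TC, std::uint64_t VF,
                                     std::uint64_t VScale) {
  return std::min(TC, VF * VScale);
}
```

With `VF = 4` and `VScale = 2`, a trip count of 10 gives 8 and a trip count of 3 gives 3.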

Lowering to vsetvli for RISC-V will come in a future patch.

This patch is a pre-requisite to be able to CodeGen vectorized code from
D99750.

Reviewed By: reames, frasercrmck

Differential Revision: https://reviews.llvm.org/D149916
2023-05-26 09:06:38 -07:00
Nikita Popov
2ba14283cd Revert "[SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()"
This reverts commit b66551370fdfc6f357ae0d77237119d2b1077b62.

This has exposed a pre-existing miscompile, reported in
https://reviews.llvm.org/D150769#4370467.
2023-05-25 11:13:51 +02:00
Dhruv Chawla
b66551370f [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()
This patch is a continuation of D150110. It separates the cases for
ADD and SUB into their own cases so that computeForAddSub can be
directly called and the NSW flag passed. This allows better
optimization when the NSW flag is enabled, and allows fixing up the
TODO that was there previously in SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D150769
2023-05-17 15:15:05 +02:00
Noah Goldstein
d294e3cb76 [SelectionDAG] Improve computeKnownBits implementations of sdiv and udiv
Add `exact` flag handling for `udiv` and add entire `sdiv` case.

Differential Revision: https://reviews.llvm.org/D150098
2023-05-16 18:58:13 -05:00
Jay Foad
d8229e2f14 [KnownBits] Define and use intersectWith and unionWith
Define intersectWith and unionWith as two complementary ways of
combining KnownBits. The names are chosen for consistency with
ConstantRange.

Deprecate commonBits as a synonym for intersectWith.
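
A minimal model of the two operations on 32-bit masks. The struct `Known` is a stand-in for illustration and not the LLVM class. Intersection keeps only the facts both inputs agree on, which is useful when the value could be either input. Union merges facts from two sources describing the same value.

```cpp
#include <cstdint>

// A set bit in Zero means "known 0"; a set bit in One means "known 1".
struct Known { std::uint32_t Zero, One; };

// Keep only the bits that are known in both A and B.
inline Known intersectWith(Known A, Known B) {
  return Known{A.Zero & B.Zero, A.One & B.One};
}

// Combine the bits known in either A or B (both describe the same value).
inline Known unionWith(Known A, Known B) {
  return Known{A.Zero | B.Zero, A.One | B.One};
}
```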

Differential Revision: https://reviews.llvm.org/D150443
2023-05-16 09:23:51 +01:00
Noah Goldstein
e36caaeeb2 [SelectionDAG] Use computeKnownBits if Op is not recognized by isKnownNeverZero
The current logic is pretty limited unless the `Op` is a
constant. This at least covers more obvious cases.

Reviewed By: craig.topper, foad

Differential Revision: https://reviews.llvm.org/D149196
2023-05-13 14:36:04 -05:00
Noah Goldstein
da9f306739 [SelectionDAG] Limit max recursion in isKnownNeverZero and isKnownToBeAPowerOfTwo
Both of these functions recursively call themselves so it makes sense
to limit that upper bound.

Differential Revision: https://reviews.llvm.org/D149195
2023-05-13 14:35:57 -05:00
Craig Topper
4a9e6c422f [SelectionDAG] Correct AddNodeIDCustom for MemIntrinsicSDNodes.
We were missing any support for ISD::INTRINSIC_W_CHAIN/INTRINSIC_VOID
used for memory operations.

For ISD::PREFETCH and target memory nodes we didn't add the subclass
data.

This patch handles all MemIntrinsicSDNode in one place and adds the
missing subclass data.

Note. Unlike load/stores we don't add the memory VT in AddNodeIDCustom or getMemIntrinsicNode. Not sure why.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D150387
2023-05-11 16:28:47 -07:00
David Green
b774f14841 [DAG] Calculate the number of sign bits for constant BUILD_VECTOR directly.
For constant BUILD_VECTORs the operands need to be legal types. This can mean
that when the number of sign bits is calculated it may look at the entire
constant and inefficiently produce fewer sign bits than it could. For example i8
vectors could use i32 elements, for which 0x000000ff would be incorrectly
limited to 1 sign bit as the original value has 24 sign bits. This makes it
look at the constant directly, truncated to the correct type for the element so
that it can correctly return 8.
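
The arithmetic in the example can be checked with a small sign-bit counter on plain integers. The helper `numSignBits` is a hypothetical sketch, not the SelectionDAG routine: it counts how many of the top bits of the low `Width` bits equal the sign bit.

```cpp
#include <cstdint>

// Count how many top bits of the low Width bits of V equal the sign bit
// (the sign bit itself is included). Width must be in [1, 64].
inline unsigned numSignBits(std::uint64_t V, unsigned Width) {
  const std::uint64_t Sign = (V >> (Width - 1)) & 1u;
  unsigned Count = 1;
  // Walk downwards from the bit just below the sign bit.
  for (unsigned Bit = Width - 1; Bit-- > 0;) {
    if (((V >> Bit) & 1u) != Sign)
      break;
    ++Count;
  }
  return Count;
}
```

Viewed as a 32-bit element, `0x000000ff` has 24 sign bits. Truncated to the 8-bit element type it is `0xff`, which has 8.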

Differential Revision: https://reviews.llvm.org/D149956
2023-05-07 22:31:10 +01:00
Simon Pilgrim
b7116ba8b0 [DAG] computeOverflowForUnsignedAdd - use ConstantRange::unsignedAddMayOverflow as fallback
Replaces the more specific uadd_ov case
2023-05-06 22:03:38 +01:00
Simon Pilgrim
b83aa8bc75 [DAG] computeOverflowForUnsignedAdd - use getMaxValue().ult(2) to detect 0/1 values. NFCI. 2023-05-06 19:46:34 +01:00
Simon Pilgrim
08c1150d4c [DAG] Add computeOverflowForSignedSub/computeOverflowForUnsignedSub/computeOverflowForSub
Match the addition variants (although computeOverflowForUnsignedSub is really just a placeholder), and use this in DAGCombiner::visitSUBO
2023-05-06 15:55:04 +01:00
Simon Pilgrim
489e728672 [DAG] computeOverflowForSignedAdd - fix typo in comment. NFC. 2023-05-06 14:18:22 +01:00