llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-09 21:56:06 +00:00

Author	SHA1	Message	Date
Kerry McLaughlin	f6dd32fd35	[SVE][CodeGen] Lower scalable masked gathers Lowers the llvm.masked.gather intrinsics (scalar plus vector addressing mode only) Changes in this patch: - Add custom lowering for MGATHER, using getGatherVecOpcode() to choose the appropriate gather load opcode to use. - Improve codegen with refineIndexType/refineUniformBase, added in D90942 - Tests added for gather loads with 32 & 64-bit scaled & unscaled offsets. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91092	2020-12-07 12:20:41 +00:00
Simon Pilgrim	9cf4f493a7	[DAG] Move SelectionDAG implementation to KnownBits::setInReg(). NFCI.	2020-12-04 18:09:08 +00:00
Kazu Hirata	7a4af2a8e7	[SelectionDAG] Use is_contained (NFC)	2020-12-02 19:09:45 -08:00
Kerry McLaughlin	4bee3197f6	[SVE][CodeGen] Extend isConstantSplatValue to support ISD::SPLAT_VECTOR Updated the affected scalable_of_scalable tests in sve-gep.ll, as isConstantSplatValue now returns true in DAGCombiner::visitMUL and folds `(mul x, 1) -> x` Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91363	2020-11-26 11:19:40 +00:00
Kai Luo	8e6d92026c	[DAG][PowerPC] Fix dropped `nsw` flag in `SimplifySetCC` by adding `doesNodeExist` helper `SimplifySetCC` invokes `getNodeIfExists` without passing `Flags` argument and `getNodeIfExists` uses a default `SDNodeFlags` to intersect the original flags, as a consequence, flags like `nsw` is dropped. Added a new helper function `doesNodeExist` to check if a node exists without modifying its flags. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D89938	2020-11-25 04:39:03 +00:00
Hongtao Yu	d0e42037bf	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Kazu Hirata	2583d8eb08	[CodeGen] Use llvm::is_contained (NFC)	2020-11-19 22:07:56 -08:00
Simon Pilgrim	8996742741	[KnownBits] Add KnownBits::makeConstant helper. NFCI. Helper for cases where we need to create a KnownBits from a (fully known) constant value.	2020-11-12 16:16:04 +00:00
Simon Pilgrim	1a62ca65c1	[KnownBits] Add KnownBits::commonBits helper. NFCI. We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........	2020-11-11 12:15:54 +00:00
Kerry McLaughlin	170947a5de	[SVE][CodeGen] Lower scalable masked scatters Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only) Changes included in this patch: - Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use. Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts. - Added the getCanonicalIndexType function to convert redundant addressing modes (e.g. scaling is redundant when accessing bytes) - Tests with 32 & 64-bit scaled & unscaled offsets Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90941	2020-11-11 11:50:22 +00:00
Kerry McLaughlin	ffbbfc76ca	[SVE][CodeGen] Add the isTruncatingStore flag to MSCATTER This patch adds the IsTruncatingStore flag to MaskedScatterSDNode, set by getMaskedScatter(). Updated SelectionDAGDumper::print_details for MaskedScatterSDNode to print the details of masked scatters (is truncating, signed or scaled). This is the first in a series of patches which adds support for scalable masked scatters Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90939	2020-11-11 10:58:24 +00:00
Simon Pilgrim	bf04e34383	[DAG] computeKnownBits - Replace ISD::SREM handling with KnownBits::srem to reduce code duplication	2020-11-05 17:12:58 +00:00
Simon Pilgrim	e237d56b43	[KnownBits] Move ValueTracking/SelectionDAG UREM KnownBits handling to KnownBits::urem. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 14:30:59 +00:00
Simon Pilgrim	32bee18b84	[KnownBits] Move ValueTracking/SelectionDAG UDIV KnownBits handling to KnownBits::udiv. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 13:42:42 +00:00
Cameron McInally	c126eb7529	[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247. Differential Revision: https://reviews.llvm.org/D90644	2020-11-04 14:20:31 -06:00
Kerry McLaughlin	f2412d372d	[SVE][CodeGen] Lower scalable integer vector reductions This patch uses the existing LowerFixedLengthReductionToSVE function to also lower scalable vector reductions. A separate function has been added to lower VECREDUCE_AND & VECREDUCE_OR operations with predicate types using ptest. Lowering scalable floating-point reductions will be addressed in a follow up patch, for now these will hit the assertion added to expandVecReduce() in TargetLowering. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D89382	2020-11-04 11:38:49 +00:00
Simon Pilgrim	825e517e34	[DAG] computeKnownBits - Replace ISD::MUL handling with the common KnownBits::computeForMul implementation	2020-11-04 11:32:08 +00:00
Simon Pilgrim	e9b88c754a	[DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.	2020-11-03 18:09:33 +00:00
Simon Pilgrim	cb798f040a	[DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 17:30:36 +00:00
Simon Pilgrim	cab21d4fa8	[DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 14:22:28 +00:00
Cameron McInally	dda1e74b58	[Legalize] Add legalizations for VECREDUCE_SEQ_FADD Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass. Differential Revision: https://reviews.llvm.org/D90247	2020-10-30 16:02:55 -05:00
Nikita Popov	6c2ad4cf87	[SDAG] Extract helper to determine neutral element (NFC) Make the existing VECREDUCE based code more generic, but expressing it in terms of the neutral value of the base opcode instead.	2020-10-29 22:05:06 +01:00
Nikita Popov	91bf172088	[SDAG] Extract helper to get vecreduce base opcode (NFC)	2020-10-29 20:22:22 +01:00
Simon Pilgrim	ce356e1546	[DAG] Add BuildVectorSDNode::getRepeatedSequence helper to recognise multi-element splat patterns Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method. I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns. Differential Revision: https://reviews.llvm.org/D87930	2020-10-24 12:23:09 +01:00
Simon Pilgrim	88523f6f4b	[DAG] getNode(ISD::EXTRACT_SUBVECTOR) Drop unnecessary N2C null check - we assert that this isn't null and have already used the pointer. NFCI. Fixes cppcheck + null dereference warning.	2020-10-21 11:53:44 +01:00
David Sherwood	5b17b323a6	[SVE][CodeGen] Replace use of TypeSize comparison operator in CreateStackTemporary We were previously relying upon the TypeSize comparison operators to obtain the maximum size of two types, however use of such operators is being deprecated in favour of making the caller aware that it could be dealing with scalable vector types. I have changed the code to assert that the two types have the same scalable property and thus we can simply take the maximum of the known minimum sizes instead. Differential Revision: https://reviews.llvm.org/D88563	2020-10-21 08:31:36 +01:00
Krzysztof Parzyszek	61eaa2e14a	[SDAG] Remember to set UndefElts in isSplatValue for SPLAT_VECTOR	2020-10-10 19:42:24 -05:00
David Sherwood	fa0293081d	[SVE][CodeGen] Fix TypeSize/ElementCount related warnings in sve-split-store.ll I have fixed up a number of warnings resulting from TypeSize -> uint64_t casts and calling getVectorNumElements() on scalable vector types. I think most of the changes are fairly trivial except for those in DAGTypeLegalizer::SplitVecRes_MSTORE I've tried to ensure we create the MachineMemoryOperands in a sensible way for scalable vectors. I have added a CHECK line to the following test: CodeGen/AArch64/sve-split-store.ll that ensures no new warnings are added. Differential Revision: https://reviews.llvm.org/D86928	2020-10-05 19:27:00 +01:00
Sanjay Patel	2ccbf3dbd5	[SDAG] fold x * 0.0 at node creation time In the motivating case from https://llvm.org/PR47517 we create a node that does not get constant folded before getNegatedExpression is attempted from some other node, and we crash. By moving the fold into SelectionDAG::simplifyFPBinop(), we get the constant fold sooner and avoid the problem.	2020-10-04 11:31:57 -04:00
Kerry McLaughlin	fcf70e1e3b	[SVE][CodeGen] Lower scalable fp_extend & fp_round operations This patch adds FP_EXTEND_MERGE_PASSTHRU & FP_ROUND_MERGE_PASSTHRU ISD nodes, used to lower scalable vector fp_extend/fp_round operations. fp_round has an additional argument, the 'trunc' flag, which is an integer of zero or one. This also fixes a warning introduced by the new tests added to sve-split-fcvt.ll, resulting from an implicit TypeSize -> uint64_t cast in SplitVecOp_FP_ROUND. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88321	2020-10-01 12:17:37 +01:00
Krzysztof Parzyszek	db04bec5f1	[SDAG] Do not convert undef to 0 when folding CONCAT/BUILD_VECTOR Differential Revision: https://reviews.llvm.org/D88273	2020-09-29 09:12:26 -05:00
Jay Foad	d6b04f3937	[SDag] Refactor and simplify divergence calculation and checking. NFC.	2020-09-29 14:05:07 +01:00
Qiu Chaofan	c0f8e4c06c	[SelectionDAG] Add guard to automatically insert flags This is like FastMathFlagGuard in IR. Since we use SDAG instance to get values, it's with SelectionDAG. By creating a FlagInserter in current scope, all values created by getNode will get the flags if no Flags argument provided. In this patch, I applied it to floating point operations folding part in DAG combiner, and removed Flags passing to getNode to show its effect. Other places in DAG combiner and other helper methods similar to getNode also need this. They can be done in follow-up patches. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D87361	2020-09-26 13:57:52 +08:00
David Sherwood	e077367a28	[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits An existing function Type::getScalarSizeInBits returns a uint64_t instead of a TypeSize class because the caller is requesting a scalar size, which cannot be scalable. This patch makes other similar functions requesting a scalar size consistent with that, thereby eliminating more than 1000 implicit TypeSize -> uint64_t casts. Differential revision: https://reviews.llvm.org/D87889	2020-09-23 09:20:08 +01:00
Simon Pilgrim	d967aaa8fa	[DAG] BuildVectorSDNode::getSplatValue - pull out repeated getNumOperands() calls. NFCI.	2020-09-18 16:10:23 +01:00
Craig Topper	c193a689b4	[SelectionDAG] Use Align/MaybeAlign in calls to getLoad/getStore/getExtLoad/getTruncStore. The versions that take 'unsigned' will be removed in the future. I tried to use getOriginalAlign instead of getAlign in some places. getAlign factors in the minimum alignment implied by the offset in the pointer info. Since we're also passing the pointer info we can use the original alignment. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87592	2020-09-14 13:54:50 -07:00
Craig Topper	56b33391d3	[SelectionDAG] Move ISD:PARITY formation from DAGCombine to SimplifyDemandedBits. Previously, we formed ISD::PARITY by looking for (and (ctpop X), 1) but the AND might be separated from the ctpop. For example if the parity result is multiplied by 2, we'll pull the AND through the shift. So to handle more cases, move to SimplifyDemandedBits where we can handle more cases that result in only the LSB of the CTPOP being used.	2020-09-13 21:04:13 -07:00
Simon Pilgrim	d816499f95	[KnownBits] Move SelectionDAG::computeKnownBits ISD::ABS handling to KnownBits::abs Move the ISD::ABS handling to a KnownBits::abs handler, to simplify future implementations in ValueTracking/GlobalISel.	2020-09-09 13:22:58 +01:00
Simon Wallis	79ea83e104	[SelectionDAG] memcpy expansion of const volatile struct ignores const zero In getMemcpyLoadsAndStores(), a memcpy where the source is a zero constant is expanded to a MemOp::Set instead of a MemOp::Copy, even when the memcpy is volatile. This is incorrect. The fix is to add a check for volatile, and expand to MemOp::Copy in the volatile case. Reviewed By: chill Differential Revision: https://reviews.llvm.org/D87134	2020-09-07 13:22:09 +01:00
Jay Foad	5350e1b509	[KnownBits] Implement accurate unsigned and signed max and min Use the new implementation in ValueTracking, SelectionDAG and GlobalISel. Differential Revision: https://reviews.llvm.org/D87034	2020-09-07 09:09:01 +01:00
David Sherwood	73a3d350a4	[SVE][CodeGen] Fix up warnings in sve-split-insert/extract tests I have fixed up some more ElementCount/TypeSize related warnings in the following tests: CodeGen/AArch64/sve-split-extract-elt.ll CodeGen/AArch64/sve-split-insert-elt.ll In SelectionDAG::CreateStackTemporary we were relying upon the implicit cast from TypeSize -> uint64_t when calling MachineFrameInfo::CreateStackObject. I've fixed this by passing in the known minimum size instead, which I believe is fine because the associated stack id indicates whether this is a scalable object or not. I've also fixed up a case in TargetLowering::SimplifyDemandedBits when extracting a vector element from a scalable vector. The result is a scalar, hence it wasn't caught at the start of the function. If the vector is scalable we just bail out for now. Differential Revision: https://reviews.llvm.org/D86431	2020-09-04 09:51:31 +01:00
Simon Pilgrim	1673a08044	SelectionDAG.h - remove unnecessary FunctionLoweringInfo.h include. NFCI. Use forward declarations and move the include down to dependent files that actually use it. This also exposes a number of implicit dependencies on KnownBits.h	2020-09-03 18:33:25 +01:00
Matt Arsenault	759482ddaa	GlobalISel: Implement computeKnownBits for G_BSWAP and G_BITREVERSE	2020-09-01 12:49:57 -04:00
David Sherwood	9fbb113247	[SVE][CodeGen] Fix TypeSize/ElementCount related warnings in sve-split-load.ll I have fixed up a number of warnings resulting from TypeSize -> uint64_t casts and calling getVectorNumElements() on scalable vector types. I think most of the changes are fairly trivial except for those in DAGTypeLegalizer::SplitVecRes_MLOAD I've tried to ensure we create the MachineMemoryOperands in a sensible way for scalable vectors. I have added a CHECK line to the following test: CodeGen/AArch64/sve-split-load.ll that ensures no new warnings are added. Differential Revision: https://reviews.llvm.org/D86697	2020-09-01 07:47:59 +01:00
Paul Walker	73ac3c0ede	[SVE] Lower scalable vector ISD::FNEG operations. Also updates isConstOrConstSplatFP to allow the mul(A,-1) -> neg(A) transformation when -1 is expressed as an ISD::SPLAT_VECTOR. Differential Revision: https://reviews.llvm.org/D86415	2020-08-25 11:22:28 +01:00
David Sherwood	88bbd30736	[SVE][CodeGen] Fix issues with EXTRACT_SUBVECTOR when using scalable FP vectors In this patch I have fixed two issues: 1. Our SVE tuple get/set intrinsics were using the wrong constant type for the index passed to EXTRACT_SUBVECTOR. I have fixed this by using the function SelectionDAG::getVectorIdxConstant to create the value. Also, I have updated the documentation for EXTRACT_SUBVECTOR describing what type the constant index should be and we now enforce this when creating the node. 2. The AArch64 backend was missing the appropriate patterns for extracting certain subvectors (nxv4f16 and nxv2f32) from legal SVE types. I have added them as part of this patch. The only way that I could find to test the new patterns was to use the SVE tuple get intrinsics, although I realise it looks a bit unusual. Tests added here: test/CodeGen/AArch64/sve-extract-subvector.ll Differential Revision: https://reviews.llvm.org/D85516	2020-08-12 08:35:46 +01:00
Kerry McLaughlin	455ed56d48	[SVE][CodeGen] Legalisation of INSERT_VECTOR_ELT for scalable vectors When the result type of insertelement needs to be split, SplitVecRes_INSERT_VECTOR_ELT will try to store the vector to a stack temporary, store the element at the location of the stack temporary plus the index, and reload the Lo/Hi parts. This patch does the following to ensure this works for scalable vectors: - Sets the StackID with getStackIDForScalableVectors() in CreateStackTemporary - Adds an IsScalable flag to getMemBasePlusOffset() and scales the offset by VScale when this is true - Ensures the immediate is clamped correctly by clampDynamicVectorIndex so that we don't try to use an out of range index Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D84874	2020-08-11 12:57:28 +01:00
Kerry McLaughlin	85c7e89f3b	[CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize Changes the Offset arguments to both functions from int64_t to TypeSize & updates all uses of the functions to create the offset using TypeSize::Fixed() Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85220	2020-08-11 12:17:10 +01:00
Simon Pilgrim	66a163f328	[DAG] GetDemandedBits - remove custom AND handling. As mentioned on D85463, we should be using SimplifyMultipleUseDemandedBits (which is the default fallback). The minor regression in illegal-bitfield-loadstore.ll will be addressed properly by D77804.	2020-08-07 12:55:47 +01:00
Simon Pilgrim	fcefb53222	Remove unreachable break. NFC	2020-08-07 12:37:49 +01:00

1 2 3 4 5 ...

2070 Commits