llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-17 14:56:07 +00:00

Author	SHA1	Message	Date
Nadav Rotem	018921002e	Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. llvm-svn: 160221	2012-07-14 21:30:27 +00:00
Owen Anderson	b8844d6744	Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization. This is a speculative fix for a problem on Mips reported by Akira Hatanaka. llvm-svn: 160036	2012-07-11 06:38:55 +00:00
Nadav Rotem	d908ddc186	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991	2012-07-10 13:25:08 +00:00
Owen Anderson	d4b841f8f9	Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values. Previously, this would become an integer extension operation, followed by a real integer->float conversion. llvm-svn: 159957	2012-07-09 20:31:12 +00:00
Evan Cheng	4c6f917d34	Make sure type is not extended or untyped before create a constant of the type. No test case. Found by inspection. llvm-svn: 159179	2012-06-26 01:19:33 +00:00
Lang Hames	b8650f106a	Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (E.g. FMAs). The behavior of this option is intended to match the behaviour specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't effect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956	2012-06-22 01:09:09 +00:00
Pete Cooper	5b61422d80	Fix potential crash if DAGCombine on stores sees a half type llvm-svn: 158927	2012-06-21 18:00:39 +00:00
Pete Cooper	fe5b84b404	Add users of a MERGE_VALUE node to the worklist to process again when the node is removed. Sorry, no test case. Foudn it by inspection of the code llvm-svn: 158839	2012-06-20 19:35:43 +00:00
Hal Finkel	8a31138521	Fix DAGCombine to deal with ext-conversion of pre/post_inc loads. The test case for this will come with the PPC indexed preinc loads commit. llvm-svn: 158822	2012-06-20 15:42:48 +00:00
Lang Hames	39fb1d08dc	Add DAG-combines for aggressive FMA formation. This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757	2012-06-19 22:51:23 +00:00
Lang Hames	a33db65bd9	Make comment slightly more helpful. llvm-svn: 158467	2012-06-14 20:37:15 +00:00
Owen Anderson	0eda3e1de6	Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention. llvm-svn: 157708	2012-05-30 18:54:50 +00:00
Owen Anderson	c7aaf523e1	Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node. llvm-svn: 157707	2012-05-30 18:50:39 +00:00
Jim Grosbach	92f6adc8be	DAGCombiner should not change the type of an extract_vector index. When a combine twiddles an extract_vector, care should be take to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately. rdar://11391009 llvm-svn: 156419	2012-05-08 20:56:07 +00:00
Owen Anderson	ab63d84252	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	41b0665b5b	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	b5f167c660	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Elena Demikhovsky	8d7e56c409	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Jakob Stoklund Olesen	beb9469d5c	Register DAGUpdateListeners with SelectionDAG. Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. llvm-svn: 155248	2012-04-20 22:08:46 +00:00
Hal Finkel	e0cf6397fd	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Nadav Rotem	9d376b6578	Reapply 154397. Original message: Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154490	2012-04-11 08:26:11 +00:00
Duncan Sands	4f53074cca	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Owen Anderson	3efc8f22bd	Revert r154397, which was causing make check failures on the buildbots. llvm-svn: 154414	2012-04-10 18:02:12 +00:00
Nadav Rotem	065564d85a	Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154397	2012-04-10 14:58:31 +00:00
Anton Korobeynikov	4d1220de34	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Rafael Espindola	1d9672bdce	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Rafael Espindola	8f62b3248e	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Craig Topper	9c3da316ec	Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309	2012-04-09 07:19:09 +00:00
Craig Topper	e5893f64e8	Remove unnecessary 'else' on an 'if' that always returns llvm-svn: 154308	2012-04-09 05:59:53 +00:00
Craig Topper	e3ad4834ae	Optimize code slightly. No functionality change. llvm-svn: 154307	2012-04-09 05:55:33 +00:00
Craig Topper	5894fe430a	Replace some explicit checks with asserts for conditions that should never happen. llvm-svn: 154305	2012-04-09 05:16:56 +00:00
Benjamin Kramer	bb6ff08766	Silence sign-compare warning. llvm-svn: 154297	2012-04-08 19:04:45 +00:00
Duncan Sands	2f1dc3814b	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Nadav Rotem	71d07ae5cb	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	5f8397a934	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Owen Anderson	98f2c0c384	Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901	2012-04-02 22:10:29 +00:00
Nadav Rotem	702f080767	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Nadav Rotem	b078350872	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Chris Lattner	1cc25e8a40	fix what looks like a real logic bug, found by PVS-Studio (part of PR12357) llvm-svn: 153513	2012-03-27 16:27:21 +00:00
Craig Topper	aaeae98936	When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend. llvm-svn: 153078	2012-03-20 05:28:39 +00:00
Duncan Sands	3fb2fc6edb	Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala. llvm-svn: 153035	2012-03-19 15:35:44 +00:00
Nadav Rotem	6fd1d32c63	When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations. llvm-svn: 152784	2012-03-15 08:49:06 +00:00
Bill Wendling	df170db2f6	Add a xform to the DAG combiner. Transform: (fsub x, (fadd x, y)) -> (fneg y) and (fsub x, (fadd y, x)) -> (fneg y) if 'unsafe math' is specified. <rdar://problem/7540295> llvm-svn: 152777	2012-03-15 05:12:00 +00:00
Evan Cheng	d5f8e5766c	Fortify r152675 a bit. Although I'm not able to come up with a test case that would trigger the truncation case. llvm-svn: 152678	2012-03-13 22:16:11 +00:00
Evan Cheng	7bf83096df	DAG combine incorrectly optimize (i32 vextract (v4i16 load $addr), c) to (i16 load $addr+csizeof(i16)) and replace uses of (i32 vextract) with the i16 load. It should issue an extload instead: (i32 extload $addr+csizeof(i16)). rdar://11035895 llvm-svn: 152675	2012-03-13 22:00:52 +00:00
Benjamin Kramer	e1e549d617	Give dagcombiner's worklist some inline capacity. llvm-svn: 152454	2012-03-10 00:23:58 +00:00
Evan Cheng	80893ce5f5	Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation). llvm-svn: 152162	2012-03-06 23:33:32 +00:00
Owen Anderson	2ee7c4dfc5	Make it possible for a target to mark FSUB as Expand. This requires providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal. llvm-svn: 152079	2012-03-06 00:29:31 +00:00
James Molloy	862fe49c55	Teach the DAGCombiner that certain loadext nodes followed by ANDs can be converted to zeroexts. llvm-svn: 150957	2012-02-20 12:02:38 +00:00

1 2 3 4 5 ...

892 Commits