llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-13 19:46:04 +00:00

Author	SHA1	Message	Date
Matt Arsenault	bf0db918b2	R600: Implement getRecipEstimate This requires a new hook to prevent expanding sqrt in terms of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are all the same rate, so this expansion would just double the number of instructions and cycles. llvm-svn: 225828	2015-01-13 20:53:23 +00:00
Olivier Sallenave	325096980b	Added TLI hook for isFPExtFree. Some of the FMA combine heuristics are now guarded with that hook. llvm-svn: 225795	2015-01-13 15:06:36 +00:00
Matt Arsenault	a982e4f82b	Combine fcmp + select to fminnum / fmaxnum if no nans and legal Also require unsafe FP math for now since there isn't a way to test for signed zeros. llvm-svn: 225744	2015-01-13 00:43:00 +00:00
Hal Finkel	0ce7f372e5	[DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities) As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA, we need to extend the incoming operands so that the resulting node will really be legal. This is currently enabled only for PowerPC, and it happens to work there regardless, but this should fix the functionality for everyone else should anyone else wish to use it. llvm-svn: 225492	2015-01-09 01:29:29 +00:00
Hal Finkel	33ead6f901	Partial fix to r225380 (More FMA folding opportunities) As pointed out by Aditya (and Owen), there are two things wrong with this code. First, it adds patterns which elide FP extends when forming FMAs, and that might not be profitable on all targets (it belongs behind the pre-existing aggressive-FMA-formation flag). This is fixed by this change. Second, the resulting nodes might have operands of different types (the extensions need to be re-added). That will be fixed in the follow-up commit. llvm-svn: 225485	2015-01-09 00:45:54 +00:00
Ahmed Bougacha	2b6917b020	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Olivier Sallenave	0451532996	More FMA folding opportunities. llvm-svn: 225380	2015-01-07 20:54:17 +00:00
Olivier Sallenave	e64ad7cedd	Test commit llvm-svn: 225368	2015-01-07 19:45:17 +00:00
Craig Topper	d3c02f177a	Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert. llvm-svn: 225160	2015-01-05 10:15:49 +00:00
Alexey Samsonov	553185ee4b	Revert "merge consecutive stores of extracted vector elements" This reverts commit r224611. This change causes crashes in X86 DAG->DAG Instruction Selection. llvm-svn: 225031	2014-12-31 00:40:28 +00:00
Mehdi Amini	d38920891e	Always assert in DAGCombine and not only when -debug is enabled Right now DAG Combine checks the validity of the returned type only when -debug is given on the command line. However, the test cases in the validation usually do not use -debug. An Assert build should always check this. llvm-svn: 224779	2014-12-23 18:59:02 +00:00
Michael Kuperstein	f4536ea6e8	[DagCombine] Improve DAGCombiner BUILD_VECTOR when it has two sources of elements This partially fixes PR21943. For AVX, we go from: vmovq (%rsi), %xmm0 vmovq (%rdi), %xmm1 vpermilps $-27, %xmm1, %xmm2 ## xmm2 = xmm1[1,1,2,3] vinsertps $16, %xmm2, %xmm1, %xmm1 ## xmm1 = xmm1[0],xmm2[0],xmm1[2,3] vinsertps $32, %xmm0, %xmm1, %xmm1 ## xmm1 = xmm1[0,1],xmm0[0],xmm1[3] vpermilps $-27, %xmm0, %xmm0 ## xmm0 = xmm0[1,1,2,3] vinsertps $48, %xmm0, %xmm1, %xmm0 ## xmm0 = xmm1[0,1,2],xmm0[0] To the expected: vmovq (%rdi), %xmm0 vmovhpd (%rsi), %xmm0, %xmm0 retq Fixing this for AVX2 is still open. Differential Revision: http://reviews.llvm.org/D6749 llvm-svn: 224759	2014-12-23 08:59:45 +00:00
Sanjay Patel	0428a5786e	merge consecutive stores of extracted vector elements Add a path to DAGCombiner::MergeConsecutiveStores() to combine multiple scalar stores when the store operands are extracted vector elements. This is a partial fix for PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ). For the new test case, codegen improves from: vmovss %xmm0, (%rdi) vextractps $1, %xmm0, 4(%rdi) vextractps $2, %xmm0, 8(%rdi) vextractps $3, %xmm0, 12(%rdi) vextractf128 $1, %ymm0, %xmm0 vmovss %xmm0, 16(%rdi) vextractps $1, %xmm0, 20(%rdi) vextractps $2, %xmm0, 24(%rdi) vextractps $3, %xmm0, 28(%rdi) vzeroupper retq To: vmovups %ymm0, (%rdi) vzeroupper retq Patch reviewed by Nadav Rotem. Differential Revision: http://reviews.llvm.org/D6698 llvm-svn: 224611	2014-12-19 20:23:41 +00:00
Michael Kuperstein	047b1a0400	[DAGCombine] Slightly improve lowering of BUILD_VECTOR into a shuffle. This handles the case of a BUILD_VECTOR being constructed out of elements extracted from a vector twice the size of the result vector. Previously this was always scalarized. Now, we try to construct a shuffle node that feeds on extract_subvectors. This fixes PR15872 and provides a partial fix for PR21711. Differential Revision: http://reviews.llvm.org/D6678 llvm-svn: 224429	2014-12-17 12:32:17 +00:00
Matt Arsenault	810cb62962	Add target hook for whether it is profitable to reduce load widths Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084	2014-12-12 00:00:24 +00:00
Owen Anderson	558012a3fc	Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64. llvm-svn: 223760	2014-12-09 06:50:39 +00:00
Simon Pilgrim	be24ab367b	[InstCombine] Minor optimization for bswap with binary ops Added instcombine optimizations for BSWAP with AND/OR/XOR ops: OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) ) OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) ) Since it's just a one liner, I've also added BSWAP to the DAGCombiner equivalent as well: fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y)) Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone. Differential Revision: http://reviews.llvm.org/D6407 llvm-svn: 223349	2014-12-04 09:44:01 +00:00
Elena Demikhovsky	f1de34b84d	Masked Load / Store Intrinsics - the CodeGen part. I'm recommitting the codegen part of the patch. The vectorizer part will be sent to review again. Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 223348	2014-12-04 09:40:44 +00:00
Duncan P. N. Exon Smith	9bc81fbe92	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
Elena Demikhovsky	9e5089a938	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Andrea Di Biagio	0225b5bf6f	[DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero. Before this patch, the DAGCombiner only tried to convert build_vector dag nodes into shuffles if all operands were either extract_vector_elt or undef. This patch improves that logic and teaches the DAGCombiner how to deal with build_vector dag nodes where one or more operands are zero. A build_vector dag node with some zero operands is turned into a shuffle only if the resulting shuffle mask is legal for the target. llvm-svn: 222536	2014-11-21 14:32:06 +00:00
Andrea Di Biagio	26e8f4d166	[DAG] Refactor the shuffle combining logic in DAGCombiner. NFC. This patch simplifies the logic that combines a pair of shuffle nodes into a single shuffle if there is a legal mask. Also added comments to better describe the algorithm. No functional change intended. llvm-svn: 222522	2014-11-21 11:33:07 +00:00
Hao Liu	44e5d7a131	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor into FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such a combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510	2014-11-21 06:39:58 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Oliver Stannard	d29db9b949	Fix optimisations of SELECT_CC which assumed result is boolean Some optimisations in DAGCombiner cause miscompilations for targets that use TargetLowering::UndefinedBooleanContent, because they assume that the results of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and XORed. These optimisations are only valid for targets that use ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent. This is a follow-up to D6210/r221693. llvm-svn: 222123	2014-11-17 10:49:31 +00:00
Andrea Di Biagio	e13a0b81f4	[DAG] Improved target independent vector shuffle folding logic. This patch teaches the DAGCombiner how to combine shuffles according to rules: shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2) llvm-svn: 222090	2014-11-15 22:56:25 +00:00
Oliver Stannard	8c2c67e63c	LLVM incorrectly folds xor into select LLVM replaces the SelectionDAG pattern (xor (set_cc cc x y) 1) with (set_cc !cc x y), which is only correct when the xor has type i1. Instead, we should check that the constant operand to the xor is all ones. llvm-svn: 221693	2014-11-11 17:36:01 +00:00
Andrea Di Biagio	ce46b97b48	[X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. This patch improves the folding of vector AND nodes into blend operations for targets that feature SSE4.1. A vector AND node where one of the operands is a constant build_vector with elements that are either zero or all-ones can be converted into a blend. This allows for example to simplify the following code: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1> %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0> %3 = or <4 x i32> %1, %2 ret <4 x i32> %3 } Before this patch llc (-mcpu=corei7) generated: andps LCPI1_0(%rip), %xmm0, %xmm0 andps LCPI1_1(%rip), %xmm1, %xmm1 orps %xmm1, %xmm0, %xmm0 retq With this patch we generate a single 'vpblendw'. llvm-svn: 221343	2014-11-05 13:04:14 +00:00
Paul Robinson	ad06e430ce	Normally an 'optnone' function goes through fast-isel, which does not call DAGCombiner. But we ran into a case (on Windows) where the calling convention causes argument lowering to bail out of fast-isel, and we end up in CodeGenAndEmitDAG() which does run DAGCombiner. So, we need to make DAGCombiner check for 'optnone' after all. Commit includes the test that found this, plus another one that got missed in the original optnone work. llvm-svn: 221168	2014-11-03 18:19:26 +00:00
Louis Gerbarg	e8f9c78247	Fix incorrect invariant check in DAG Combine Earlier this summer I fixed an issue where we were incorrectly combining multiple loads that had different constraints such as alignment, invariance, temporality, etc. Apparently in one case I made a copy-paste error and swapped alignment and invariance. Tests included. rdar://18816719 llvm-svn: 220933	2014-10-30 22:21:03 +00:00
NAKAMURA Takumi	f51a34ec1f	Whitespace. llvm-svn: 220857	2014-10-29 15:23:11 +00:00
Sanjay Patel	957efc23bb	Use rsqrt (X86) to speed up reciprocal square root calcs This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate(). See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 llvm-svn: 220570	2014-10-24 17:02:16 +00:00
Benjamin Kramer	7ad22403fb	Strength reduce constant-sized vectors into arrays. No functionality change. llvm-svn: 220412	2014-10-22 19:55:26 +00:00
Matt Arsenault	7c93690be0	Add minnum / maxnum codegen llvm-svn: 220342	2014-10-21 23:01:01 +00:00
Jan Vesely	af62cf4db0	SelectionDAG: Add sext_inreg optimizations v2: use dyn_cast fixup comments v3: use cast Reviewed-by: Matt Arsenault <arsenm2@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 220044	2014-10-17 14:45:25 +00:00
Sanjay Patel	3d497cd778	Improve sqrt estimate algorithm (fast-math) This patch changes the fast-math implementation for calculating sqrt(x) from: y = 1 / (1 / sqrt(x)) to: y = x * (1 / sqrt(x)) This has 2 benefits: less code / faster code and one less estimate instruction that may lose precision. The only target that will be affected (until http://reviews.llvm.org/D5658 is approved) is PPC. The difference in codegen for PPC is 2 less flops for a single-precision sqrtf or vector sqrtf and 4 less flops for a double-precision sqrt. We also eliminate a constant load and extra register usage. Differential Revision: http://reviews.llvm.org/D5682 llvm-svn: 219445	2014-10-09 21:26:35 +00:00
Eric Christopher	40cba91ad1	Remove unnecessary include. llvm-svn: 219368	2014-10-08 23:38:40 +00:00
Eric Christopher	f55d4714d2	Use both the cached TLI and the subtarget off of the DAG in the DAG combiner. llvm-svn: 219367	2014-10-08 23:38:39 +00:00
Hal Finkel	9808595319	[DAGCombine] Remove SIGN_EXTEND-related inf-loop The patch's author points out that, despite the function's documentation, getSetCCResultType is only used to get the SETCC result type (with one here-removed problematic exception). In one case, getSetCCResultType was being used to get the predicate type to use for a SELECT node, and then SIGN_EXTENDing (or truncating) to get the input predicate to match that type. Unfortunately, this was happening inside visitSIGN_EXTEND, and creating new SIGN_EXTEND nodes was causing an infinite loop. In addition, this behavior was wrong if a target was not using ZeroOrNegativeOneBooleanContent. Lastly, the extension/truncation seems unnecessary here: SELECT is defined as: Select(COND, TRUEVAL, FALSEVAL). If the type of the boolean COND is not i1 then the high bits must conform to getBooleanContents. So here we remove this use of getSetCCResultType and update getSetCCResultType's documentation to reflect its actual uses. Patch by deadal nix! llvm-svn: 219141	2014-10-06 20:19:47 +00:00
Sanjay Patel	7bc9185ab5	Fast-math fold: x / (y * sqrt(z)) -> x * (rsqrt(z) / y) The motivation is to recognize code such as this from /llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c: float distance = sqrt(dx * dx + dy * dy + dz * dz); float mag = dt / (distance * distance * distance); Without this patch, we don't match the sqrt as a reciprocal sqrt, so for PPC the new testcase in this patch produces: addis 3, 2, .LCPI4_2@toc@ha lfs 4, .LCPI4_2@toc@l(3) addis 3, 2, .LCPI4_1@toc@ha lfs 0, .LCPI4_1@toc@l(3) fcmpu 0, 1, 4 beq 0, .LBB4_2 # BB#1: frsqrtes 4, 1 addis 3, 2, .LCPI4_0@toc@ha lfs 5, .LCPI4_0@toc@l(3) fnmsubs 13, 1, 5, 1 fmuls 6, 4, 4 fmadds 1, 13, 6, 5 fmuls 1, 4, 1 fres 4, 1 <--- reciprocal of reciprocal square root fnmsubs 1, 1, 4, 0 fmadds 4, 4, 1, 4 .LBB4_2: fmuls 1, 4, 2 fres 2, 1 fnmsubs 0, 1, 2, 0 fmadds 0, 2, 0, 2 fmuls 1, 3, 0 blr After the patch, this simplifies to: frsqrtes 0, 1 addis 3, 2, .LCPI4_1@toc@ha fres 5, 2 lfs 4, .LCPI4_1@toc@l(3) addis 3, 2, .LCPI4_0@toc@ha lfs 7, .LCPI4_0@toc@l(3) fnmsubs 13, 1, 4, 1 fmuls 6, 0, 0 fnmsubs 2, 2, 5, 7 fmadds 1, 13, 6, 4 fmadds 2, 5, 2, 5 fmuls 0, 0, 1 fmuls 0, 0, 2 fmuls 1, 3, 0 blr Differential Revision: http://reviews.llvm.org/D5628 llvm-svn: 219139	2014-10-06 19:31:18 +00:00
Chandler Carruth	daa1ff985c	[x86, dag] Teach the DAG combiner to prune inputs to a vector_shuffle that are unused. This allows the combiner to delete math feeding shuffles where the math isn't actually necessary. This improves some of the vperm2x128 tests that regressed when the vector shuffle lowering started actually generating vperm instructions rather than forcibly decomposing them. Sadly, this isn't enough to get this really right because we still form a completely unnecessary permutation. To fix that, we also need to fold shuffles which just rearrange concatenated or inserted subvectors. llvm-svn: 219086	2014-10-05 19:14:34 +00:00
Sanjay Patel	ab7f460bca	Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC. llvm-svn: 218700	2014-09-30 20:44:23 +00:00
Sanjay Patel	8fde95cb2b	Split the estimate() interface into separate functions for each type. NFC. It was hacky to use an opcode as a switch because it won't always match (rsqrte != sqrte), and it looks like we'll need to add more special casing per arch than I had hoped for. Eg, x86 will prefer a different NR estimate implementation. ARM will want to use its 'step' instructions. There also don't appear to be any new estimate instructions in any arch in a long, long time. Altivec vloge and vexpte may have been the first and last in that field... llvm-svn: 218698	2014-09-30 20:28:48 +00:00
Andrea Di Biagio	c7c524129b	[DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle. Currently, the DAG Combiner only tries to convert type-legal build_vector nodes into shuffles. This patch simply moves the logic that checks if a build_vector has a legal value type up before we even start analyzing the operands. This allows to early exit immediately from method 'visitBUILD_VECTOR' if the node type is known to be illegal. No functional change intended. llvm-svn: 218677	2014-09-30 15:30:22 +00:00
James Molloy	463db9a77c	[AArch64] Redundant store instructions should be removed as dead code If there is a store followed by a store with the same value to the same location, then the store is dead/noop. It can be removed. This problem is found in spec2006-197.parser. For example, stur w10, [x11, #-4] stur w10, [x11, #-4] Then one of the two stur instructions can be removed. Patch by David Xu! llvm-svn: 218569	2014-09-27 17:02:54 +00:00
Sanjay Patel	bdf1e38856	Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2). This is purely refactoring. No functional changes intended. PowerPC is the only target that is currently using this interface. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) And: z = y / x into: z = y * rcpe(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction along with the number of refinement steps needed to make the estimate usable. Differential Revision: http://reviews.llvm.org/D5484 llvm-svn: 218553	2014-09-26 23:01:47 +00:00
David Xu	418da223dd	Revert patch of r218493 llvm-svn: 218494	2014-09-26 02:28:03 +00:00
David Xu	64f661ee0b	Redundant store instructions should be removed as dead code llvm-svn: 218493	2014-09-26 02:02:09 +00:00
Sanjay Patel	6a42292795	Use SDValue bool operator to reduce code. No functional change. llvm-svn: 218314	2014-09-23 16:24:20 +00:00
Sanjay Patel	b67bd262ea	Refactor reciprocal square root estimate into target-independent function; NFC. This is purely a plumbing patch. No functional changes intended. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . The first step is to add a target hook for RSQRTE, take the already target-independent code selfishly hoarded by PPC, and put it into DAGCombiner. Next steps: The code in DAGCombiner::BuildRSQRTE() should be refactored further; tests that exercise that logic need to be added. Logic in PPCTargetLowering::BuildRSQRTE() should be hoisted into DAGCombiner. X86 and AArch64 overrides for TargetLowering.BuildRSQRTE() should be added. Differential Revision: http://reviews.llvm.org/D5425 llvm-svn: 218219	2014-09-21 15:19:15 +00:00
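
Several of the commits above (r218219, r218553, r219445, r220570) revolve around replacing `y / sqrt(x)` with `y * rsqrte(x)` plus Newton-Raphson refinement. The following is a minimal numeric sketch of that idea in Python, not the DAGCombiner code itself. It uses the textbook single-constant-pair Newton step rather than whatever form a given target selects, and `crude_rsqrt_estimate`, `fast_rsqrt`, and `fast_sqrt` are hypothetical names that simulate a low-precision hardware estimate instruction.

```python
import math


def crude_rsqrt_estimate(x: float) -> float:
    """Hypothetical stand-in for a hardware rsqrt estimate instruction.

    Computes 1/sqrt(x) exactly, then rounds the mantissa to 8 fractional
    bits to mimic the limited precision of an estimate instruction.
    """
    exact = 1.0 / math.sqrt(x)
    mantissa, exponent = math.frexp(exact)      # exact == mantissa * 2**exponent
    mantissa = round(mantissa * 256) / 256      # keep only 8 fractional bits
    return math.ldexp(mantissa, exponent)


def fast_rsqrt(x: float, steps: int = 2) -> float:
    """Refine the crude estimate with `steps` Newton-Raphson iterations.

    Each iteration applies est = est * (1.5 - 0.5 * x * est * est),
    which roughly squares the relative error of the estimate.
    """
    est = crude_rsqrt_estimate(x)
    for _ in range(steps):
        est = est * (1.5 - 0.5 * x * est * est)
    return est


def fast_sqrt(x: float, steps: int = 2) -> float:
    """sqrt(x) computed as x * rsqrt(x), the form described in r219445."""
    return x * fast_rsqrt(x, steps)


if __name__ == "__main__":
    for value in (2.0, 9.0, 1234.5):
        print(value, fast_rsqrt(value), 1.0 / math.sqrt(value))
```

The number of refinement steps is the knob that r218700 describes as target-specified: with `steps=0` the caller gets the raw estimate, and each extra step trades instructions for precision.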


1309 Commits