456 Commits

Author SHA1 Message Date
Benjamin Kramer
27361a7124 SimplifyCFG: GEPs with constant indices are cheap enough to be executed unconditionally.
llvm-svn: 126445
2011-02-24 22:46:11 +00:00
Benjamin Kramer
8d6a8c130b SimplifyCFG: Track the number of used icmps when turning a icmp chain into a switch. If we used only one icmp, don't turn it into a switch.
Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler.

llvm-svn: 125056
2011-02-07 22:37:28 +00:00
Benjamin Kramer
62aa46b852 SimplifyCFG: Also transform switches that represent a range comparison but are not sorted into sub+icmp.
This transforms another 1000 switches in gcc.c.

llvm-svn: 124826
2011-02-03 22:51:41 +00:00
Benjamin Kramer
f4ea1d5f79 SimplifyCFG: Turn switches into sub+icmp+branch if possible.
This makes the job of the later optzn passes easier, allowing the vast amount of
icmp transforms to chew on it.

We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting
binary on i386-linux.

The testcase from README.txt now compiles into
  decl  %edi
  cmpl  $3, %edi
  sbbl  %eax, %eax
  andl  $1, %eax
  ret

llvm-svn: 124724
2011-02-02 15:56:22 +00:00
Evan Cheng
d983eba7dc Re-apply r124518 with fix. Watch out for invalidated iterator.
llvm-svn: 124526
2011-01-29 04:46:23 +00:00
Evan Cheng
65b8ccf6ac Revert r124518. It broke Linux self-host.
llvm-svn: 124522
2011-01-29 02:43:04 +00:00
Evan Cheng
d4eff31476 Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand.
llvm-svn: 124518
2011-01-29 01:29:26 +00:00
Evan Cheng
aaa9606b2f Revert r124462. There are a few big regressions that I need to fix first.
llvm-svn: 124478
2011-01-28 07:12:38 +00:00
Evan Cheng
417fca86c4 - Stop simplifycfg from duplicating "ret" instructions into unconditional
branches. PR8575, rdar://5134905, rdar://8911460.
- Allow codegen tail duplication to dup small return blocks after register
  allocation is done.

llvm-svn: 124462
2011-01-28 02:19:21 +00:00
Frits van Bommel
8e158495f1 Factor the actual simplification out of SimplifyIndirectBrOnSelect and into a new helper function so it can be reused in e.g. an upcoming SimplifySwitchOnSelect.
No functional change.

llvm-svn: 123234
2011-01-11 12:52:11 +00:00
Chris Lattner
6b8b4855ff simplify this a bit.
llvm-svn: 122156
2010-12-18 20:22:49 +00:00
Benjamin Kramer
e5f49c4ff2 SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build.
llvm-svn: 122054
2010-12-17 10:48:14 +00:00
Chris Lattner
d14b0f1db7 improve switch formation to handle small range
comparisons formed by comparisons.  For example,
this:

void foo(unsigned x) {
  if (x == 0 || x == 1 || x == 3 || x == 4 || x == 6) 
    bar();
}

compiles into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$6, %edi
	ja	LBB0_2
## BB#1:                                ## %entry
	movl	%edi, %eax
	movl	$91, %ecx
	btq	%rax, %rcx
	jb	LBB0_3

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$2, %edi
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	cmpl	$6, %edi
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movl	%edi, %eax
	movl	$88, %ecx
	btq	%rax, %rcx
	jb	LBB0_4

This catches a bunch of cases in GCC, which look like this:

 %804 = load i32* @which_alternative, align 4, !tbaa !0
 %805 = icmp ult i32 %804, 2
 %806 = icmp eq i32 %804, 3
 %or.cond121 = or i1 %805, %806
 %807 = icmp eq i32 %804, 4
 %or.cond124 = or i1 %or.cond121, %807
 br i1 %or.cond124, label %.thread, label %808

turning this into a range comparison.

llvm-svn: 122045
2010-12-17 06:20:15 +00:00
Chris Lattner
e893e2601e make qsort predicate more conformant by returning 0 for equal values.
llvm-svn: 121838
2010-12-15 04:52:41 +00:00
Chris Lattner
7499b452c1 - Insert new instructions before DomBlock's terminator,
which is simpler than finding a place to insert in BB.
 - Don't perform the 'if condition hoisting' xform on certain
   i1 PHIs, as it interferes with switch formation.

This re-fixes "example 7", without breaking the world hopefully.

llvm-svn: 121764
2010-12-14 08:46:09 +00:00
Chris Lattner
335f0e4ad4 fix two significant issues with FoldTwoEntryPHINode:
first, it can kick in on blocks whose conditions have been
folded to a constant, even though one of the edges will be
trivially folded.

second, it doesn't clean up the "if diamond" that it just 
eliminated away.  This is a problem because other simplifycfg
xforms kick in depending on the order of block visitation,
causing pointless work.

llvm-svn: 121762
2010-12-14 08:01:53 +00:00
Chris Lattner
dc20a7d38c remove the instsimplify logic I added in r121754. It is apparently
breaking the selfhost builds, though I can't fathom how.

llvm-svn: 121761
2010-12-14 07:53:03 +00:00
Chris Lattner
9ac168d0ab clean up logic, convert std::set to SmallPtrSet, handle the case
when all 2-entry phis are simplified away.

llvm-svn: 121760
2010-12-14 07:41:39 +00:00
Chris Lattner
9fd838d31b tidy up a bit, move DEBUG down to when we commit to doing the transform so we
don't print it unless the xform happens.

llvm-svn: 121758
2010-12-14 07:23:10 +00:00
Chris Lattner
b42d293faa use SimplifyInstruction instead of reimplementing part of it.
llvm-svn: 121757
2010-12-14 07:20:29 +00:00
Chris Lattner
fb73de482c simplify GetIfCondition by using getSinglePredecessor.
llvm-svn: 121756
2010-12-14 07:15:21 +00:00
Chris Lattner
0f4d67bd88 use AddPredecessorToBlock in 3 places instead of a manual loop.
llvm-svn: 121755
2010-12-14 07:09:42 +00:00
Chris Lattner
a07cc6f4fd make FoldTwoEntryPHINode use instsimplify a bit, make
GetIfCondition faster by avoiding pred_iterator.  No
really interesting change.

llvm-svn: 121754
2010-12-14 07:00:00 +00:00
Chris Lattner
d7beca3782 improve DEBUG's a bit, switch to eraseFromParent() to simplify
code a bit, switch from constant folding to instsimplify.

llvm-svn: 121751
2010-12-14 06:17:25 +00:00
Chris Lattner
5a9d59d918 reapply my recent change that disables a piece of the switch formation
work, but fixes 400.perlbmk.

llvm-svn: 121749
2010-12-14 05:57:30 +00:00
Owen Anderson
3e5648896e Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state
where I'm confident there were no crashes or miscompilations.  XFAIL the test added since then for now.

llvm-svn: 121733
2010-12-13 23:49:28 +00:00
Chris Lattner
a6e5d5694a temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk.
llvm-svn: 121728
2010-12-13 23:02:19 +00:00
Chris Lattner
2d434e594e add some DEBUG's.
llvm-svn: 121711
2010-12-13 19:55:30 +00:00
Benjamin Kramer
1e155ab7e1 Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780.
llvm-svn: 121705
2010-12-13 18:20:38 +00:00
Chris Lattner
fb836f8c1a reinstate my patch: the miscompile was caused by an inverted branch in the
'and' case.

llvm-svn: 121695
2010-12-13 08:12:19 +00:00
Chris Lattner
79db357d80 Completely disable the optimization I added in r121680 until
I can track down a miscompile.  This should bring the buildbots
back to life

llvm-svn: 121693
2010-12-13 07:41:29 +00:00
Chris Lattner
fbeb55844b Make simplifycfg reprocess newly formed "br (cond1 | cond2)" conditions
when simplifying, allowing them to be eagerly turned into switches.  This
is the last step required to get "Example 7" from this blog post:
http://blog.regehr.org/archives/320

On X86, we now generate this machine code, which (to my eye) seems better
than the ICC generated code:

_crud:                                  ## @crud
## BB#0:                                ## %entry
	cmpb	$33, %dil
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	addb	$-34, %dil
	cmpb	$58, %dil
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movzbl	%dil, %eax
	movabsq	$288230376537592865, %rcx ## imm = 0x400000017001421
	btq	%rax, %rcx
	jb	LBB0_4
LBB0_3:                                 ## %lor.rhs
	xorl	%eax, %eax
	ret
LBB0_4:                                 ## %lor.end
	movl	$1, %eax
	ret

llvm-svn: 121690
2010-12-13 07:00:06 +00:00
Chris Lattner
1d05761df4 make this logic a bit simpler.
llvm-svn: 121689
2010-12-13 06:36:51 +00:00
Chris Lattner
25c3af35d8 split all the guts of SimplifyCFGOpt::run out into one function
per terminator kind.

llvm-svn: 121688
2010-12-13 06:25:44 +00:00
Chris Lattner
cb570f87e5 fix a bug in r121680 that upset the various buildbots.
llvm-svn: 121687
2010-12-13 05:34:18 +00:00
Chris Lattner
a6db741f3d refactor the speculative execution logic to be factored into the cond branch code instead of
doing a cfg search for every block simplified.

llvm-svn: 121686
2010-12-13 05:26:52 +00:00
Chris Lattner
466f54ffcf simplify a bunch of code.
llvm-svn: 121685
2010-12-13 05:20:28 +00:00
Chris Lattner
6df7bdd810 move HoistThenElseCodeToIf up to a more logical and efficient-to-handle place.
llvm-svn: 121684
2010-12-13 05:15:29 +00:00
Chris Lattner
2e3832d9a0 move 'MergeBlocksIntoPredecessor' call earlier. Use
getSinglePredecessor to simplify code.

llvm-svn: 121683
2010-12-13 05:10:48 +00:00
Chris Lattner
a69c443459 factor new code out to a SimplifyBranchOnICmpChain helper function.
llvm-svn: 121681
2010-12-13 05:03:41 +00:00
Chris Lattner
a442f24a36 enhance the "change or icmp's into switch" xform to handle one value in an
'or sequence' that it doesn't understand.  This allows us to optimize
something insane like this:

int crud (unsigned char c, unsigned x)
 {
   if(((((((((( (int) c <= 32 ||
                    (int) c == 46) || (int) c == 44)
                  || (int) c == 58) || (int) c == 59) || (int) c == 60)
               || (int) c == 62) || (int) c == 34) || (int) c == 92)
            || (int) c == 39) != 0)
     foo();
 }

into:

define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone {
entry:
  %cmp = icmp ult i8 %c, 33
  br i1 %cmp, label %if.then, label %switch.early.test

switch.early.test:                                ; preds = %entry
  switch i8 %c, label %if.end [
    i8 39, label %if.then
    i8 44, label %if.then
    i8 58, label %if.then
    i8 59, label %if.then
    i8 60, label %if.then
    i8 62, label %if.then
    i8 46, label %if.then
    i8 92, label %if.then
    i8 34, label %if.then
  ]

by pulling the < comparison out ahead of the newly formed switch.

llvm-svn: 121680
2010-12-13 04:50:38 +00:00
Chris Lattner
5a177e681e merge two very similar functions into one that has a bool argument.
llvm-svn: 121678
2010-12-13 04:26:26 +00:00
Chris Lattner
9b1af510cb don't bother handling non-canonical icmp's
llvm-svn: 121676
2010-12-13 04:18:32 +00:00
Chris Lattner
395252d93e inline a function, making the result much simpler.
llvm-svn: 121675
2010-12-13 04:15:19 +00:00
Chris Lattner
62cc76e9cc Fix my previous patch to handle a degenerate case that the llvm-gcc
bootstrap buildbot tripped over.

llvm-svn: 121674
2010-12-13 03:43:57 +00:00
Chris Lattner
11dafaa3ec convert some methods to be static functions
llvm-svn: 121673
2010-12-13 03:30:12 +00:00
Chris Lattner
4642d79fb0 zap two more std::sorts.
llvm-svn: 121672
2010-12-13 03:24:30 +00:00
Chris Lattner
d9bacc088a fix a fairly serious oversight with switch formation from
or'd conditions.  Previously we'd compile something like this:

int crud (unsigned char c) {
   return c == 62 || c == 34 || c == 92;
}

into:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  %cmp8 = icmp eq i8 %c, 92
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which failed to merge the compare-with-92 into the switch.  With this patch
we simplify this all the way to:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
    i8 92, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which is much better for codegen's switch lowering stuff.  This kicks in 33 times
on 176.gcc (for example) cutting 103 instructions off the generated code.

llvm-svn: 121671
2010-12-13 03:18:54 +00:00
Chris Lattner
7c8e6047d6 convert an std::sort to array_pod_sort.
llvm-svn: 121669
2010-12-13 02:00:58 +00:00
Chris Lattner
1475987634 move the "br (X == 0 | X == 1), T, F" -> switch optimization to a new
location in simplifycfg.  In the old days, SimplifyCFG was never run on
the entry block, so we had to scan over all preds of the BB passed into
simplifycfg to do this xform, now we can just check blocks ending with
a condbranch.  This avoids a scan over all preds of every simplified 
block, which should be a significant compile-time perf win on functions
with lots of edges.  No functionality change.

llvm-svn: 121668
2010-12-13 01:57:34 +00:00