475 Commits

Author SHA1 Message Date
Nadav Rotem
53d32211b7 FoldBranchToCommonDest merges branches into a single branch with or/and of the condition. It has a heuristics for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristics that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable.
llvm-svn: 194524
2013-11-12 22:37:16 +00:00
Benjamin Kramer
7c30260ab3 SimplifyCFG: Use existing constant folding logic when forming switch tables.
Both simpler and more powerful than the hand-rolled folding logic.

llvm-svn: 194475
2013-11-12 12:24:36 +00:00
Nadav Rotem
5ba1c6ced8 SimplifyCFG has a heuristics for out-of-order processors that decides when it is worthwhile to merge branches. It tries to estimate if the operands of the instruction that we want to hoist are ready. This commit marks function arguments as 'ready' because they require no calculation. This boosts libquantum and a few other workloads from the testsuite.
llvm-svn: 194346
2013-11-10 04:13:31 +00:00
Tom Stellard
e1631ddf93 SimplifyCFG: Don't duplicate calls to functions marked noduplicate v2
v2:
  - Use CI->cannotDuplicate()

llvm-svn: 193115
2013-10-21 20:07:30 +00:00
Matt Arsenault
fa64659bd8 Teach SimplifyCFG about address spaces
llvm-svn: 193104
2013-10-21 18:55:08 +00:00
Michael Gottesman
63c63ac21e Fix the predecessor removal logic in r193045.
Additionally some small comment/stylistic fixes are included as well.

llvm-svn: 193068
2013-10-21 05:20:11 +00:00
Michael Gottesman
c024f3258a Teach simplify-cfg how to correctly create covered lookup tables for switches on iN with N >= 3.
One optimization simplify-cfg performs is the converting of switches to
lookup tables if the switch has > 4 cases. This is done by:

1. Finding the max/min case value and calculating the switch case range.
2. Create a lookup table basic block.
3. Perform a check in the switch's BB to see if the input value is in
the switch's case range. If the input value satisfies said predicate
branch to the lookup table BB, otherwise branch to the switch's default
destination BB using the default value as the result.

The conditional check consists of subtracting the min case value of the
table from any input iN value and then ensuring that said value is
unsigned less than the size of the lookup table represented as an iN
value.

If the lookup table is a covered lookup table, the size of the table will be N
which is 0 as an iN value. Thus the comparison will be an `icmp ult` of an iN
value against 0 which is always false yielding the incorrect result.

This patch fixes this problem by recognizing if we have a covered lookup table
and if we do, unconditionally jumps to the lookup table BB since the covering
property of the lookup table implies no input values could not be handled by
said BB.

rdar://15268442

llvm-svn: 193045
2013-10-20 07:04:37 +00:00
Benjamin Kramer
8817cca5ce Provide basic type safety for array_pod_sort comparators.
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.

llvm-svn: 191175
2013-09-22 14:09:50 +00:00
Matt Arsenault
8227b9f69c Use type helper functions.
llvm-svn: 190113
2013-09-06 00:37:24 +00:00
Tom Stellard
aa664d9b92 Factor FlattenCFG out from SimplifyCFG
Patch by: Mei Ye

llvm-svn: 187764
2013-08-06 02:43:45 +00:00
Alexey Samsonov
9096968de5 Fix dereferencing end iterator in SimplifyCFG. Patch by Ye Mei.
llvm-svn: 187646
2013-08-02 08:06:43 +00:00
Rafael Espindola
caa776be91 Fix -Wdocumentation warnings.
llvm-svn: 187336
2013-07-28 23:43:28 +00:00
Tom Stellard
8b1e021e85 SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions
Merge consecutive if-regions if they contain identical statements.
Both transformations reduce number of branches.  The transformation
is guarded by a target-hook, and is currently enabled only for +R600,
but the correctness has been tested on X86 target using a variety of
CPU benchmarks.

Patch by: Mei Ye

llvm-svn: 187278
2013-07-27 00:01:07 +00:00
Craig Topper
b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Benjamin Kramer
371722288c SimplifyCFG: Teach switch generation some patterns that instcombine forms.
This allows us to create switches even if instcombine has munged two of the
incombing compares into one and some bit twiddling. This was motivated by enum
compares that are common in clang.

llvm-svn: 185632
2013-07-04 14:22:02 +00:00
Rafael Espindola
a5e536ab0e Second part of pr16069
The problem this time seems to be a thinko. We were assuming that in the CFG

A
| \
|  B
| /
C

speculating the basic block B would cause only the phi value for the B->C edge
to be speculated. That is not true, the phi's are semantically in the edges, so
if the A->B->C path is taken, any code needed for A->C is not executed and we
have to consider it too when deciding to speculate B.

llvm-svn: 183226
2013-06-04 14:11:59 +00:00
Hans Wennborg
5cf30be6e4 Typo: s/caes/cases/ in SimplifyCFG
llvm-svn: 183219
2013-06-04 11:22:30 +00:00
David Majnemer
c82f27af2a SimplifyCFG: Do not transform PHI to select if doing so would be unsafe
PR16069 is an interesting case where an incoming value to a PHI is a
trap value while also being a 'ConstantExpr'.

We do not consider this case when performing the 'HoistThenElseCodeToIf'
optimization.

Instead, make our modifications more conservative if we detect that we
cannot transform the PHI to a select.

llvm-svn: 183152
2013-06-03 20:43:12 +00:00
David Majnemer
8e7dd2f628 SimplifyCFG: Small cleanup, use ICmpInst::isEquality()
llvm-svn: 183151
2013-06-03 20:39:50 +00:00
David Majnemer
91142c485e SimplifyCFG: Fix typo in comment for ComputeSpeculationCost
llvm-svn: 183078
2013-06-01 19:43:23 +00:00
Benjamin Kramer
ad5c24f161 More symbols that should be static.
llvm-svn: 182590
2013-05-23 16:09:15 +00:00
Arnold Schwaighofer
474df6d3ed SimplifyCFG: If convert single conditional stores
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:

This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

 a[i] =
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

llvm-svn: 180731
2013-04-29 21:28:24 +00:00
Arnold Schwaighofer
6eb32b31bd Revert "SimplifyCFG: If convert single conditional stores"
There is the temptation to make this tranform dependent on target information as
it is not going to be beneficial on all (sub)targets. Therefore, we should
probably do this in MI Early-Ifconversion.

This reverts commit r179957. Original commit message:

"SimplifyCFG: If convert single conditional stores

This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

a[i] =
may-alias with a[i] load
if (cond)
    a[i] = Y
into an unconditional store.

a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up."

llvm-svn: 179980
2013-04-21 13:09:04 +00:00
Arnold Schwaighofer
3546ccf465 SimplifyCFG: If convert single conditional stores
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

 a[i] =
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up.

llvm-svn: 179957
2013-04-20 21:42:09 +00:00
Hans Wennborg
c9e1d99279 simplifycfg: Fix integer overflow converting switch into icmp.
If a switch instruction has a case for every possible value of its type,
with the same successor, SimplifyCFG would replace it with an icmp ult,
but the computation of the bound overflows in that case, which inverts
the test.

Patch by Jed Davis!

llvm-svn: 179587
2013-04-16 08:35:36 +00:00
Bill Wendling
9534d8885f Don't remove a landing pad if the invoke requires a table entry.
An invoke may require a table entry. For instance, when the function it calls
is expected to throw.
<rdar://problem/13360379>

llvm-svn: 176827
2013-03-11 20:53:00 +00:00
Andrew Trick
a0a5ca06b9 SimplifyCFG fix for volatile load/store.
Fixes rdar:13349374.

Volatile loads and stores need to be preserved even if the language
standard says they are undefined. "volatile" in this context means "get
out of the way compiler, let my platform handle it".

Additionally, this is the only way I know of with llvm to write to the
first page (when hardware allows) without dropping to assembly.

llvm-svn: 176599
2013-03-07 01:03:35 +00:00
Chandler Carruth
329b590e6e Re-revert r173342, without losing the compile time improvements, flat
out bug fixes, or functionality preserving refactorings.

llvm-svn: 173610
2013-01-27 06:42:03 +00:00
Chandler Carruth
ceff222dea Switch this code away from Value::isUsedInBasicBlock. That code either
loops over instructions in the basic block or the use-def list of the
value, neither of which are really efficient when repeatedly querying
about values in the same basic block.

What's more, we already know that the CondBB is small, and so we can do
a much more efficient test by counting the uses in CondBB, and seeing if
those account for all of the uses.

Finally, we shouldn't blanket fail on any such instruction, instead we
should conservatively assume that those instructions are part of the
cost.

Note that this actually fixes a bug in the pass because
isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my
next commit, but the fix for it would make this code suddenly take the
compile time hit I thought it already was taking, so I wanted to go
ahead and migrate this code to a faster & better pattern.

The bug in isUsedInBasicBlock was also causing other tests to test the
wrong thing entirely: for example we weren't actually disabling
speculation for floating point operations as intended (and tested), but
the test passed because we failed to speculate them due to the
isUsedInBasicBlock failure.

llvm-svn: 173417
2013-01-25 05:40:09 +00:00
Benjamin Kramer
1c4e323fdd Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed.
Original commit message:
Plug TTI into the speculation logic, giving it a real cost interface
that can be specialized by targets.

The goal here is not to be more aggressive, but to just be more accurate
with very obvious cases. There are instructions which are known to be
truly free and which were not being modeled as such in this code -- see
the regression test which is distilled from an inner loop of zlib.

Everywhere the TTI cost model is insufficiently conservative I've added
explicit checks with FIXME comments to go add proper modelling of these
cost factors.

If this causes regressions, the likely solution is to make TTI even more
conservative in its cost estimates, but test cases will help here.

llvm-svn: 173357
2013-01-24 16:44:25 +00:00
Chandler Carruth
321c6a7c50 Revert r173342 temporarily. It appears to cause a very late miscompile
of stage2 in a bootstrap. Still investigating....

llvm-svn: 173343
2013-01-24 13:24:24 +00:00
Chandler Carruth
5f4519309f Plug TTI into the speculation logic, giving it a real cost interface
that can be specialized by targets.

The goal here is not to be more aggressive, but to just be more accurate
with very obvious cases. There are instructions which are known to be
truly free and which were not being modeled as such in this code -- see
the regression test which is distilled from an inner loop of zlib.

Everywhere the TTI cost model is insufficiently conservative I've added
explicit checks with FIXME comments to go add proper modelling of these
cost factors.

If this causes regressions, the likely solution is to make TTI even more
conservative in its cost estimates, but test cases will help here.

llvm-svn: 173342
2013-01-24 12:39:29 +00:00
Chandler Carruth
01bffaad03 Address a large chunk of this FIXME by accumulating the cost for
unfolded constant expressions rather than checking each one
independently.

llvm-svn: 173341
2013-01-24 12:05:17 +00:00
Chandler Carruth
8a21005cca Switch the constant expression speculation cost evaluation away from
a cost fuction that seems both a bit ad-hoc and also poorly suited to
evaluating constant expressions.

Notably, it is missing any support for trivial expressions such as
'inttoptr'. I could fix this routine, but it isn't clear to me all of
the constraints its other users are operating under.

The core protection that seems relevant here is avoiding the formation
of a select instruction wich a further chain of select operations in
a constant expression operand. Just explicitly encode that constraint.

Also, update the comments and organization here to make it clear where
this needs to go -- this should be driven off of real cost measurements
which take into account the number of constants expressions and the
depth of the constant expression tree.

llvm-svn: 173340
2013-01-24 11:53:01 +00:00
Chandler Carruth
7481ca8ff5 Rephrase the speculating scan of the conditional BB to be phrased in
terms of cost rather than hoisting a single instruction.

This does *not* change the cost model! We still set the cost threshold
at 1 here, it's just that we track it by accumulating cost rather than
by storing an instruction.

The primary advantage is that we no longer leave no-op intrinsics in the
basic block. For example, this will now move both debug info intrinsics
and a single instruction, instead of only moving the instruction and
leaving a basic block with nothing bug debug info intrinsics in it, and
those intrinsics now no longer ordered correctly with the hoisted value.

Instead, we now splice the entire conditional basic block's instruction
sequence.

This also places the code for checking the safety of hoisting next to
the code computing the cost.

Currently, the only observable side-effect of this change is that debug
info intrinsics are no longer abandoned. I'm not sure how to craft
a test case for this, and my real goal was the refactoring, but I'll
talk to Dave or Eric about how to add a test case for this.

llvm-svn: 173339
2013-01-24 11:52:58 +00:00
Chandler Carruth
76aacbd874 Simplify the PHI node operand rewriting.
Previously, the code would scan the PHI nodes and build up a small
setvector of candidate value pairs in phi nodes to go and rewrite. Once
certain the rewrite could be performed, the code walks the set, and for
each one re-scans the entire PHI node list looking for nodes to rewrite
operands.

Instead, scan the PHI nodes once to check for hazards, and then scan it
a second time to rewrite the operands to selects. No set vector, and
a max of two scans.

The only downside is that we might form identical selects, but
instcombine or anything else should fold those easily, and it seems
unlikely to happen often.

llvm-svn: 173337
2013-01-24 10:40:51 +00:00
Chandler Carruth
e2a779f3a7 Give the basic block variables here names based on the if-then-end
structure being analyzed. No functionality changed.

llvm-svn: 173334
2013-01-24 09:59:39 +00:00
Chandler Carruth
1d20c02f55 Lift a cheap early exit test above loops and other complex early exit
tests. No need to pay the high cost when we're never going to do
anything.

No functionality changed.

llvm-svn: 173331
2013-01-24 08:22:40 +00:00
Chandler Carruth
8a4a16618f Spiff up the comment on this method, making the example a bit more
pretty in doxygen, adding some of the details actually present in
a classic example where this matters (a loop from gzip and many other
compression algorithms), and a cautionary note about the risks inherent
in the transform. This has come up on the mailing lists recently, and
I suspect folks reading this code could benefit from going and looking
at the MI pass that can really deal with these issues.

llvm-svn: 173329
2013-01-24 08:05:06 +00:00
Duncan Sands
5924545c0c Initialize the components of this class. Otherwise GCC thinks that Array may be
used uninitialized, since it fails to understand that Array is only used when
SingleValue is not, and outputs a warning.  It also seems generally safer given
that the constructor is non-trivial and has plenty of early exits.

llvm-svn: 173242
2013-01-23 09:09:50 +00:00
Chandler Carruth
0b4ef9cedc Make SimplifyCFG simply depend upon TargetTransformInfo and pass it
through as a reference rather than a pointer. There is always *some*
implementation of this available, so this simplifies code by not having
to test for whether it is available or not.

Further, it turns out there were piles of places where SimplifyCFG was
recursing and not passing down either TD or TTI. These are fixed to be
more pedantically consistent even though I don't have any particular
cases where it would matter.

llvm-svn: 171691
2013-01-07 03:53:25 +00:00
Chandler Carruth
d3e73556d6 Move TargetTransformInfo to live under the Analysis library. This no
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
2013-01-07 03:08:10 +00:00
Chandler Carruth
6db43e6ca3 Switch SimplifyCFG over to the TargetTransformInfo interface rather than
the ScalarTargetTransformInfo interface.

llvm-svn: 171617
2013-01-05 10:05:26 +00:00
Chandler Carruth
9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth
ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Chandler Carruth
d9ef81e133 Fix non-determinism introduced in r168970 and pointed out by Duncan.
We're iterating over a non-deterministically ordered container looking
for two saturating flags. To do this correctly, we have to saturate
both, and only stop looping if both saturate to their final value.
Otherwise, which flag we see first changes the result.

This is also a micro-optimization of the previous version as now we
don't go into the (possibly expensive) test logic once the first
violation of either constraint is detected.

llvm-svn: 168989
2012-11-30 09:34:29 +00:00
Chandler Carruth
77d433dafe Rearrange the comments, control flow, and variable names; no
functionality changed.

Evan's commit r168970 moved the code that the primary comment in this
function referred to to the other end of the function without moving the
comment, and there has been a steady creep of "boolean" logic in it that
is simpler if handled via early exit. That way each special case can
have its own comments. I've also made the variable name a bit more
explanatory than "AllFit". This is in preparation to fix the
non-deterministic output of this function.

llvm-svn: 168988
2012-11-30 09:26:25 +00:00
Evan Cheng
65df808f62 Fix logic to determine whether to turn a switch into a lookup table. When
the tables cannot fit in registers (i.e. bitmap), do not emit the table
if it's using an illegal type.

rdar://12779436

llvm-svn: 168970
2012-11-30 02:02:42 +00:00
Hans Wennborg
7b8af0ea05 SimplifyCFG: Don't assume non-null ScalarTargetTransformInfo.
Patch by Pekka Jääskeläinen!

llvm-svn: 168176
2012-11-16 18:22:08 +00:00
Andrew Trick
7656f6dbf7 misspell
llvm-svn: 168058
2012-11-15 18:40:31 +00:00