17 Commits

Author SHA1 Message Date
Kyle Butt
2facd194a2 Revert "Codegen: Tail-duplicate during placement."
This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4.

llvm-svn: 283647
2016-10-08 01:47:05 +00:00
Kyle Butt
37e676d857 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283619
2016-10-07 22:33:20 +00:00
Kyle Butt
25ac35d822 Revert "Codegen: Tail-duplicate during placement."
This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc.

Issue with loop info and block removal revealed by polly.
I have a fix for this issue already in another patch, I'll re-roll this
together with that fix, and a test case.

llvm-svn: 283292
2016-10-05 01:39:29 +00:00
Kyle Butt
adabac2d57 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well.

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283274
2016-10-04 23:54:18 +00:00
Kyle Butt
3ffb8529bc Revert "Codegen: Tail-duplicate during placement."
This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2.

Causing crashes on aarch64 build.

llvm-svn: 283172
2016-10-04 00:38:23 +00:00
Kyle Butt
396bfdd707 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

llvm-svn: 283164
2016-10-04 00:00:09 +00:00
Dan Gohman
b7c2400fa7 [WebAssembly] Optimize away return instructions using fallthroughs.
This saves a small amount of code size, and is a first small step toward
passing values on the stack across block boundaries.

Differential Review: http://reviews.llvm.org/D20450

llvm-svn: 270294
2016-05-21 00:21:56 +00:00
Dan Gohman
7100809080 [WebAssembly] Rename $discard to $drop in the assembly output.
llvm-svn: 269862
2016-05-17 23:19:03 +00:00
Dan Gohman
0cfb5f852d [WebAssembly] Move register stackification and coloring to a late phase.
Move the register stackification and coloring passes to run very late, after
PEI, tail duplication, and most other passes. This means that all code emitted
and expanded by those passes is now exposed to these passes. This also
eliminates the need for prologue/epilogue code to be manually stackified,
which significantly simplifies the code.

This does require running LiveIntervals a second time. It's useful to think
of these late passes not as late optimization passes, but as a domain-specific
compression algorithm based on knowledge of liveness information. It's used to
compress the code after all conventional optimizations are complete, which is
why it uses LiveIntervals at a phase when actual optimization passes don't
typically need it.

Differential Revision: http://reviews.llvm.org/D20075

llvm-svn: 269012
2016-05-10 04:24:02 +00:00
Derek Schuff
d4207ba0f6 [WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback
Summary:
MRI::eliminateFrameIndex can emit several instructions to do address
calculations; these can usually be stackified. Because instructions with
FI operands can have subsequent operands which may be expression trees,
find the top of the leftmost tree and insert the code before it, to keep
the LIFO property.

Also use stackified registers when writing back the SP value to memory
in the epilog; it's unnecessary because SP will not be used after the
epilog, and it results in better code.

Differential Revision: http://reviews.llvm.org/D18234

llvm-svn: 263725
2016-03-17 17:00:29 +00:00
Derek Schuff
f9c0a5c377 Revert "[WebAssembly] Stackify code emitted by eliminateFrameIndex"
This reverts r261685 due to wasm test breakage.

llvm-svn: 261702
2016-02-23 22:13:21 +00:00
Derek Schuff
b21570cc1d [WebAssembly] Stackify code emitted by eliminateFrameIndex
llvm-svn: 261685
2016-02-23 21:25:17 +00:00
Derek Schuff
dc5f6aa4bb [WebAssembly] Stackify function prologs and epilogs
The instructions are the same, but fewer locals are used.

Differential Revision: http://reviews.llvm.org/D17428

llvm-svn: 261452
2016-02-20 21:46:50 +00:00
Dan Gohman
ece881d518 [WebAssembly] Add a test for the mem-intrinsic code in WebAssemblyPeephole.cpp
llvm-svn: 258895
2016-01-27 01:37:52 +00:00
Derek Schuff
90d9e8d370 [WebAssembly] Omit no-op adds for non-mem uses of FrameIndex
Differential Revision: http://reviews.llvm.org/D16554

llvm-svn: 258872
2016-01-26 22:47:43 +00:00
JF Bastien
1a6c7608b1 WebAssembly: don't optimize memcpy/memmove/memcpy to frame index
r258781 optimized memcpy/memmove/memcpy so the intrinsic call can return its first argument, but missed the frame index case. Teach it to ignore that case so C code doesn't assert out in these cases.

llvm-svn: 258851
2016-01-26 20:22:42 +00:00
Dan Gohman
bdf08d5da6 [WebAssembly] Optimize memcpy/memmove/memcpy calls.
These calls return their first argument, but because LLVM uses an intrinsic
with a void return type, they can't use the returned attribute. Generalize
the store results pass to optimize these calls too.

llvm-svn: 258781
2016-01-26 04:01:11 +00:00