Benjamin Kramer
f0e5d2f032
LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations.
...
For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
movdqa %xmm0, %xmm1
movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1]
paddw %xmm0, %xmm1
pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0]
paddw %xmm1, %xmm0
phaddw %xmm0, %xmm0
pextrb $0, %xmm0, %edx
instead of
pextrb $2, %xmm0, %esi
pextrb $0, %xmm0, %edx
addb %sil, %dl
pextrb $4, %xmm0, %esi
addb %dl, %sil
pextrb $6, %xmm0, %edx
addb %sil, %dl
pextrb $8, %xmm0, %esi
addb %dl, %sil
pextrb $10, %xmm0, %edi
pextrb $14, %xmm0, %edx
addb %sil, %dil
pextrb $12, %xmm0, %esi
addb %dil, %sil
addb %sil, %dl
llvm-svn: 170439
2012-12-18 18:40:20 +00:00
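The log2(vectorsize) reduction described in r170439 above can be modeled in plain C++. This is a scalar sketch of the idea only, not the vectorizer's code: `reduceAddLog2` and `reduceAddScalar` are hypothetical names, and each pass of the outer loop stands in for one shuffle plus one vector add.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of a log2(N) shuffle + add reduction.
// Each outer iteration stands in for one shuffle (bring the upper half of the
// live lanes down) followed by one lane-wise vector add.
std::uint8_t reduceAddLog2(std::vector<std::uint8_t> lanes) {
  const std::size_t n = lanes.size();
  assert(n != 0 && (n & (n - 1)) == 0 && "lane count must be a power of two");
  for (std::size_t half = n / 2; half != 0; half /= 2) {
    for (std::size_t i = 0; i < half; ++i)
      lanes[i] = static_cast<std::uint8_t>(lanes[i] + lanes[i + half]);
  }
  return lanes[0];  // a single extract at the very end
}

// Reference: the chain of dependent scalar adds that the reduction replaces.
std::uint8_t reduceAddScalar(const std::vector<std::uint8_t>& lanes) {
  std::uint8_t sum = 0;
  for (std::uint8_t x : lanes)
    sum = static_cast<std::uint8_t>(sum + x);
  return sum;
}
```

For eight lanes this takes three shuffle/add steps instead of seven dependent scalar adds. Both functions agree because 8-bit addition with wraparound is associative and commutative.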
Nadav Rotem
e5e28b48c8
Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today.
...
llvm-svn: 170157
2012-12-13 23:11:54 +00:00
Nadav Rotem
36510f7194
Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
...
llvm-svn: 170051
2012-12-13 00:21:03 +00:00
Nadav Rotem
6027bdf898
Fix indentation.
...
llvm-svn: 170005
2012-12-12 19:39:36 +00:00
Nadav Rotem
d0bb22bba3
LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
...
llvm-svn: 170004
2012-12-12 19:29:45 +00:00
Nadav Rotem
6798a04b15
Fix the ascii drawing that was ruined when I split the H and CPP
...
llvm-svn: 169955
2012-12-12 01:33:47 +00:00
Nadav Rotem
4fa2e3d5af
fix a typo.
...
llvm-svn: 169953
2012-12-12 01:31:10 +00:00
Nadav Rotem
aeb17df802
LoopVectorizer: When -Os is used, vectorize only loops that don't require a tail loop. There is no testcase because I don't know of a way to initialize the loop vectorizer pass without adding an additional hidden flag.
...
llvm-svn: 169950
2012-12-12 01:11:46 +00:00
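The -Os rule in r169950 above can be written as a small predicate. This is a hedged sketch: `needsTailLoop` and `allowVectorization` are hypothetical helpers, and it assumes a known trip count and a fixed vectorization factor, neither of which the commit message spells out.

```cpp
#include <cassert>
#include <cstddef>

// A scalar tail (remainder) loop is needed when the trip count is not a
// multiple of the vectorization factor.
bool needsTailLoop(std::size_t tripCount, std::size_t vf) {
  assert(vf != 0 && "vectorization factor must be non-zero");
  return tripCount % vf != 0;
}

// Assumed policy: when optimizing for size, refuse loops that would need the
// extra tail loop, since that duplicates the loop body.
bool allowVectorization(std::size_t tripCount, std::size_t vf, bool optForSize) {
  return !optForSize || !needsTailLoop(tripCount, vf);
}
```

With a factor of 4, a 16-iteration loop would still be vectorized under this policy, while an 18-iteration loop would be left scalar when optimizing for size.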
Nadav Rotem
f707bf4ca3
PR14574. Fix a bug in the code that calculates the mask for the converted PHIs in if-conversion.
...
llvm-svn: 169916
2012-12-11 21:30:14 +00:00
Nadav Rotem
e266efb70b
Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars.
...
llvm-svn: 169904
2012-12-11 18:58:10 +00:00
Nadav Rotem
dbb3328194
Fix PR14565. Don't if-convert loops that have switch statements in them.
...
llvm-svn: 169813
2012-12-11 04:55:10 +00:00
Nadav Rotem
07df5ac1a1
Split the LoopVectorizer into H and CPP.
...
llvm-svn: 169771
2012-12-10 21:39:02 +00:00
Nadav Rotem
7b5b55c195
Add support for reverse induction variables. For example:
  while (i--)
    sum += A[i];
llvm-svn: 169752
2012-12-10 19:25:06 +00:00
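A reverse induction loop like the one in r169752 above can be processed in blocks that walk down from the top of the array. The following is a scalar sketch under assumptions, not the vectorizer's output: the function names and the block width `VF = 4` are hypothetical, and the per-lane partial sums stand in for a vector accumulator.

```cpp
#include <cassert>
#include <cstddef>

// The original scalar loop, counting down from n - 1 to 0.
int sumReverseScalar(const int* A, std::size_t n) {
  int sum = 0;
  std::size_t i = n;
  while (i--)
    sum += A[i];
  return sum;
}

// Blocked model: each pass consumes VF elements at once, in reversed lane
// order, then a scalar tail handles the leftover low indices.
int sumReverseBlocked(const int* A, std::size_t n) {
  assert(A != nullptr || n == 0);
  constexpr std::size_t VF = 4;  // assumed vectorization factor
  int partial[VF] = {0, 0, 0, 0};
  std::size_t i = n;
  while (i >= VF) {
    i -= VF;
    for (std::size_t lane = 0; lane < VF; ++lane)
      partial[lane] += A[i + (VF - 1 - lane)];  // reversed "vector load"
  }
  int sum = partial[0] + partial[1] + partial[2] + partial[3];
  while (i--)  // scalar tail
    sum += A[i];
  return sum;
}
```

The two functions return the same value for integer inputs that do not overflow, because reordering integer additions does not change the result.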
Paul Redmond
2adb13c100
LoopVectorize: support vectorizing intrinsic calls
...
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.
Reviewed by: Nadav
llvm-svn: 169711
2012-12-09 20:42:17 +00:00
Paul Redmond
f7cd6b391a
test commit.
...
llvm-svn: 169709
2012-12-09 19:46:31 +00:00
Nadav Rotem
a8f026e2d4
LoopVectorizer: Increase the number of pointers that can be tested at runtime. If we can't prove statically that the pointers are disjoint then we add the runtime check.
...
llvm-svn: 169334
2012-12-04 23:25:24 +00:00
Nadav Rotem
87fc988c5d
Enable if-conversion during vectorization.
...
llvm-svn: 169331
2012-12-04 22:59:52 +00:00
Nadav Rotem
93fa5ef957
Fix a bug in vectorization of if-converted reduction variables. If the
reduction variable is not used outside the loop then we ran into an
endless loop. This change checks if we found the original PHI.
llvm-svn: 169324
2012-12-04 22:40:22 +00:00
Nadav Rotem
a10b311aec
Add support for reduction variables when IF-conversion is enabled.
...
llvm-svn: 169288
2012-12-04 18:17:33 +00:00
Nadav Rotem
07674cb566
Give scalar if-converted blocks half the score because they are not always executed due to control flow.
...
llvm-svn: 169223
2012-12-04 07:11:52 +00:00
Nadav Rotem
628c2dba60
Add the last part that is needed for vectorization of if-converted code.
...
Added the code that actually performs the if-conversion during vectorization.
We can now vectorize this code:
for (int i=0; i<n; ++i) {
  unsigned k = 0;
  if (a[i] > b[i])   // <------ IF inside the loop.
    k = k * 5 + 3;
  a[i] = k;          // <---- K is a phi node that becomes vector-select.
}
llvm-svn: 169217
2012-12-04 06:15:11 +00:00
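The effect of if-conversion on the loop from r169217 above can be shown at the source level. This is an illustrative sketch, not the IR the vectorizer emits: `updateBranchy` and `updateIfConverted` are hypothetical names, and the ternary stands in for a per-lane select.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The loop as written in the commit message: a branch inside the body.
void updateBranchy(std::vector<unsigned>& a, const std::vector<unsigned>& b) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    unsigned k = 0;
    if (a[i] > b[i])
      k = k * 5 + 3;
    a[i] = k;
  }
}

// If-converted form: both candidate values are computed unconditionally and
// the comparison result picks one of them, so the body is straight-line code.
void updateIfConverted(std::vector<unsigned>& a, const std::vector<unsigned>& b) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    const unsigned kElse = 0;
    const unsigned kThen = kElse * 5 + 3;
    const bool mask = a[i] > b[i];
    a[i] = mask ? kThen : kElse;  // models the vector-select
  }
}
```

Once the body has no control flow, every iteration does the same work, which is what allows several iterations to be packed into vector lanes.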
NAKAMURA Takumi
f99b535fdb
LoopVectorize.cpp: Suppress a warning. [-Wunused-variable]
...
llvm-svn: 169195
2012-12-04 00:49:34 +00:00
NAKAMURA Takumi
8b07bc579b
Fix whitespace.
...
llvm-svn: 169194
2012-12-04 00:49:28 +00:00
Nadav Rotem
d479a57f68
minor renaming, documentation and cleanups.
...
llvm-svn: 169175
2012-12-03 22:57:09 +00:00
Nadav Rotem
fad16be973
IF-conversion: teach the cost-model how to grade if-converted loops.
...
llvm-svn: 169171
2012-12-03 22:46:31 +00:00
Nadav Rotem
eee203d885
Now that we have a basic if-conversion infrastructure we can rename the
"single basic block loop vectorizer" to "innermost loop vectorizer".
llvm-svn: 169158
2012-12-03 21:33:08 +00:00
Nadav Rotem
a30aba7a01
Add initial support for IF-conversion. This patch implements the first 1/3,
which is the legality of the if-conversion transformation. The next step is to
implement the cost-model for the if-converted code as well as the
vectorization itself.
llvm-svn: 169152
2012-12-03 21:06:35 +00:00
Chandler Carruth
ed0881b2a6
Use the new script to sort the includes of every file under lib.
...
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Nadav Rotem
3ae24ee08a
minor cleanups
...
llvm-svn: 169048
2012-11-30 22:37:11 +00:00
Nadav Rotem
6b494be886
Remove the use of LPPassManager. We can remove LPM because we don't need to run any additional loop passes on the new vector loop.
...
llvm-svn: 169016
2012-11-30 17:27:53 +00:00
Nadav Rotem
8dd6ee8df5
When broadcasting invariant scalars into vectors, place the broadcast code in the preheader.
...
llvm-svn: 168927
2012-11-29 19:25:41 +00:00
Nadav Rotem
caf5acfd14
Move the code that uses SCEVs prior to creating the new loops.
...
llvm-svn: 168601
2012-11-26 19:51:46 +00:00
Nadav Rotem
ee7ede76f4
Move the max vector width to a constant parameter. No functionality change.
...
llvm-svn: 168570
2012-11-25 16:48:08 +00:00
Nadav Rotem
ef33b5076c
Fix the document style.
...
llvm-svn: 168569
2012-11-25 16:39:01 +00:00
Nadav Rotem
12192f19eb
Refactor the ptr runtime check generation code. No functionality change.
...
llvm-svn: 168568
2012-11-25 16:27:16 +00:00
Nadav Rotem
b15d9fe24d
Rename method. No functionality change.
...
llvm-svn: 168560
2012-11-25 09:13:57 +00:00
Nadav Rotem
bf5173460f
The induction-pointer work is inspired by a research paper. This commit adds a reference.
...
llvm-svn: 168559
2012-11-25 09:09:26 +00:00
Nadav Rotem
ea3824f160
Add support for pointer induction variables even when there is no integer induction variable.
...
llvm-svn: 168558
2012-11-25 08:41:35 +00:00
Nadav Rotem
c3c07e62e8
LoopVectorizer: Add initial support for pointer induction variables (for example: *dst++ = *src++).
...
At the moment we still require an integer induction variable (for example: i++).
llvm-svn: 168231
2012-11-17 00:27:03 +00:00
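One way to read the pointer induction case from r168231 above is that each bumped pointer equals its start value plus the iteration count. The sketch below illustrates that equivalence only. It is an assumption about how to think of the loop, not a description of the pass internals, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Pointer induction form: both pointers advance by one element per iteration.
void copyPointerBump(int* dst, const int* src, std::size_t n) {
  assert((dst != nullptr && src != nullptr) || n == 0);
  while (n--)
    *dst++ = *src++;
}

// Equivalent integer induction form: the pointers are start + i, which makes
// the consecutive memory accesses explicit.
void copyIndexed(int* dst, const int* src, std::size_t n) {
  assert((dst != nullptr && src != nullptr) || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}
```

Both loops write the same elements in the same order, so a transformation that handles the indexed form can in principle handle the pointer form once the pointer is expressed relative to its start value.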
Nadav Rotem
0565b5a279
LoopVectorize: Division reductions generate incorrect code. Remove the part of the code that deals with divs.
...
Thanks to Paul Redmond for catching this while reviewing the code.
llvm-svn: 168142
2012-11-16 06:51:17 +00:00
Nadav Rotem
a43bcddc8d
use the getSplat API. Patch by Paul Redmond.
...
llvm-svn: 167892
2012-11-14 00:02:13 +00:00
Nadav Rotem
12930749ab
Fix a comment typo and add comments.
...
llvm-svn: 167684
2012-11-11 05:15:00 +00:00
Nadav Rotem
1cfef3e9ee
Add support for memory runtime check. When we can, we calculate array bounds.
...
If the arrays are found to be disjoint then we run the vectorized version of
the loop. If they are not, we run the scalar code.
llvm-svn: 167608
2012-11-09 07:09:44 +00:00
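The runtime memory check from r167608 above amounts to comparing array bounds before choosing a loop version. The following sketch assumes two arrays of equal length and a block width of 4. `rangesDisjoint`, `addOne` and `LoopPath` are hypothetical names, and the blocked loop only models what a vector loop would do.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

enum class LoopPath { Vector, Scalar };

// True when [a, a + n) and [b, b + n) do not overlap. std::less_equal is used
// because it provides a total order over pointers into unrelated arrays.
bool rangesDisjoint(const int* a, const int* b, std::size_t n) {
  const std::less_equal<const int*> le{};
  return le(a + n, b) || le(b + n, a);
}

// Computes dst[i] = src[i] + 1 and reports which version of the loop ran.
LoopPath addOne(int* dst, const int* src, std::size_t n) {
  assert((dst != nullptr && src != nullptr) || n == 0);
  if (rangesDisjoint(dst, src, n)) {
    constexpr std::size_t VF = 4;  // assumed vectorization factor
    std::size_t i = 0;
    for (; i + VF <= n; i += VF) {
      int block[VF];
      for (std::size_t lane = 0; lane < VF; ++lane)
        block[lane] = src[i + lane] + 1;  // models a vector load + add
      for (std::size_t lane = 0; lane < VF; ++lane)
        dst[i + lane] = block[lane];      // models a vector store
    }
    for (; i < n; ++i)  // scalar tail
      dst[i] = src[i] + 1;
    return LoopPath::Vector;
  }
  for (std::size_t i = 0; i < n; ++i)  // original scalar loop
    dst[i] = src[i] + 1;
  return LoopPath::Scalar;
}
```

The fallback matters when the ranges overlap: with `dst == src + 1`, each iteration reads a value written by the previous one, so loading a whole block before storing it would change the result.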
Chandler Carruth
acc748b2b5
Fix sign compare warning. Patch by Mahesha HS.
...
llvm-svn: 167282
2012-11-02 05:24:00 +00:00
Nadav Rotem
4cb8cdab5e
LoopVectorize: Preserve NSW, NUW and IsExact flags.
...
llvm-svn: 167174
2012-10-31 21:40:39 +00:00
Nadav Rotem
ec3ab49dda
Put the threshold magic number in a variable.
...
llvm-svn: 167134
2012-10-31 16:22:16 +00:00
Nadav Rotem
1265ea8f8d
Remove enum values since they are not used anymore.
...
llvm-svn: 167131
2012-10-31 16:14:06 +00:00
Nadav Rotem
ce77ab0c24
LoopVectorize: Do not vectorize loops with tiny constant trip counts.
...
llvm-svn: 167101
2012-10-31 03:31:07 +00:00
Nadav Rotem
ff7889196b
Add support for loops that don't start with Zero.
...
This is important for loops in the LAPACK test-suite.
These loops start at 1 because they are auto-converted from Fortran.
llvm-svn: 167084
2012-10-31 00:45:26 +00:00
Nadav Rotem
47a299dcc9
Add documentation.
...
llvm-svn: 167055
2012-10-30 22:06:26 +00:00