(i == 5334 || i == 5335)
to:
((i & -2) == 5334)
This transformation has some incorrect side conditions. Specifically, the
transformation is only applied when the right-hand side constant (5334 in
the example) is a power of two not equal and not equal to the negated mask.
These side conditions were added in r258904 to fix PR26323. The correct side
condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334].
It's a little bit hard to see why these transformations are correct and what
the side conditions ought to be. Here is a CVC3 program to verify them for
64-bit values:
ONE : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63);
x : BITVECTOR(64);
y : BITVECTOR(64);
z : BITVECTOR(64);
mask : BITVECTOR(64) = BVSHL(ONE, z);
QUERY( (y & ~mask = y) =>
((x & ~mask = y) <=> (x = y OR x = (y | mask)))
);
Please note that each pattern must be a dual implication (<--> or iff). One
directional implication can create spurious matches. If the implication is
only one-way, an unsatisfiable condition on the left side can imply a
satisfiable condition on the right side. Dual implication ensures that
satisfiable conditions are transformed to other satisfiable conditions and
unsatisfiable conditions are transformed to other unsatisfiable conditions.
Here is a concrete example of a unsatisfiable condition on the left
implying a satisfiable condition on the right:
mask = (1 << z)
(x & ~mask) == y --> (x == y || x == (y | mask))
Substituting y = 3, z = 0 yields:
(x & -2) == 3 --> (x == 3 || x == 2)
The version of this code before r258904 had no side-conditions and
incorrectly justified itself in comments through one-directional
implication.
Thanks to Chandler for the suggestion!
Author: Thomas Jablin (tjablin)
Reviewers: chandlerc majnemer hfinkel cycheng
http://reviews.llvm.org/D21417
llvm-svn: 272873
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709
A basic block could contain:
%cp = cleanuppad []
cleanupret from %cp unwind to caller
This basic block is empty and is thus a candidate for removal. However,
there can be other uses of %cp outside of this basic block. This is
only possible in unreachable blocks.
Make our transform more correct by checking that the pad has a single
user before removing the BB.
This fixes PR28005.
llvm-svn: 271816
A cleanuppad is not cheap, they turn into many instructions and result
in additional spills and fills. It is not worth keeping a cleanuppad
around if all it does is hold a lifetime.end instruction.
N.B. We first try to merge the cleanuppad with another cleanuppad to
avoid dropping the lifetime and debug info markers.
llvm-svn: 270314
Summary: Set default branch weight to 1:1 if one of the branch has profile missing when simplifying CFG.
Reviewers: spatel, davidxl
Subscribers: danielcdh, llvm-commits
Differential Revision: http://reviews.llvm.org/D20307
llvm-svn: 269995
Summary: In sample profile, some branches may have profile missing due to profile inaccuracy. We want existing branch probability still valid after propagation.
Reviewers: hfinkel, davidxl, spatel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19948
llvm-svn: 269137
Retrying r268550/r268751 which were reverted at r268577/r268765 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken another guess at fixing
the problem in this version of the patch and will watch for another failure.
Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.
Differential Revision: http://reviews.llvm.org/D19674
llvm-svn: 268767
Retrying r268550 which was reverted at r268577 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken a guess at fixing
the problem in this version of the patch and will watch for another failure.
Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.
Differential Revision: http://reviews.llvm.org/D19674
llvm-svn: 268751
MemorySanitizer: use-of-uninitialized-value
0x4910e47 in count /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:159:12
0x4910e47 in countLeadingZeros<unsigned long> /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:183
0x4910e47 in FitWeights /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:855
0x4910e47 in SimplifyCondBranchToCondBranch /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:2895
This reverts commit 609f4dd4bf3bc735c8c047a4d4b0a8e9e4d202e2.
llvm-svn: 268577
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.
Differential Revision: http://reviews.llvm.org/D19674
llvm-svn: 268550
This patch fixes PR27615.
@llvm.dbg.value instructions no longer count towards the maximum number of
instructions to look back at in the instruction list when searching for a
store instruction. This should make the output consistent between debug and
non-debug build.
Patch by Henric Karlsson <henric.karlsson@ericsson.com>!
Differential Revision: http://reviews.llvm.org/D19912
llvm-svn: 268512
Make it possible that TryToSimplifyUncondBranchFromEmptyBlock merges empty
basic block including lifetime intrinsics as well as phi nodes and
unconditional branch into its successor or predecessor(s).
If successor of empty block has single predecessor, all contents including
lifetime intrinsics are sinked into the successor. Otherwise, they are
hoisted into its predecessor(s) and then merged into the predecessor(s).
Patch by Josh Yoon <josh.yoon@samsung.com>!
Differential Revision: http://reviews.llvm.org/D19257
llvm-svn: 268254
There's no existing test for this path, and I don't know how to expose
it in a regression test, but I'm assuming there's some reason this
path exists.
llvm-svn: 267813
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.
llvm-svn: 267515
This patch improves SimplifyCFG to catch cases like:
if (a < b) {
if (a > b) <- known to be false
unreachable;
}
Phabricator Revision: http://reviews.llvm.org/D18905
llvm-svn: 266767
This is almost identical to:
http://reviews.llvm.org/rL264527
This doesn't solve PR27344; it just allows the profile weights to survive.
To solve the bug, we need to use the profile weights in the backend.
llvm-svn: 266442
Clarify what this RemapFlag actually means.
- Change the flag name to match its intended behaviour.
- Clearly document that it's not supposed to affect globals.
- Add a host of FIXMEs to indicate how to fix the behaviour to match
the intent of the flag.
RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock. Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).
When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.
This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.
llvm-svn: 265628
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.
llvm-svn: 264697
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.
llvm-svn: 264596
This is similar to D18133 where we allowed profile weights on select instructions.
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.
A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648
Differential Revision: http://reviews.llvm.org/D18220
llvm-svn: 263716
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636
This doesn't accomplish anything on its own. It's the first step towards preserving
and using branch weights with selects.
The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to
determine how to correctly propagate the weights.
Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.
The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.
Differential Revision: http://reviews.llvm.org/D18133
llvm-svn: 263482
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date: Fri Mar 11 17:15:50 2016 +0000
Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.
Reviewers: chandlerc
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18023
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
91177308-0d34-0410-b5e6-96231b3b80d8
until we can figure out what to do about clang and Release build testing.
This reverts commit 263258.
llvm-svn: 263321
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.
Reviewers: chandlerc
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18023
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
The cleanupret instruction has an invariant that it's 'from' operand be
a cleanuppad. This invariant was violated when we removed a dead block
which removed a cleanuppad leaving behind a cleanupret with an undef
'from' operand.
This was solved in r261731 by staving off the removal of the dead block
to a later pass.
However, it occured to me that we do not need to do this.
Instead, we can simply avoid processing the cleanupret if it has an
undef 'from' operand because we know that it will be removed soon.
llvm-svn: 261754
DeleteDeadBlock was called indiscriminately, leading to cleanuprets with
undef cleanuppad references.
Instead, try to drain the BB of most of it's instructions if it is
unreachable. We can then remove the BB if it solely consists of a
terminator (and maybe some phis).
llvm-svn: 261731
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.
Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators. (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)
There should be NFC here.
llvm-svn: 261498
Cleanuppads may be merged together if one is the only predecessor of the
other in which case a simple transform can be performed: replace the
a cleanupret with a branch and remove an unnecessary cleanuppad.
Differential Revision: http://reviews.llvm.org/D17459
llvm-svn: 261390
Summary:
Performing this optimization duplicates the call to the convergent
function and adds new control-flow dependencies, which is a no-no.
Reviewers: jingyue
Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17128
llvm-svn: 260730
D16251)
Summary:
This is a simpler fix to the problem than the dominator approach in
http://reviews.llvm.org/D16251. It adds only values into the gather() while loop
that have been seen before.
The actual endless loop is in the constant compare gather() routine in
Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the
queue:
%.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
Here is what happens at the IR level:
for.cond.i: ; preds = %if.end6.i,
%if.end.i54
%ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ]
%ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<<
%cmp2.i = icmp ult i32 %ix.0.i, %11
br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit
if.end6.i: ; preds = %for.body.i
%cmp10.i = icmp ugt i32 %conv.i, %add9.i
%.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<<
When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i.
The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
(Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i).
And
now there is use of .ret.0.off0.i before a definition which triggers the
“endless” loop in gather():
while(!DFT.empty()) {
V = DFT.pop_back_val(); // V is .ret.0.off0.i
if (Instruction *I = dyn_cast<Instruction>(V)) {
// If it is a || (or && depending on isEQ), process the operands.
if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) {
DFT.push_back(I->getOperand(1)); // This is now .ret.0.off0.i also
DFT.push_back(I->getOperand(0));
continue; // “endless loop” for .ret.0.off0.i
}
Reviewers: reames, ahatanak
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16839
llvm-svn: 259730
This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=26308
With the switch to using the TTI cost model in:
http://reviews.llvm.org/rL228826
...it became possible to hit a zero-cost cycle of instructions (gep -> phi -> gep...),
so we need a cap for the recursion in DominatesMergePoint().
A recursion depth parameter was already added for a different reason in:
http://reviews.llvm.org/rL255660
...so we can just set a limit for it.
I pulled "10" out of the air and made it an independent parameter that we can play with.
It might be higher than it needs to be given the currently low default value of
PHINodeFoldingThreshold (2). That's the starting cost value that we enter the recursion
with, and most instructions have cost set to TCC_Basic (1), so I don't think we're going
to speculate more than 2 instructions with the current parameters.
As noted in the review and the TODO comment, we can do better than just limiting recursion
depth.
Differential Revision: http://reviews.llvm.org/D16637
llvm-svn: 258971