This makes unit tests compatible with the assertion added in
https://github.com/llvm/llvm-project/pull/106524, by setting the
isSigned flag to the correct value or changing how the value is
constructed.
Inside computeConstantDifference(), handle the case where both sides are
of the form `C * %x`, in which case we can strip off the common
multiplication (as long as we remember to multiply by it for the
following difference calculation).
There is an obvious alternative implementation here, which would be to
directly decompose multiplies inside the "Multiplicity" accumulation.
This does work, but I've found this to be both significantly slower
(because everything has to work on APInt) and more complex in
implementation (e.g. because we now need to match back the new More/Less
with an arbitrary factor) without providing more power in practice. As
such, I went for the simpler variant here.
This is the last step to make computeConstantDifference() sufficiently
powerful to replace existing uses of
`cast<SCEVConstant>(getMinusSCEV())` with it.
computeConstantDifference() can currently look through addrecs with
identical steps, and then through adds with identical operands (apart
from constants).
However, it fails to handle minor variations, such as two nested add
recs, or an outer add with an inner addrec (rather than the other way
around).
This patch supports these cases by adding a loop over the
simplifications, limited to a small number of iterations. The motivation
is the same as in #101339, to make
computeConstantDifference() powerful enough to replace existing uses of
`dyn_cast<SCEVConstant>(getMinusSCEV())` with it. Though as the IR test
diff shows, other callers may also benefit.
These are the final few places in LLVM that use instruction pointers to
insert instructions -- use iterators instead, which is needed for
debug-info correctness in the future. Most of this is a gentle
scattering of getIterator calls or not deref-then-addrofing iterators.
libfuzzer does require a storage change to keep built instruction
positions in a container though. The unit-test changes are very
straightforwards.
This leaves us in a position where libfuzzer can't fuzz on either of
debug-info records, however I don't believe that fuzzing of debug-info
is in scope for the library.
Currently it only deals with the case where we're subtracting adds with
at most one non-constant operand. This patch extends it to cancel out
common operands for the subtraction of arbitrary add expressions.
The background here is that I want to replace a getMinusSCEV() call in
LAA with computeConstantDifference():
93fecc2577/llvm/lib/Analysis/LoopAccessAnalysis.cpp (L1602-L1603)
This particular call is very expensive in some cases (e.g. lencod with
LTO) and computeConstantDifference() could achieve this much more
cheaply, because it does not need to construct new SCEV expressions.
However, the current computeConstantDifference() implementation is too
weak for this and misses many basic cases. This is a step towards making
it more powerful while still keeping it pretty fast.
The cache triggers almost never, and seems unlikely to help with
performance. However, when it does, it is likely to cause the comparator
to become inconsistent due to a bad interaction of the depth limit and
cache hits. This leads to crashes in debug builds. See the new unit test
for a reproducer.
The use-def walk in forgetValue() was skipping instructions with
non-SCEVable types. However, SCEV may look past with.overflow
intrinsics returning aggregates.
Fixes#97586.
applyLoopGuards doesn't always preserve information when there are multiple assumes.
This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:
1. __builtin_assume(TC % 8 == 0);
2. __builtin_assume(TC > 0);
3. __builtin_assume(TC < 100);
Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)
Looking at this SCEV, it doesn't preserve the divisibility by 8 information.
After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)
By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.
Differential Revision: https://reviews.llvm.org/D144947
The new function has been added to SCEV that allows to raise the number 2
to the desired power.
Authored-by: Dmitry Bakunevich
Differential Revision: https://reviews.llvm.org/D144381
applyLoopGuards doesn't always preserve information when there are multiple assumes.
This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:
1. __builtin_assume(TC % 8 == 0);
2. __builtin_assume(TC > 0);
3. __builtin_assume(TC < 100);
Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)
Looking at this SCEV, it doesn't preserve the divisibility by 8 information.
After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)
By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.
Differential Revision: https://reviews.llvm.org/D141850
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This is the last major stepping stone before being able to allocate the node via the folding set allocator. That will in turn allow more general SCEV predicate expression trees.
Data references in a loop should not access elements over the
statically allocated size. So we can infer a loop max trip count
from this undefined behavior.
Reviewed By: reames, mkazantsev, nikic
Differential Revision: https://reviews.llvm.org/D109821
The ceiling variant was recently added (due to the work towards D105216), and we're spending a lot of time trying to find optimizations for the expression. This patch brute forces the space of i8 unsigned divides and checks that we get a correct (well consistent with APInt) result for both udiv and udiv ceiling.
(This is basically what I've been doing locally in a hand rolled C++ program, and I realized there no good reason not to check it in as a unit test which directly exercises the logic on constants.)
Differential Revision: https://reviews.llvm.org/D106083
Rules:
1. SCEVUnknown is a pointer if and only if the LLVM IR value is a
pointer.
2. SCEVPtrToInt is never a pointer.
3. If any other SCEV expression has no pointer operands, the result is
an integer.
4. If a SCEVAddExpr has exactly one pointer operand, the result is a
pointer.
5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other
pointer operands, the result is a pointer.
6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a
pointer.
7. Otherwise, the SCEV expression is invalid.
I'm not sure how useful rule 6 is in practice. If we exclude it, we can
guarantee that ScalarEvolution::getPointerBase always returns a
SCEVUnknown, which might be a helpful property. Anyway, I'll leave that
for a followup.
This is basically mop-up at this point; all the changes with significant
functional effects have landed. Some of the remaining changes could be
split off, but I don't see much point.
Differential Revision: https://reviews.llvm.org/D105510
Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr
operands. As a consequence, we always drop flags when building
expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`.
We should be able to retain NUW flags common among all inlined
SCEVAddExpr and the original flags.
Reviewed By: nikic, mkazantsev
Differential Revision: https://reviews.llvm.org/D103877
SCEV currently tries to prove implications of x pred y by also
trying to imply ~y pred ~x. This is expensive in terms of
compile-time (in fact, the majority of isImpliedCond compile-time
is spent here) and generally not fruitful. The issue is that this
also swaps the operands and thus breaks canonical ordering. If
originally we were trying to prove an implication like
X > C1 -> Y > C2, then we'll now try to prove X > C1 -> C3 > ~Y,
which will not work.
The only real case where we can get some use out of this transform
is if the original conditions were in the form X > C1 -> Y < C2, were
then swapped to X > C1 -> C2 > Y and are then swapped again here to
X > C1 -> ~Y > C3.
As such, handle this at a higher level, where we are doing the
swapping in the first place. There's four different ways that we
can line up a predicate and a swapped predicate, so we use some
heuristics to pick some profitable way.
Because we now try this transform at a higher level
(isImpliedCondOperands rather than isImpliedCondOperandsHelper),
we can also prove additional facts. Of the added tests, one was
proven previously while the other wasn't.
Differential Revision: https://reviews.llvm.org/D90926
URem operations with constant power-of-2 second operands are modeled as
such. This patch on its own has very little impact (e.g. no changes in
CodeGen for MultiSource/SPEC2000/SPEC2006 on X86 -O3 -flto), but I'll
soon post follow-up patches that make use of it to more accurately
determine the trip multiple.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89821
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.
This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.
The return was due to revert of underlying patch that this one depends on.
Unit test temporarily disabled because the required logic in SCEV is switched
off due to compile time reasons.
Differential Revision: https://reviews.llvm.org/D89548
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.
This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.
Differential Revision: https://reviews.llvm.org/D89548
Reviewed By: fhahn
The initial version of the patch was reverted because it missed the check that
the predicate being proved is actually guarded by this check on 1st iteration.
If it was not executed on 1st iteration (but possibly executes after that), then
it is incorrect to use reasoning about IV start to prove it.
Added the test where the miscompile was seen. Unfortunately, my attempts
to reduce it with bugpoint did not succeed; it can further be reduced when
we understand how to do it without losing the initial bug's notion.
Returning assuming the miscompiles are now gone.
Differential Revision: https://reviews.llvm.org/D88208
The logic there only considers `SLT/SGT` predicates. We can use the same logic
for proving `ULT/UGT` predicates if all involved values are non-negative.
Adding full-scale support for unsigned might be challenging because of code amount,
so we can consider this in the future.
Differential Revision: https://reviews.llvm.org/D88087
Reviewed By: reames
If we know that some predicate is true for AddRec and an invariant
(w.r.t. this AddRec's loop), this fact is, in particular, true on the first
iteration. We can try to prove the facts we need using the start value.
The motivating example is proving things like
```
isImpliedCondOperands(>=, X, 0, {X,+,-1}, 0}
```
Differential Revision: https://reviews.llvm.org/D88208
Reviewed By: reames
For some expressions, we can use information from loop guards when
we are looking for a maximum. This patch applies information from
loop guards to the expression used to compute the maximum backedge
taken count in howFarToZero. It currently replaces an unknown
expression X with UMin(X, Y), if the loop is guarded by
X ult Y.
This patch is minimal in what conditions it applies, and there
are a few TODOs to generalize.
This partly addresses PR40961. We will also need an update to
LV to address it completely.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67178
This patch extends SCEVParameterRewriter to support rewriting unknown
epxressions to arbitrary SCEV expressions. It will be used by further
patches.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67176
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
This patch was originally committed as b8a3c34eee06, but broke the
modules build, as LoopAccessAnalysis was using the Expander.
The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
Teach SCEV about the @loop.decrement.reg intrinsic, which has exactly the same
semantics as a sub expression. This allows us to query hardware-loops, which
contain this @loop.decrement.reg intrinsic, so that we can calculate iteration
counts, exit values, etc. of hardwareloops.
This "int_loop_decrement_reg" intrinsic is defined as "IntrNoDuplicate". Thus,
while hardware-loops and tripcounts now become analysable by SCEV, this
prevents the usual loop transformations from applying transformations on
hardware-loops, which is what we want at this point, for which I have added
test cases for loopunrolling and IndVarSimplify and LFTR.
Differential Revision: https://reviews.llvm.org/D71563
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
Without this patch computeConstantDifference returns None for cases like
these:
computeConstantDifference(%x, %x)
computeConstantDifference({%x,+,16}, {%x,+,16})
Differential Revision: https://reviews.llvm.org/D65474
llvm-svn: 368193
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57173
llvm-svn: 352913
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911