9637 Commits

Author SHA1 Message Date
Nikita Popov
ac87480bd8 [SCEV] Recognize min/max intrinsics
Recognize umin/umax/smin/smax intrinsics and convert them to the
already existing SCEV nodes of the same name.

In the future we'll want SCEVExpander to also produce the intrinsics,
but we're not ready for that yet.

Differential Revision: https://reviews.llvm.org/D87160
2020-09-05 16:30:11 +02:00
Nikita Popov
73104b0751 [InstSimplify] Fold min/max based on dominating condition
If we have a dominating condition that x >= y, then umax(x, y) is x,
etc. I'm doing this in InstSimplify as the corresponding transform
for the select form is also done there.

Differential Revision: https://reviews.llvm.org/D87168
2020-09-05 16:16:40 +02:00
Arthur Eubanks
c9771391ce [NewPM][Lint] Port -lint to NewPM
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87057
2020-09-03 13:03:44 -07:00
Arthur Eubanks
e440b4933a Revert "[NewPM][Lint] Port -lint to NewPM"
This reverts commit 883399c8402188520870f99e7d8b3244f000e698.
2020-09-02 21:34:29 -07:00
Arthur Eubanks
883399c840 [NewPM][Lint] Port -lint to NewPM
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87057
2020-09-02 21:13:01 -07:00
Craig Topper
b16e8687ab [CodeGenPrepare][X86] Teach optimizeGatherScatterInst to turn a splat pointer into GEP with scalar base and 0 index
This helps SelectionDAGBuilder recognize the splat can be used as a uniform base.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D86371
2020-09-02 20:44:12 -07:00
Eli Friedman
96ef6998df [InstCombine] Fix a couple crashes with extractelement on a scalable vector.
Differential Revision: https://reviews.llvm.org/D86989
2020-09-02 18:02:07 -07:00
Jordan Rupprecht
c90f15d25a [NFC] Fix unused var in release build 2020-09-01 13:05:56 -07:00
Florian Hahn
0d966ae4b2 [Loads] Add canReplacePointersIfEqual helper.
This patch adds an initial, incomeplete and unsound implementation of
canReplacePointersIfEqual to check if a pointer value A can be replaced
by another pointer value B, that are deemed to be equivalent through
some means (e.g. information from conditions).

Note that is in general not sound to blindly replace pointers based on
equality, for example if they are based on different underlying objects.

LLVM's memory model is not completely settled as of now; see
https://bugs.llvm.org/show_bug.cgi?id=34548 for a more detailed
discussion.

The initial version of canReplacePointersIfEqual only rejects a very
specific case: replacing a pointer with a constant expression that is
not dereferenceable. Such a replacement is problematic and can be
restricted relatively easily without impacting most code. Using it to
limit replacements in GVN/SCCP/CVP only results in small differences in
7 programs out of MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto.

This patch is supposed to be an initial step to improve the current
situation and the helper should be made stricter in the future. But this
will require careful analysis of the impact on performance.

Reviewed By: aqjune

Differential Revision: https://reviews.llvm.org/D85524
2020-09-01 20:57:41 +01:00
Alina Sbirlea
c292fba46f [MemorySSA] Update phi map with replacement value. 2020-09-01 11:56:40 -07:00
Alina Sbirlea
63844c116a [MemorySSA] Clean up single value phis.
MemoryPhis with a single value are correct, but can lead to errors when
updating. Clean up single entry Phis newly added when cloning blocks.
Resolves PR46574.
2020-08-31 19:26:08 -07:00
Nikita Popov
88b310f64b [InstSimplify] Reduce code duplication in simplifySelectWithICmpCond (NFC)
Canonicalize icmp ne to icmp eq and implement all the folds only once.
2020-08-29 22:38:49 +02:00
Nikita Popov
a5be86fde5 [InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)
Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.

This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.

This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.

Differential Revision: https://reviews.llvm.org/D86834
2020-08-29 21:59:39 +02:00
Nikita Popov
a400a61721 [LVI] Remove unnecessary lambda capture (NFC) 2020-08-29 21:33:19 +02:00
Nikita Popov
6d88f6efd4 Reapply [LVI] Normalize pointer behavior
This got reverted because a dependency was reverted. It has since
been reapplied, so reapply this as well.

-----

Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.

The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.

This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this more powerful than the previous implementation as
a side-effect, and this does actually seem benefitial in practice.

Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.

I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.

Differential Revision: https://reviews.llvm.org/D69914
2020-08-29 21:17:03 +02:00
Roman Lebedev
c1b3e32118
[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable
See
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
and
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html

InstSimply is not allowed to perform simplifications to instructions
that are not def-reachable from the original instruction.
2020-08-29 09:58:08 +03:00
Owen Anderson
ed90f15efb Revert "[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block"
This reverts commit 6102310d814ad73eab60a88b21dd70874f7a056f.  It
appears to cause compilation non-determinism and caused stage3
mismatches.
2020-08-28 23:43:42 +00:00
serge-sans-paille
2296182181 Skip analysis re-computation when no changes are reported
This is a follow-up to https://reviews.llvm.org/D80707, generalized to
CallGraphSCC, Loop and Region

Differential Revision: https://reviews.llvm.org/D86442
2020-08-28 21:41:01 +02:00
David Sherwood
f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Florian Hahn
fd6ebea50d [MemLoc] Support memcmp in MemoryLocation::getForArgument.
This patch adds support for memcmp in MemoryLocation::getForArgument.
memcmp reads from the first 2 arguments up to the number of bytes of the
third argument.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86725
2020-08-28 10:19:54 +01:00
Martin Storsjö
db1ec04963 [ValueTracking] Remove a stray semicolon. NFC.
This silences warnings when built with GCC at least.
2020-08-28 09:24:10 +03:00
serge-sans-paille
b1f4e5979b (Expensive) Check for Loop, SCC and Region pass return status
This generalizes the logic introduced in https://reviews.llvm.org/D80916 to
other passes.

It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report
their status.

Differential Revision: https://reviews.llvm.org/D86589
2020-08-28 07:56:35 +02:00
Alina Sbirlea
d370836c20 [MemorySSA] Assert defining access is not a MemoryUse. 2020-08-27 18:21:10 -07:00
Vitaly Buka
23524fdece [ValueTracking] Replace recursion with Worklist
Now findAllocaForValue can handle nontrivial phi cycles.
2020-08-27 14:44:49 -07:00
Vitaly Buka
a40660551e [StackSafety] Ignore allocas with partial lifetime markers
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86672
2020-08-27 13:54:41 -07:00
Vitaly Buka
a6927c8621 [NFC][ValueTracking] Add OffsetZero into findAllocaForValue
For StackLifetime after finding alloca we need to check that
values ponting to the begining of alloca.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86692
2020-08-27 13:46:22 -07:00
Roman Lebedev
b85f91fdce
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is basic block.
2020-08-27 22:32:03 +03:00
Simon Moll
c48b06c44f [sda][nfc] clang-formatting 2020-08-27 18:27:44 +02:00
Roman Lebedev
6102310d81
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So i would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
2020-08-27 18:47:04 +03:00
Vitaly Buka
469debe027 [ValueTracking] Support select in findAllocaForValue 2020-08-27 02:13:52 -07:00
Nikita Popov
d7c119d89c [InstSimplify] Fold min/max intrinsic based on icmp of operands
This is a reboot of D84655, now performing the inner icmp
simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp()
fold based on this, by moving the logic into icmp of min/max instead,
making it more general. We can't drop the folds for constant operands,
because those also allow undef, which we exclude here.

The tests use assumes for exhaustive coverage, and have a few
more examples of misc folds we get based on icmp simplification.

Differential Revision: https://reviews.llvm.org/D85929
2020-08-26 22:02:57 +02:00
Arthur Eubanks
098d3f9827 [InstSimplify] Simplify to vector constants when possible
InstSimplify should do all transformations that ConstProp does, but
one thing that ConstProp does that InstSimplify wouldn't is inline
vector instructions that are constants, e.g. into a ret.

Previously vector instructions wouldn't be inlined in InstSimplify
because llvm::Simplify*Instruction() would return nullptr for specific
instructions, such as vector instructions that were actually constants,
if it couldn't simplify them.

This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and
SimplifyShuffleVectorInst to return a vector constant when possible.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85946
2020-08-26 11:40:36 -07:00
Juneyoung Lee
684b43c0cf [IR] Add NoUndef attribute to Intrinsics.td
This patch adds NoUndef to Intrinsics.td.
The attribute is attached to llvm.assume's operand, because llvm.assume(undef)
is UB.
It is attached to pointer operands of several memory accessing intrinsics
as well.

This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check
unnecessary, so it is removed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86576
2020-08-27 02:54:48 +09:00
Mircea Trofin
7cfcecece0 [MLInliner] Simplify TFUTILS_SUPPORTED_TYPES
We only need the C++ type and the corresponding TF Enum. The other
parameter was used for the output spec json file, but we can just
standardize on the C++ type name there.

Differential Revision: https://reviews.llvm.org/D86549
2020-08-25 14:19:39 -07:00
Juneyoung Lee
f753f5b050 [ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands
This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands.

Instead of special-casing llvm.assume, I think it is also a viable option to
add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86477
2020-08-26 04:40:21 +09:00
Nikita Popov
3a54b6a4b7 [MemDep] Use BatchAA when computing pointer dependencies
We're not changing IR while running a single MemDep query, so it's
safe to cache alias analysis results using BatchAA. This adds BatchAA
usage to getSimplePointerDependencyFrom(), which is non-intrusive --
covering larger parts (like a whole processNonLocalLoad query) is
also possible, but requires threading BatchAA through a bunch of APIs.

For the ThinLTO configuration, this is a 1% geomean improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D85583
2020-08-25 21:34:34 +02:00
Ta-Wei Tu
abbd652dd6 [LoopNest] False negative of arePerfectlyNested with LCSSA loops
Summary: The LCSSA pass (required for all loop passes) sometimes adds
additional blocks containing LCSSA variables, and checkLoopsStructure
may return false even when the loops are perfectly nested in this case.
This is because the successor of the exit block of the inner loop now
points to the LCSSA block instead of the latch block of the outer loop.
Examples are shown in the test nests-with-lcssa.ll.

To fix the issue, the successor of the exit block of the inner loop can
now point to a block in which all instructions are LCSSA phi node
(except the terminator), and the sole successor of that block should
point to the latch block of the outer loop.

Reviewed By: Whitney, etiotto

Differential Revision: https://reviews.llvm.org/D86133
2020-08-25 16:20:52 +00:00
Mircea Trofin
8c63df2416 [MLInliner] Support training that doesn't require partial rewards
If we use training algorithms that don't need partial rewards, we don't
need to worry about an ir2native model. In that case, training logs
won't contain a 'delta_size' feature either (since that's the partial
reward).

Differential Revision: https://reviews.llvm.org/D86481
2020-08-24 17:36:29 -07:00
Sam Parker
2e194fe73b [SCEV] Still trying to fix windows buildbots 2020-08-24 10:26:48 +01:00
Alina Sbirlea
f55ad3973d [DomTree] Extend update API to allow a post CFG view.
Extend the `applyUpdates` in DominatorTree to allow a post CFG view,
different from the current CFG.
This patch implements the functionality of updating an already up to
date DT, to the desired PostCFGView.
Combining a set of updates towards an up to date DT and a PostCFGView is
not yet supported.

Differential Revision: https://reviews.llvm.org/D85472
2020-08-21 17:23:08 -07:00
Arthur Eubanks
b79889c2b1 [opt][NewPM] Add basic-aa in legacy PM compatibility mode
The legacy PM alias analysis pipeline by default includes basic-aa.
When running `opt -foo-pass` under the NPM and -disable-basic-aa is not
specified, use basic-aa.

This decreases the number of check-llvm failures under NPM from 913 to 752.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D86167
2020-08-21 14:05:07 -07:00
Roman Lebedev
5d7c5a5e99
[NFC] Port InstCount pass to new pass manager 2020-08-21 12:39:42 +03:00
Yevgeny Rouban
18bc400f97 [NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks
Both AfterPass and AfterPassInvalidated pass instrumentation
callbacks get additional parameter of type PreservedAnalyses.
This patch was created by @fedor.sergeev. I have just slightly
changed it.

Reviewers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81555
2020-08-21 16:10:42 +07:00
David Green
2b69efded0 [ARM][LV] Add a preferPredicatedReductionSelect target hook
As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to tail predicate most
reduction loops.

Differential Revision: https://reviews.llvm.org/D85980
2020-08-21 08:48:12 +01:00
Sanjay Patel
6f3511a01a [ValueTracking] define/use max recursion depth in header
There's a potential motivating case to increase this limit in PR47191:
http://bugs.llvm.org/PR47191

But first we should make it less hacky. The limit in InstCombine is directly tied
to this value because an increase there can cause asserts in the underlying value
tracking calls if not changed together. The usage in VectorUtils is independent,
but the comment suggests that we should use the same value unless there's a known
reason to diverge. There are similar limits in codegen analysis, but I think we
should leave those independent in case we intentionally want the optimization
power/cost to be different there.

Differential Revision: https://reviews.llvm.org/D86113
2020-08-19 16:56:59 -04:00
Mehdi Amini
a407ec9b6d Revert "Revert "[NFC][llvm] Make the contructors of ElementCount private.""
Was reverted because MLIR/Flang builds were broken, these APIs have been
fixed in the meantime.
2020-08-19 17:26:36 +00:00
Mehdi Amini
4fc56d70aa Revert "[NFC][llvm] Make the contructors of ElementCount private."
This reverts commit 264afb9e6aebc98c353644dd0700bec808501cab.
(and dependent 6b742cc48 and fc53bd610f)

MLIR/Flang are broken.
2020-08-19 17:21:37 +00:00
Francesco Petrogalli
264afb9e6a [NFC][llvm] Make the contructors of ElementCount private.
Differential Revision: https://reviews.llvm.org/D86120
2020-08-19 16:26:44 +00:00
Mircea Trofin
62fc44ca3c [MLInliner] In development mode, obtain the output specs from a file
Different training algorithms may produce models that, besides the main
policy output (i.e. inline/don't inline), produce additional outputs
that are necessary for the next training stage. To facilitate this, in
development mode, we require the training policy infrastructure produce
a description of the outputs that are interesting to it, in the form of
a JSON file. We special-case the first entry in the JSON file as the
inlining decision - we care about its value, so we can guide inlining
during training - but treat the rest as opaque data that we just copy
over to the training log.

Differential Revision: https://reviews.llvm.org/D85674
2020-08-17 16:56:47 -07:00
Tyker
a79e604462 [AssumeBundles] Fix Bug in Assume Queries
this bug was causing miscompile.
now clang cant properly selfhost with -mllvm --enable-knowledge-retention

Reviewed By: jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D83507
2020-08-17 21:36:53 +02:00