D138014 restricted AST to work on immutable IR. This means it is
also safe to use a single BatchAA instance for the entire AST
lifetime, instead of only batching parts of individual queries.
The primary motivation for this is not compile-time, but rather
having a central place to control cross-iteration AA, which will
be used by D137958.
Differential Revision: https://reviews.llvm.org/D137955
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Fixes warnings (or errors, if someone injects -Werror in their build system,
which happens in fact with some folks vendoring LLVM too) with Clang 16:
```
+/var/tmp/portage.notmp/portage/sys-devel/llvm-15.0.4/work/llvm_build-abi_x86_64.amd64/CMakeFiles/CMakeTmp/src.c:3:9: warning: a function declaration without a prototype
is deprecated in all versions of C [-Wstrict-prototypes]
-/var/tmp/portage.notmp/portage/sys-devel/llvm-14.0.4/work/llvm_build-abi_x86_64.amd64/CMakeFiles/CMakeTmp/src.c:3:9: error: a function declaration without a prototype is
deprecated in all versions of C [-Werror,-Wstrict-prototypes]
int main() {return 0;}
^
void
```
Differential Revision: https://reviews.llvm.org/D137503
The MaximalStaticExpansionPass was already ported to the NPM in
02f640672e2875c4e7578366bb2e22582548d7c1. Allow adding it to the default
NPM pass builder pipeline using -polly-enable-mse.
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variables names FMRB/MRB to ME, to match the
new type name.
Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.
To give two examples of why this is useful:
* D117095 highlights a weakness of ModRef modelling in the presence
of operand bundles. For a memcpy call with deopt operand bundle,
we want to say that it can read any memory, but only write argument
memory. This would allow them to be treated like any other calls.
However, we currently can't express this and have to say that it
can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
where threadid Refs outside pre-split coroutines can be ignored
(like other accesses to constant memory). The current representation
does not allow modelling this precisely.
The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.
Differential Revision: https://reviews.llvm.org/D130896
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Reviewed By: inclyc
Differential Revision: https://reviews.llvm.org/D131167
The pattern matching optimization of Polly detects and optimizes dense general
matrix-matrix multiplication. The generated code is close to high performance
implementations of matrix-matrix multiplications, which are contained in
manually tuned libraries. The described pattern matching optimization is
a particular case of tensor contraction optimization, which was
introduced in [1].
This patch generalizes the pattern matching to the case of tensor contractions
using the form of data dependencies and memory accesses produced by tensor
contractions [1].
Optimization of tensor contractions will be added in the next patch. Following
the ideas introduced in [2], it will logically represent tensor contraction
operands as matrix multiplication operands and use an approach for
optimization of matrix-matrix multiplications.
[1] - Gareev R., Grosser T., Kruse M. High-Performance Generalized Tensor
Operations: A Compiler-Oriented Approach // ACM Transactions on
Architecture and Code Optimization (TACO). 2018. Vol. 15, no. 3.
P. 34:1–34:27. DOI: 10.1145/3235029.
[2] - Matthews D. High-Performance Tensor Contraction without BLAS // SIAM
Journal on Scientific Computing. 2018. Vol. 40, no. 1. P. C 1—C 24.
DOI: 110.1137/16m108968x.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114336
The IR Verifier requires that every call instruction to an inlineable
function (among other things, its implementation must be visible in the
translation unit) must also have !dbg metadata attached to it. When
parallelizing, Polly emits calls to OpenMP runtime function out of thin
air, or at least not directly derived from a bounded list of previous
instruction. While we could search for instructions in the SCoP that has
some debug info attached to it, there is no guarantee that we find any.
Our solution is to generate a new DILocation that points to line 0 to
represent optimized code.
The OpenMP function implementation is usually not available in the
user's translation unit, but can become visible in an LTO build. For
the bug to appear, libomp must also be built with debug symbols.
IMHO, the IR verifier rule is too strict. Runtime functions can
also be inserted by other optimization passes, such as
LoopIdiomRecognize. When inserting a call to e.g. memset, it uses the
DebugLoc from a StoreInst from the unoptimized code. It is not
required to have !dbg metadata attached either.
Fixes#56692
The copy statements inserted by the matrix-multiplication optimization
introduce new dependencies between the copy statements and other
statements. As a result, the DependenceInfo must be recomputed.
Not recomputing them caused IslAstInfo to deduce that some loops are
parallel but cause race conditions when accessing the packed arrays.
As a result, matrix-matrix multiplication currently cannot be
parallelized.
Also see discussion at https://reviews.llvm.org/D125202
Binary size of `clang` is trivial; namely, numerical value doesn't
change when measured in MiB, and `.data` section increases from 139Ki to
173 Ki.
Differential Revision: https://reviews.llvm.org/D128070
This patch implements the `MaximalStaticExpansion` and its printer in NPM.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D125870
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
This make is obivious that a class was not intended to be derived from.
NPM analysis pass can unfortunately not marked as final because they are
derived from a llvm::Checker<T> template internally by the NPM.
Also normalize the use of classes/structs
* NPM passes are structs
* Legacy passes are classes
* structs that have methods and are not a visitor pattern are classes
* structs have public inheritance by default, remove "public" keyword
* Use typedef'ed type instead of inline forward declaration
Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D123677
The legacy LoopUnswitch pass is only used in the legacy pass manager
pipeline, which is deprecated.
The NewPM replacement is SimpleLoopUnswitch and I think it is time to
remove the legacy LoopUnswitch code.
Fixes#31000.
Reviewed By: aeubanks, Meinersbur, asbirlea
Differential Revision: https://reviews.llvm.org/D124376
Rather than checking the bitcast pointer element types, compare
the element type of the access and the GEP result type.
The entire code is dubious due to the inspection of GEP structure,
but this at least preserves the spirit of the existing code.