Patch created using the following command line:
```bash
codespell polly --skip="*.pdf,polly/lib/External/*" --write-changes \
--ignore-words-list=couter,createor,distribues,doble,identty,indention,indx,olt,ore,padd,sais,te,theses
```
This patch introduces the initial implementation for annotating loops
created by Polly. Polly generates RunTimeChecks (RTCs), which result in
loop versioning. Specifically, the loop created by Polly is executed
when the RTCs pass; otherwise, the original loop is executed.
This patch adds the "llvm.loop.vectorize.enable" metadata, setting it to
true for loops created by Polly. Disabling vectorization for the original
fallback loop is already merged in #119188.
This behavior is controlled by the 'polly-annotate-metadata-vectorize'
flag, and the annotations are applied only when this flag is enabled.
This flag is set to false by default.
NOTE: This commit is the initial patch in an effort to make Polly interact
with the Loop Vectorizer via metadata.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Patch #102460 already implements a separate DT/LI/SE for the parallel
subfunction. Crashes have been reported when the region generator tries to
use the original function's DT while creating the new parallel subfunction,
due to checks in #101198. This patch fixes those cases by switching the
DT/LI while generating the parallel function using the Region Generator.
Fixes #117877
After patch 5ce47a5, some assert crashes occur in Polly. This issue
arises because an instruction from one function queries the Dominator
Tree (DT) of another function. To fix this, the `isHoistableLoad`
function now skips instructions that belong to a different function while
iterating.
The patch sets the vectorization metadata to false for Polly's fallback
loops. These are the loops executed when RTCs fail. This minimizes the
multiple loop versioning carried out by Polly and subsequently by the
Loop Vectorizer.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
The patch adds a nullptr check before accessing the loop blocks in the
'hasPossiblyDistributableLoop' function. The existing check for the
loop’s containment in the region does not capture nullptr cases when the
region covers the entire function. Therefore, it’s better to exit if the
basic block isn’t part of any loop.
Fixes #113772.
For both mlir and polly, the lit internal shell is the default shell for
running lit tests. However, if the user wanted to switch back to the
external shell by setting `LIT_USE_INTERNAL_SHELL=0`, the `not` used in
the body of the `if` conditional changes `use_lit_shell` to be True
instead of the intended False. Removing `not` allows for this lit config
to work as intended.
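The effect of the stray `not` can be seen in a minimal, self-contained sketch. This is not the actual lit configuration: `to_bool` is a hypothetical, simplified stand-in for lit's boolean parsing, and the two functions only model the before and after behavior described above.

```python
def to_bool(value):
    # Hypothetical helper: simplified stand-in for lit's boolean parsing.
    return value.strip().lower() in ("1", "true", "on", "yes")


def use_lit_shell_before(env):
    use_lit_shell = True  # the internal shell is the default
    value = env.get("LIT_USE_INTERNAL_SHELL")
    if value:
        # Bug: the negation inverts the user's request.
        use_lit_shell = not to_bool(value)
    return use_lit_shell


def use_lit_shell_after(env):
    use_lit_shell = True  # the internal shell is the default
    value = env.get("LIT_USE_INTERNAL_SHELL")
    if value:
        use_lit_shell = to_bool(value)
    return use_lit_shell


env = {"LIT_USE_INTERNAL_SHELL": "0"}
print(use_lit_shell_before(env))  # True: the internal shell stays enabled
print(use_lit_shell_after(env))   # False: the external shell is used
```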
Fixes https://github.com/llvm/llvm-project/issues/106459.
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.
Regression tests are updated using update scripts where possible, and by
find + replace where not.
The base concept is the same as the existing reduction algorithm, where we
get the list of candidate pairs <store,load>. But the existing algorithm
works only if there is a single binary operation between the load and the
store.
Example: sum += a[i];
This algorithm extends it to work with more than a single binary operation
as well. It is implemented using data flow reduction detection at the basic
block level. We propagate the loads, the number of times each load is
used (flows into an instruction), and the binary operations performed until
we reach a store.
Example: sum += a[i] + b[i];
```
sum(Ld)   a[i](Ld)
     \  +  /
      tmp   b[i](Ld)
        \  +  /
        sum(St)
```
In the above case the candidate pairs are formed by associating sum with
all of its load inputs, which are sum, a[i] and b[i]. Then check
functions are used to filter a valid reduction pair, i.e. {sum,sum}.
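The propagation described above can be modeled in a few lines. This is a simplified sketch, not Polly's implementation: the tuple-based instruction encoding, the function names, and the filter (same address, load used exactly once, a single kind of binary operation on the path) are all assumptions made for illustration.

```python
from collections import Counter


def candidate_pairs(block):
    """Propagate loads through binary operations until a store is reached.

    Hypothetical encoding of one basic block:
      ("load", dst, addr), ("binop", dst, op, lhs, rhs), ("store", addr, src)
    """
    flows = {}  # value name -> (Counter of load addresses, set of ops seen)
    pairs = []
    for inst in block:
        kind = inst[0]
        if kind == "load":
            _, dst, addr = inst
            flows[dst] = (Counter({addr: 1}), frozenset())
        elif kind == "binop":
            _, dst, op, lhs, rhs = inst
            loads, ops = Counter(), {op}
            for operand in (lhs, rhs):
                operand_loads, operand_ops = flows.get(
                    operand, (Counter(), frozenset()))
                loads.update(operand_loads)  # accumulate use counts
                ops |= operand_ops
            flows[dst] = (loads, frozenset(ops))
        elif kind == "store":
            _, addr, src = inst
            loads, ops = flows.get(src, (Counter(), frozenset()))
            pairs.extend((addr, load_addr, uses, ops)
                         for load_addr, uses in loads.items())
    return pairs


def valid_reductions(pairs):
    # Assumed filter: same address, one use, one kind of operation.
    return [(st, ld) for st, ld, uses, ops in pairs
            if st == ld and uses == 1 and len(ops) == 1]


# sum += a[i] + b[i];
block = [
    ("load", "t0", "sum"),
    ("load", "t1", "a[i]"),
    ("binop", "tmp", "+", "t0", "t1"),
    ("load", "t2", "b[i]"),
    ("binop", "t3", "+", "tmp", "t2"),
    ("store", "sum", "t3"),
]
print(valid_reductions(candidate_pairs(block)))
```

In this model the store to sum is paired with the loads of sum, a[i] and b[i], and only the {sum,sum} pair survives the filter.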
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for how to do this:
https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders in which targets are
organized in Visual Studio and Xcode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, which are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. An
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
Even as the NPM has been in use by Polly for a while now, the majority
of the tests continue using the LPM passes. This patch ports the tests
to use the NPM passes (for example, by replacing a flag such as
-polly-detect with -passes=polly-detect following the NPM syntax for
specifying passes) with some exceptions for some missing features in the
new passes.
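As a rough illustration of how mechanical the flag replacement is, the following sketch rewrites a legacy flag in a RUN line. It is not the script used for this patch: the function name and the `LEGACY_PASS_FLAGS` set are hypothetical, and the set contains only the one flag named above.

```python
import re

# Assumption: only the flag named in the text is listed here.
LEGACY_PASS_FLAGS = {"-polly-detect"}


def port_run_line(line):
    """Rewrite known legacy pass flags to the -passes= form."""
    def rewrite(match):
        flag = match.group(0)
        if flag in LEGACY_PASS_FLAGS:
            return "-passes=" + flag[1:]
        return flag  # leave options that are not pass flags untouched

    # Match whole tokens that start with "-polly-".
    return re.sub(r"(?<!\S)-polly-[\w-]+", rewrite, line)


print(port_run_line("; RUN: opt %loadPolly -polly-detect -disable-output < %s"))
```

A line that names several passes would need them merged into one `-passes=` pipeline, which this sketch does not attempt.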
Relanding #90632.
Even as the NPM has been in use by Polly for a while now, the
majority of the tests continue using the LPM passes. This patch
ports the tests to use the NPM passes (for example, by replacing
a flag such as -polly-detect with -passes=polly-detect following
the NPM syntax for specifying passes) with some exceptions for
some missing features in the new passes. Additionally, the lit
substitution %loadPolly is replaced by the substitution of what
was %loadNPMPolly and %loadNPMPolly is removed.
This flag enables the user to print debug info from all the passes and
helpers inside Polly at once. This also helps novice users work with Polly
without having to know which parts of Polly have actually kicked in and
pass them via -debug-only.
To fix the long compile time issue of the Schedule optimizer, patch #77280
sets an upper cap on max ISL operations. When bailing out because the ISL
quota is hit, the error handling behavior was restored manually. This commit
replaces the restoration code with the IslMaxOperationsGuard helper and also
removes a redundant early return.
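The motivation for a guard over manual restoration can be sketched with a Python context manager. This is only an analogy to the scoped-guard idea: `Context`, its fields, and `max_operations_guard` are hypothetical names and do not reflect the ISL or Polly API.

```python
from contextlib import contextmanager


class Context:
    """Hypothetical stand-in for a solver context with an operation quota."""

    def __init__(self):
        self.max_operations = 0   # 0 means "no limit" in this sketch
        self.on_error = "abort"


@contextmanager
def max_operations_guard(ctx, limit):
    # Save the current settings, install the quota, and always restore
    # the settings on exit, even if the guarded code bails out early.
    saved = (ctx.max_operations, ctx.on_error)
    ctx.max_operations, ctx.on_error = limit, "continue"
    try:
        yield ctx
    finally:
        ctx.max_operations, ctx.on_error = saved


ctx = Context()
with max_operations_guard(ctx, 1000):  # 1000 is an arbitrary example limit
    pass  # compute the schedule here; bail out if the quota is hit
print(ctx.max_operations, ctx.on_error)
```

Because restoration lives in one place, every exit path from the guarded region leaves the context in its original state.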
The existing reduction detection algorithm performs two types of memory
checks before marking a load-store pair as a reduction.
The second check verifies that no other memory access in the ScopStmt
overlaps with the memory of the load and store that form the reduction.
The existing check misses cases where there could be a probable overlap,
such as
A[V] += A[P];
In the above case there is a chance of overlap between A[V] and A[P], which
is missed.
This commit addresses this by removing the parameter from the space before
checking for compatible spaces.
Part 1 of this patch:
[75297](https://github.com/llvm/llvm-project/pull/75297)
Polly currently uses `getDebugLoc` in a few places to produce diagnostic
output; this is correct when interacting with specific instructions, but
may be incorrect when dealing with instruction ranges if debug
intrinsics are included. As a general rule, the debug locations attached
to debug intrinsics may be misleading compared to the surrounding
instructions, and are not generally used for anything other than
determining variable scope info; the recommended approach is therefore
to use `getStableDebugLoc` instead, which skips over debug intrinsics.
This is necessary to fix test failures that occur when enabling
non-instruction debug info, which removes debug intrinsics from basic
blocks and thus alters the diagnostic output of Polly (despite causing
no functional change).
The existing reduction detection algorithm performs two types of memory
checks before marking a load-store pair as a reduction.
The first check verifies that the load and store point to the same memory.
Right now this check detects the following case as a reduction:
sum[0] = sum[1] + A[i]
This is because the check compares only the base of the memory addresses
involved and not their indices. This patch addresses this issue and
introduces some debug prints. It also adds a couple of test cases to verify
the functionality of the patch.
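The difference between the two comparisons can be shown with a toy model. The `(base, index)` tuples and both function names are assumptions for illustration; Polly itself compares access relations rather than Python tuples.

```python
def same_base(load, store):
    # Old behavior in this model: only the base pointers are compared.
    return load[0] == store[0]


def same_location(load, store):
    # Stricter check: the base and the index must both match.
    return load == store


# sum[0] = sum[1] + A[i]
load, store = ("sum", 1), ("sum", 0)
print(same_base(load, store))      # True: wrongly accepted as a reduction
print(same_location(load, store))  # False: rejected once indices are compared
```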
There is no upper cap set on the current Schedule Optimizer for computing
the schedule. In some cases, computing the schedule takes a very long
compile time, resulting in hang-like behavior. This patch introduces a
flag 'polly-schedule-computeout' to pass the cap, which is initialized to
300000. This patch handles the compute-out cases by bailing out and
exiting gracefully.
Fixed the test that failed in the previous commit.
Fixes #69090
This reverts commit d6c4d4c9b910e8ad5ed7cd4825a143742041c1f4.
Broke buildbots with asserts disabled; -debug-only is only available in
asserts builds.
There is no upper cap set on the current Schedule Optimizer for computing
the schedule. In some cases, computing the schedule takes a very long
compile time, resulting in hang-like behavior. This patch introduces a
flag 'polly-schedule-computeout' to pass the cap, which is initialized to
300000. This patch handles the compute-out cases by bailing out and
exiting gracefully.
Fixes #69090
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.
After D154102, multi-line labels would get split incorrectly.
When a CFG is generated for a function with a basic block name longer
than 80 characters, the header separator is placed after the
line break for the label name instead of after the whole label name.
The fix is simply to move the insertion of the | character before the
line splitting happens.
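The ordering problem can be reproduced in a small model. This is a sketch under assumptions, not the CFG printer's code: the helper names are hypothetical, `"\n"` stands in for the label's line break, and a width of 4 replaces the real wrap width to keep the example short.

```python
SEPARATOR = "|"


def split_long_lines(text, width):
    """Hard-wrap every line of text at the given width."""
    out = []
    for line in text.split("\n"):
        while len(line) > width:
            out.append(line[:width])
            line = line[width:]
        out.append(line)
    return "\n".join(out)


def label_separator_after_split(name, body, width):
    # Buggy order: wrap first, then put the separator at the first break,
    # which now falls inside a long block name.
    wrapped = split_long_lines(name + ":\n" + body, width)
    head, _, rest = wrapped.partition("\n")
    return head + SEPARATOR + rest


def label_separator_before_split(name, body, width):
    # Fixed order: attach the separator to the full name, then wrap.
    return split_long_lines(name + ":" + SEPARATOR + "\n" + body, width)


print(repr(label_separator_after_split("abcdefghij", "ret", 4)))
print(repr(label_separator_before_split("abcdefghij", "ret", 4)))
```

In the first result the separator cuts the name after its first wrapped chunk; in the second it follows the complete name.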
Differential Revision: https://reviews.llvm.org/D159207
This change adds separators for basic block names, which makes it
easier to find a basic block based on its name and separates it
from the code.
Currently, the basic block label may also appear twice when the
basic block has explicit numbering; this change fixes
that bug.
Differential Revision: https://reviews.llvm.org/D154102
Before this patch, we could only use the MaxBECount for an AddRec's range
computation if the MaxBECount's bit width was <= that of the AddRec. This
patch reasons that if a MaxBECount has a larger bit width but its value is
<= the max value of the AddRec's bit width, we can still use the MaxBECount.
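The relaxed condition can be written as a small predicate. This is an illustrative sketch, not SCEV code: the function and parameter names are hypothetical, and "max value" is assumed to mean the unsigned maximum of the AddRec's bit width.

```python
def can_use_max_be_count(max_be_count, be_count_bits, addrec_bits):
    """Decide whether MaxBECount is usable for the AddRec's range."""
    if be_count_bits <= addrec_bits:
        return True  # the case that was already handled
    # New case: a wider type is fine if the value itself still fits.
    # Assumption: "max value" is the unsigned maximum, 2**bits - 1.
    return max_be_count <= (1 << addrec_bits) - 1


print(can_use_max_be_count(200, 64, 8))  # True: 200 <= 255
print(can_use_max_be_count(300, 64, 8))  # False: 300 > 255
```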
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D151698
This is an ongoing series of commits that are reformatting our
Python code. This catches the last of the Python files to
reformat. Since they were so few I bunched them together.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a Python file, the best way to handle that
is to run `git checkout --ours <yourfile>` and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: jhenderson, #libc, Mordante, sivachandra
Differential Revision: https://reviews.llvm.org/D150784
As long as the aliasee has `@llvm.used` or `@llvm.compiler.used` references, we cannot do the related replace or delete operations. Even if it has local linkage, we cannot infer that there is no other use for it, such as asm or other cases added in the future.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D145293
Polly-ACC is unmaintained. Because it has never been ported to the NPM pipeline, it has not even been accessible since D136621 without manually specifying the passes on the `opt` command line.
Since there is no plan to bring it to a maintainable state, remove it from Polly.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D142580
Polly's internal vectorizer is not well maintained and is known not to work in some cases, such as region ScopStmts. Unlike LLVM's LoopVectorize pass, it also does not have a target-dependent cost heuristic, and we recommend using LoopVectorize instead of -polly-vectorizer=polly.
In the future we hope that Polly can collaborate better with LoopVectorize, for example by marking a loop as safe to vectorize with a specific SIMD width, instead of replicating its functionality.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D142640