As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
For both mlir and polly, the lit internal shell is the default shell for
running lit tests. However, if the user wanted to switch back to the
external shell by setting `LIT_USE_INTERNAL_SHELL=0`, the `not` used in
the body of the `if` conditional changes `use_lit_shell` to be True
instead of the intended False. Removing `not` allows for this lit config
to work as intended.
Fixes https://github.com/llvm/llvm-project/issues/106459.
CodeGenIntrinsic changes:
- Use `const` Record pointers, and `StringRef` when possible.
- Default initialize several fields with their definition instead of in
the constructor.
- Simplify various string checks in the constructor using StringRef
starts_with()/ends_with() functions.
- Eliminate first argument to `setDefaultProperties` and use `TheDef`
class member instead.
IntrinsicEmitter changes:
- Emit `namespace llvm::Intrinsic` instead of nested namespaces.
- End generated comments with a .
- Use range based for loops, and early continue within loops.
- Emit `static constexpr` instead of `static const` for arrays.
- Change `compareFnAttributes` to use std::tie() to compare intrinsic
attributes and return a default value when all attributes are equal.
STLExtras:
- Add std::replace wrapper which takes a range.
DominatorTree, LoopInfo, and ScalarEvolution are function-level analyses
that expect to be called only on instructions and basic blocks of the
function they were original created for. When Polly outlined a parallel
loop body into a separate function, it reused the same analyses seemed
to work until new checks to be added in #101198.
This patch creates new analyses for the subfunctions. GenDT, GenLI, and
GenSE now refer to the analyses of the current region of code. Outside
of an outlined function, they refer to the same analysis as used for the
SCoP, but are substituted within an outlined function.
Additionally to the cross-function queries of DT/LI/SE, we must not
create SCEVs that refer to a mix of expressions for old and generated
values. Currently, SCEVs themselves do not "remember" which
ScalarEvolution analysis they were created for, but mixing them is just
as unexpected as using DT/LI across function boundaries. Hence
`SCEVLoopAddRecRewriter` was combined into `ScopExpander`.
`SCEVLoopAddRecRewriter` only replaced induction variables but left
SCEVUnknowns to reference the old function. `SCEVParameterRewriter`
would have done so but its job was effectively superseded by
`ScopExpander`, and now also `SCEVLoopAddRecRewriter`. Some issues
persist put marked with a FIXME in the code. Changing them would
possibly cause this patch to be not NFC anymore.
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.
Regression tests are updated using update scripts where possible, and by
find + replace where not.
The base concept is same as existing reduction algorithm where we get
the list of candidate pairs <store,load>. But the existing algorithm
works only if there is single binary operation between the load and
store.
Example sum += a[i];
This algorithm extends to work with more than single binary operation as
well. It is implemented using data flow reduction detection on basic
block level. We propagate the loads, the number of times the load is
used(flows into instruction) and binary operation performed until we
reach a store.
Example sum += a[i] + b[i];
```
sum(Ld) a[i](Ld)
\ + /
tmp b[i](Ld)
\ + /
sum(St)
```
In the above case the candidate pairs are formed by associating sum with
all of its load inputs which are sum, a[i] and b[i]. Then check
functions are used to filter a valid reduction pair ie {sum,sum}.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
Move PassInstrumentationAnalysis into PassInstrumentation.h and stop
including it in PassManager.h (effectively inverting the direction of
the dependency).
Most places using PassManager are not interested in PassInstrumentation,
and we no longer have any uses of it in PassManager.h itself (only in
PassManagerImpl.h).
This patch simplifies instruction creation by replacing all overloads of
instruction constructors/Create methods that are identical other than
the Instruction *InsertBefore/BasicBlock *InsertAtEnd/BasicBlock::iterator
InsertBefore argument with a single version that takes an InsertPosition
argument. The InsertPosition class can be implicitly constructed from
any of the above, internally converting them to the appropriate
BasicBlock::iterator value which can then be used to insert the
instruction (or to not insert it if an invalid iterator is passed).
The upshot of this is that code will be deduplicated, and all callsites
will switch to calling the new unified version without any changes
needed to make the compiler happy. There is at least one exception to
this; the construction of InsertPosition is a user-defined conversion,
so any caller that was already relying on a different user-defined
conversion won't work. In all of LLVM and Clang this happens exactly
once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca
with an AssertingVH<Instruction> argument, which must now be cast to an
Instruction* by using `&*`. If this is more common elsewhere, it could
be fixed by adding an appropriate constructor to InsertPosition.
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
Even as the NPM has been in use by Polly for a while now, the majority
of the tests continue using the LPM passes. This patch ports the tests
to use the NPM passes (for example, by replacing a flag such as
-polly-detect with -passes=polly-detect following the NPM syntax for
specifying passes) with some exceptions for some missing features in the
new passes.
Relanding #90632.
Even as the NPM has been in use by Polly for a while now, the
majority of the tests continue using the LPM passes. This patch
ports the tests to use the NPM passes (for example, by replacing
a flag such as -polly-detect with -passes=polly-detect following
the NPM syntax for specifying passes) with some exceptions for
some missing features in the new passes. Additionally, the lit
substitution %loadPolly is replaced by the substitution of what
was %loadNPMPolly and %loadNPMPolly is removed.
Reverts commit d68826dfbd98, which changes the previous default behavior
of always breaking before a stream insertion operator `<<` if both
operands are string literals.
Also reverts the related commits 27f547968cce and bf05be5b87fc.
See the discussion in #88483.
This patch addresses the (performance )suggestions by checkcpp static
analyzer for couple of files. Here we use const reference for the
suggested function arguments.
Fixes#82263.
This flag enable the user to print debug Info from all the passes and
helpers inside polly at once. This will help a novice user as well to
work in polly without explicitly having to know which parts of polly has
actually kicked in and pass them via -debug-only.
These are the last remaining "trivial" changes to passes that use
Instruction pointers for insertion. All of this should be NFC, it's just
changing the spelling of how we identify a position.
In one or two locations, I'm also switching uses of getNextNode etc to
using std::next with iterators. This too should be NFC.
---------
Merged by: Stephen Tozer <stephen.tozer@sony.com>
It's becoming potentially unsafe to insert a PHI instruction using a plain
Instruction pointer. Switch all the remaining sites that create and insert
PHIs to use iterators instead. For example, the code in
ComplexDeinterleavingPass.cpp is definitely at-risk of mixing PHIs and
debug-info.
To fix long compile time issue of Schedule optimizer, patch #77280 sets
the upper cap on max ISL operations. In case of bailing out when ISL
quota is hit, error handling behavior was restored manually. This commit
replaces the restoration code with IslMaxOperationsGuard helper and also
removes redundant early return.
Existing reduction detection algorithm does two types of memory checks
before marking a load store pair as reduction.
Second check is to verify there is no other memory access in ScopStmt
overlapping with the memory of load and store that forms the reduction.
Existing check misses cases where there could be probable overlap such
as
A[V] += A[P];
In the above case there is chance of overlap between A[V] and A[P] which
is missed.
This commit addresses this by removing the parameter from space before
checking for compatible space.
Part 1 of this patch :
[75297](https://github.com/llvm/llvm-project/pull/75297)
Polly currently uses `getDebugLoc` in a few places to produce diagnostic
output; this is correct when interacting with specific instructions, but
may be incorrect when dealing with instruction ranges if debug
intrinsics are included. As a general rule, the debug locations attached
to debug intrinsics may be misleading compared to the surrounding
instructions, and are not generally used for anything other than
determining variable scope info; the recommended approach is therefore
to use `getStableDebugLoc` instead, which skips over debug intrinsics.
This is necessary to fix test failures that occur when enabling
non-instruction debug info, which removes debug intrinsics from basic
blocks and thus alters the diagnostic output of Polly (despite causing
no functional change).
Existing reduction detection algorithm does two types of memory checks
before marking a load store pair as reduction.
First is to check if load and store are pointing to the same memory. This
check right now detects the following case as reduction. sum[0] = sum[1]
+ A[i]
This is because the check compares only base of the memory addresses
involved and not their indices. This patch addresses this issue and
introduces some debug prints. Added couple of test cases to verify the
functionality of patch as well.
This changes the AliasSetTracker to track memory locations instead of
pointers in its alias sets. The motivation for this is outlined in an RFC
posted on LLVM discourse:
https://discourse.llvm.org/t/rfc-dont-merge-memory-locations-in-aliassettracker/73336
In the data structures of the AST implementation, I made the choice to
replace the linked list of `PointerRec` entries (that had to go anyway)
with a simple flat vector of `MemoryLocation` objects, but for the
`AliasSet` objects referenced from a lookup table, I retained the
mechanism of a linked list, reference counting, forwarding, etc. The
data structures could be revised in a follow-up change.
There is no upper cap set on current Schedule Optimizer to compute
schedule. In some cases a very long compile time taken to compute the
schedule resulting in hang kind of behavior. This patch introduces a
flag 'polly-schedule-computeout' to pass the capwhich is initialized to
300000. This patch handles the compute out cases by bailing out and
exiting gracefully.
Fixed the test that failed in previous commit.
Fixes#69090
This reverts commit d6c4d4c9b910e8ad5ed7cd4825a143742041c1f4.
Broke buildldbots with asserts disabled; -debug-only is only available in
asserts builds.
There is no upper cap set on current Schedule Optimizer to compute
schedule. In some cases a very long compile time taken to compute the
schedule resulting in hang kind of behavior. This patch introduces a
flag 'polly-schedule-computeout' to pass the capwhich is initialized to
300000. This patch handles the compute out cases by bailing out and
exiting gracefully.
Fixes#69090
Currently there's no component for LLVMPolly and PollyISL, however
they are added to exports whether or not they are installed. This commit
calls add_llvm_install_targets in the add_polly_library function to
allow installation of LLVMPolly and PollyISL via distribution
components, so they can be installed without also installing libPolly.a.
Closes: https://github.com/llvm/llvm-project/pull/66598
Otherwise link may fail if user provided additional library to link with via CMAKE_EXE_LINKER_FLAGS. Concrete example is using custom allocator, LLVMSupport provides needed -lpthread in that case.
Closes: https://github.com/llvm/llvm-project/pull/65424
This patch pulls out the memory checks from the base reduction detection
algorithm. This is the first one in the reduction patch series, to
reduce the difference in future patches.
The header file has been deprecated since:
commit f09cf34d00625e57dea5317a3ac0412c07292148
Author: Archibald Elliott <archibald.elliott@arm.com>
Date: Tue Dec 20 10:24:02 2022 +0000
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.