Create the `ReportedErrors` class to track the number of reported errors
during verification. The class will block reporting errors if some other
thread is currently reporting an error.
I've encountered a case where there were many different verifications
reporting errors at the same time on different threads. This ensures
that we don't start printing the error from one case until we are
completely done printing errors from other cases. Most of the time
`AbortOnError = true` so we usually abort after reporting the first
error.
Depends on https://github.com/llvm/llvm-project/pull/111602.
This patch implements `Interval::comesBefore(const Interval &Other)`
which returns true if this interval is strictly before Other in program
order. The function asserts that the intervals are disjoint.
This patch reapplies #111173, fixing a bug when instantiating dependent
expressions that name a member template that is later explicitly
specialized for a class specialization that is implicitly instantiated.
The bug is addressed by adding the `hasMemberSpecialization` function,
which return `true` if _any_ redeclaration is a member specialization.
This is then used when determining the instantiation pattern for a
specialization of a template, and when collecting template arguments for
a specialization of a template.
Consider #109148:
```c++
template <typename ...Ts>
void f() {
[] {
(^Ts);
};
}
```
When we encounter `^Ts`, we try to parse a block and subsequently call
`DiagnoseUnexpandedParameterPack()` (in `ActOnBlockArguments()`), which
sees `Ts` and sets `ContainsUnexpandedParameterPack` to `true` in the
`LambdaScopeInfo` of the enclosing lambda. However, the entire block is
subsequently discarded entirely because it isn’t even syntactically
well-formed. As a result, `ContainsUnexpandedParameterPack` is `true`
despite the lambda’s body no longer containing any unexpanded packs,
which causes an assertion the next time
`DiagnoseUnexpandedParameterPack()` is called.
This pr moves handling of unexpanded parameter packs into
`CapturingScopeInfo` instead so that the same logic is used for both
blocks and lambdas. This fixes this issue since the
`ContainsUnexpandedParameterPack` flag is now part of the block (and
before that, its `CapturingScopeInfo`) and no longer affects the
surrounding lambda directly when the block is parsed. Moreover, this
change makes blocks actually usable with pack expansion.
This fixes#109148.
It checks that a Region can have non-contiguous instructions, and that
when iterating through it you don't get the instructions in-between that
aren't part of the Region.
This commit adds parsing of type modifiers for the MAP clause: CLOSE,
OMPX_HOLD, and PRESENT. The support for ALWAYS has already existed.
The new modifiers are not yet handled in lowering: when present, a TODO
message is emitted and compilation stops.
Use a word boundary, current code was currently failing when parsing the
definition of because it would also match
`CooperativeMatrixOp` from a later mention of
`SPIRV_KHR_CooperativeMatrixOperandsAttr`.
The 'gang' clause is used to specify parallel execution of loops, thus
has some complicated rules depending on the 'loop's associated compute
construct. This patch implements all of those.
Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate
sections that would be discarded by GC.
Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return
false (when Defined) and then return true (been demoted to Undefined).
```
addScriptReferencedSymbolsToSymTable
shouldAddProvideSym(f1): false
// the RHS (bar) is not added to `referencedSymbols` and may be GCed
declareSymbols
shouldAddProvideSym(f1): false
markLive
demoteSymbolsAndComputeIsPreemptible
// demoted f1 to Undefined
processSymbolAssignments
addSymbol
shouldAddProvideSym(f1): true
```
The inconsistency can cause `cmd->expression()` in `addSymbol` to be
evaluated, leading to `symbol not found: bar` errors (since `bar` in the
RHS is not in `referencedSymbols` and is GCed) (#111478).
Fix this by adding a `sym->isUsedInRegularObj` condition, making
`shouldAddProvideSym(f1)` values consistent. In addition, we need a
`sym->exportDynamic` condition to keep provide-shared.s working.
Fixes: ebb326a51fec37b5a47e5702e8ea157cd4f835cd
Pull Request: https://github.com/llvm/llvm-project/pull/111945
Add support for specializing linalg.broadcast and linalg.transform from generic. Also, does some refactoring to reuse specialization checks, migrating some common uses to op interface methods.
Currently we will not be able to inline a large function even if it only
has one live use because the inline cost is still very high after
applying `LastCallToStaticBonus`, which is a constant. This could
significantly impact the performance because CSR spill is very
expensive.
This PR adds a new function `getInliningLastCallToStaticBonus` to TTI to
allow targets to customize this value.
Fixes SWDEV-471398.
Fixes https://github.com/llvm/llvm-project/issues/60942: IEEE semantics
is likely what many frontends want (it definitely is what Rust wants),
and it is what LLVM passes already assume when they use APFloat to
propagate float operations.
This does not reflect what happens on x87, but what happens there is
just plain unsound (https://github.com/llvm/llvm-project/issues/89885,
https://github.com/llvm/llvm-project/issues/44218); there is no coherent
specification that will describe this behavior correctly -- the backend
in combination with standard LLVM passes is just fundamentally buggy in
a hard-to-fix-way.
There's also the questions around flushing subnormals to zero, but [this
discussion](https://discourse.llvm.org/t/questions-about-llvm-canonicalize/79378)
seems to indicate a general stance of: this is specific non-standard
hardware behavior, and generally needs LLVM to be told that basic float
ops do not return the standard result. Just naively running
LLVM-compiled code on hardware configured to flush subnormals will lead
to #89885-like issues.
AFAIK this is also what Alive2 implements (@nunoplopes please correct me
if I am wrong).
The concept of a 'program point' in the original data flow framework is
ambiguous. It can refer to either an operation or a block itself. This
representation has different interpretations in forward and backward
data-flow analysis. In forward data-flow analysis, the program point of
an operation represents the state after the operation, while in backward
data flow analysis, it represents the state before the operation. When
using forward or backward data-flow analysis, it is crucial to carefully
handle this distinction to ensure correctness.
This patch refactors the definition of program point, unifying the
interpretation of program points in both forward and backward data-flow
analysis.
How to integrate this patch?
For dense forward data-flow analysis and other analysis (except dense
backward data-flow analysis), the program point corresponding to the
original operation can be obtained by `getProgramPointAfter(op)`, and
the program point corresponding to the original block can be obtained by
`getProgramPointBefore(block)`.
For dense backward data-flow analysis, the program point corresponding
to the original operation can be obtained by
`getProgramPointBefore(op)`, and the program point corresponding to the
original block can be obtained by `getProgramPointAfter(block)`.
NOTE: If you need to get the lattice of other data-flow analyses in
dense backward data-flow analysis, you should still use the dense
forward data-flow approach. For example, to get the Executable state of
a block in dense backward data-flow analysis and add the dependency of
the current operation, you should write:
``getOrCreateFor<Executable>(getProgramPointBefore(op),
getProgramPointBefore(block))``
In case above, we use getProgramPointBefore(op) because the analysis we
rely on is dense backward data-flow, and we use
getProgramPointBefore(block) because the lattice we query is the result
of a non-dense backward data flow computation.
related dsscussion:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
corresponding PSA:
https://discourse.llvm.org/t/psa-program-point-semantics-change/81479
The purpose of this optimization is to make the VL argument, for
instructions that have a VL argument, as small as possible. This is
implemented by visiting each instruction in reverse order and checking
that if it has a VL argument, whether the VL can be reduced.
By putting this pass before VSETVLI insertion, we see three kinds of
changes to generated code:
1. Eliminate VSETVLI instructions
2. Reduce the VL toggle on VSETVLI instructions that also change vtype
3. Reduce the VL set by a VSETVLI instruction
The list of supported instructions is currently whitelisted for safety.
In the future, we could add more instructions to `isSupportedInstr` to
support even more VL optimization.
We originally wrote this pass because vector GEP instructions do not
take a VL, which leads us to emit code that uses VL=VLMAX to implement
GEP in the RISC-V backend. As a result, some of the vector instructions
will write to lanes, specifically between the intended VL and VLMAX,
that will never be read. As an alternative to this pass, we considered
adding a vector predicated GEP instruction, but this would not fit well
into the intrinsic type system since GEP has a variable number of
arguments, each with arbitrary types. The second approach we considered
was to put this pass after VSETVLI insertion, but we found that it was
more difficult to recognize optimization opportunities, especially
across basic block boundaries -- the data flow analysis was also a bit
more expensive and complex.
While this pass solves the GEP problem, we have expanded it to handle
more cases of VL optimization, and there is opportunity for the analysis
to be improved to enable even more optimization. We have a few follow up
patches to post, but figured this would be a good start.
---------
Co-authored-by: Craig Topper <craig.topper@sifive.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
This improves the CI output by providing collapsable sections for
sub-parts of our build.
This was originally opened as #75233.
Co-authored-by: eric <eric@efcs.ca>
Add a new enumeration `SuppressInlineNamespaceMode` to `PrintingPolicy` that
is explicit about how to handle inline namespaces. `SuppressInlineNamespace`
uses that enumeration now instead of a Boolean value.
Specializing a template from an inline namespace should be transparent.
For instance
```
namespace foo {
inline namespace v1 {
template<typename A>
void function(A&);
}
}
namespace foo {
template<>
void function<int>(int&);
}
```
`hasName` should match both declarations of `foo::function`.
Makes the behavior of `matchesNodeFullSlow` and `matchesNodeFullFast`
consistent, fixing an assert inside `HasNameMatcher::matchesNode`.
Lowering fixed-size BUILD_VECTORS without Neon may introduce stack
spills, leading to more stores/reloads than if the stores were not
merged. In some cases, it can also prevent using paired store
instructions.
In the future, we may want to relax when SVE is available, but
currently, the SVE lowerings for BUILD_VECTOR are limited to a few
specific cases.
Fixes 0e913237871e8c9290e82be30be8b3484952eee0 /
https://github.com/llvm/llvm-project/pull/111531
For reasons I can't explain, a clean build works fine for me, and all
the bots are working fine. But if I rebuild in some way the make tool
becomes None.
Looking at the other variables, they had these extra lines so I've added
those for make and it seems to solve the problem.
It would be nice to see what our users think about this change, as this
is something that WG21/EWG quite wants to fix a handful of questionable
issues with UB. Depending on the outcome of this after being committed,
we might instead suggest EWG undeprecate this, and require a bit of
'magic' from the lexer.
Additionally, this patch makes it so we emit this diagnostic ALSO in
cases where the literal name is reserved. It doesn't make sense to limit
that.
---------
Co-authored-by: Vlad Serebrennikov <serebrennikov.vladislav@gmail.com>