A regular expression was used in the lexing process. It made the program
take more than linear time with regards to the length of the input. It
looked like the entire buffer could be scanned for every token lexed.
Now the regular expression is replaced with code. Previously it took 20
minutes for the program to format 125 000 lines of code on my computer.
Now it takes 315 milliseconds.
`ASTImporterLookupTable` did use the `getPrimaryContext` function to get
the declaration context of the inserted items. This is problematic
because the primary context can change during import of AST items, most
likely if a definition of a previously not defined class is imported.
(For any record the primary context is the definition if there is one.)
The use of primary context is really not important, only for namespaces
because these can be re-opened and lookup in one namespace block is not
enough. This special search is now moved into ASTImporter instead of
relying on the lookup table.
This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54.
This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:
lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
Part 2 (and final part) following
https://github.com/llvm/llvm-project/pull/120102
Allows users to do things like:
```
if (o->x.has_value()) {
((*o).x).value();
}
```
where the `->` and `*` are operator overload calls.
A user could instead extract the nested optional into a local variable
once instead of doing two accessor calls back to back, but currently
they are unsure why the code is flagged.
If we have a refutation Z3 query timed out (UNDEF), allow a couple of
retries to improve stability of the query. By default allow 2 retries,
which will give us in maximum of 3 solve attempts per query.
Retries should help mitigating flaky Z3 queries.
See the details in the following RFC:
https://discourse.llvm.org/t/analyzer-rfc-retry-z3-crosscheck-queries-on-timeout/83711
Note that with each attempt, we spend more time per query.
Currently, we have a 15 seconds timeout per query - which are also in
effect for the retry attempts.
---
Why should this help?
In short, retrying queries should bring stability because if a query
runs long
it's more likely that it did so due to some runtime anomaly than it's on
the edge of succeeding. This is because most queries run quick, and the
queries that run long, usually run long by a fair amount.
Consequently, retries should improve the stability of the outcome of the
Z3 query.
In general, the retries shouldn't increase the overall analysis time
because it's really rare we hit the 0.1% of the cases when we would do
retries. But keep in mind that the retry attempts can add up if many
retries are allowed, or the individual query timeout is large.
CPP-5920
Also, simplify the existing test for NamespaceMacros. Like the options
tested by the added tests, it's also a list of arbitrary strings and
initialized to an empty list. (The other existing tests for list of
strings either are initialized to a list of one or more strings or
require specific strings.)
It wraps the body of namespace with additional newlines, turning this code:
```
namespace N {
int function();
}
```
into the following:
```
namespace N {
int function();
}
```
---------
Co-authored-by: Owen Pan <owenpiano@gmail.com>
This fixes#101363 which is a resurrection of a previously opened but
never completed review: https://reviews.llvm.org/D11851
The feature is to allow code like the following not to be broken across
multiple lines:
```
namespace foo { class bar; }
namespace foo { namespace bar { class baz; } }
```
Code like this is commonly used for forward declarations, which are
ideally kept compact. This is also apparently the format that
include-what-you-use will insert for forward declarations.
Also, fix an off-by-one error in `CompactNamespaces` code. For nested
namespaces with 3 or more namespaces, it was incorrectly compacting
lines which were 1 or two spaces over the `ColumnLimit`, leading to
incorrect formatting results.
This is part 1 of caching for smart pointer accessors, building on top
of the CachedConstAccessorsLattice, which caches "normal" accessors.
Smart pointer accessors are a bit different in that they may:
- have aliases to access the same underlying data (but potentially
returning slightly different types like `&` vs `*`). Within a
"checked" sequence users may mix uses of the different aliases and the
check should apply to any of the spellings.
- may have non-const overloads in addition to the const version, where
the non-const doesn't actually modify the container
Part 2 will follow and add transfer functions utilities. It will also
add a user UncheckedOptionalAccessModel. We'd seen false positives when
nesting StatusOr<optional<T>> and optional<StatusOr<T>>, etc. which this
can help address.
Note that PointerUnion::get has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Discussion here:
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520/15?u=szelethus
The original patch, #97298 introduced new timeouts backed by thorough
testing and measurements to keep the running time of Z3 within
reasonable limits. The measurements also showed that only certain
reports and certain TUs were responsible for the poor performance of Z3
refutation.
Unfortunately, it seems like that on machines with different
characteristics (slower machines) the current timeouts don't just axe
0.01% of reports, but many more as well. Considering that timeouts are
inherently nondeterministic as a cutoff point, this lead reports sets
being vastly different on the same projects with the same configuration.
The discussion link shows that all configurations introduced in the
patch with their default values lead to severa nondeterminism of the
analyzer. As we, and others use the analyzer as a gating tool for PRs,
we should revert to the original defaults.
We should respect that
* There are still parts of the analyzer that are either proven or
suspected to contain nondeterministic code (like pointer sets),
* A 15s timeout is more likely to hit the same reports every time on a
wider range of machines, but is still inherently nondeterministic, but
an infinite timeout leads to the tool hanging,
* If you measure the performance of the analyzer on your machines, you
can and should achieve some speedup with little or no observable
nondeterminism.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
If 'GeneratedArgsStorage' ever grows, contained strings may get copied
and data pointers stored in 'GeneratedArgs' may become invalid, pointing
to deallocated memory.
Tuple-like types introduce `VarDecl`s in the AST for their "holding
vars", but AST visitors do not visit those. As a result the `VarDecl`
for the holding var is orphaned when trying to retreive its parents.
Fix a `FlowSensitive` test that assumes that only a `BindingDecl` is
introduced with the given name (the matcher now can also reach the
`VarDecl` for the holding var).
Reland https://github.com/llvm/llvm-project/pull/83237
---
(Original comments)
Currently all the specializations of a template (including
instantiation, specialization and partial specializations) will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.
This means basically we need to load every specializations for the
template once the template declaration got loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.
For example,
```
// M.cppm
export module M;
export template <class T>
class A {};
export class ShouldNotBeLoaded {};
export class Temp {
A<ShouldNotBeLoaded> AS;
};
// use.cpp
import M;
A<int> a;
```
We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then we will deserialize
`ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this
patch tries to avoid that.
Given that the templates are heavily used in C++, this is a pain point
for the performance.
This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We made that by using ODRHash for the template
arguments as the key of the hash table.
To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.
Move the checks related to breaking before right braces and right parens
earlier to avoid conflicting checks that prevent breaking based on the
left-hand token. This allows properly formatting declarations with
pointers and references at a minimum.
Currently all the specializations of a template (including
instantiation, specialization and partial specializations) will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.
This means basically we need to load every specializations for the
template once the template declaration got loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.
For example,
```
// M.cppm
export module M;
export template <class T>
class A {};
export class ShouldNotBeLoaded {};
export class Temp {
A<ShouldNotBeLoaded> AS;
};
// use.cpp
import M;
A<int> a;
```
We should a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then we will deserialize
`ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this
patch tries to avoid that.
Given that the templates are heavily used in C++, this is a pain point
for the performance.
This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We made that by using ODRHash for the template
arguments as the key of the hash table.
To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.
The patch was reviewed in
https://github.com/llvm/llvm-project/pull/83237 but that PR is a stacked
PR. But I feel the intention of the stacked PRs get lost during the
review process. So I feel it is better to merge the commits into a
single commit instead of merging them in the PR page. It is better for
us to cherry-pick and revert.
The Lexer used in getRawToken is not told to keep whitespace, so when it
skips over escaped newlines, it also ignores whitespace, regardless of
getRawToken's IgnoreWhiteSpace parameter.
Instead of letting this case fall through to lexing, check
for whitespace after skipping over any escaped newlines.
This reverts commit 54db16221c92eb52efbea90ad5b5d2a1d00cda3e.
Check for existence of __SIZOEF_INT128__ so we don't run those
tests on targets that don't have int128.