21 Commits

Author SHA1 Message Date
Martin Braenne
9ecdbe3855 [clang][dataflow] Rename AggregateStorageLocation to RecordStorageLocation and StructValue to RecordValue.
- Both of these constructs are used to represent structs, classes, and unions;
  Clang uses the collective term "record" for these.

- The term "aggregate" in `AggregateStorageLocation` implies that, at some
  point, the intention may have been to use it also for arrays, but it don't
  think it's possible to use it for arrays. Records and arrays are very
  different and therefore need to be modeled differently. Records have a fixed
  set of named fields, which can have different type; arrays have a variable
  number of elements, but they all have the same type.

- Futhermore, "aggregate" has a very specific meaning in C++
  (https://en.cppreference.com/w/cpp/language/aggregate_initialization).
  Aggregates of class type may not have any user-declared or inherited
  constructors, no private or protected non-static data members, no virtual
  member functions, and so on, but we use `AggregateStorageLocations` to model all objects of class type.

In addition, for consistency, we also rename the following:

- `getAggregateLoc()` (in `RecordValue`, formerly known as `StructValue`) to
  simply `getLoc()`.

- `refreshStructValue()` to `refreshRecordValue()`

We keep the old names around as deprecated synonyms to enable clients to be migrated to the new names.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D156788
2023-08-01 20:29:40 +00:00
Martin Braenne
1b334a2ae7 [clang][dataflow] Eliminate ReferenceValue.
There are no remaining uses of this class in the framework.

This patch is part of the ongoing migration to strict handling of value categories (see https://discourse.llvm.org/t/70086 for details).

Reviewed By: ymandel, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D155922
2023-07-27 13:14:47 +00:00
Sam McCall
fc9821a877 Reland "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.

Test problems were due to unspecified order of function arg evaluation.

Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)"

This properly frees the Value hierarchy from managing boolean formulas.

We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.

We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.

For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.

The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.

The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.

Differential Revision: https://reviews.llvm.org/D153469
2023-07-07 11:58:33 +02:00
Sam McCall
2d8cd19512 Revert "Reland "[dataflow] Add dedicated representation of boolean formulas" and followups
These changes are OK, but they break downstream stuff that needs more time to adapt :-(

This reverts commit 71579569f4399d3cf6bc618dcd449b5388d749cc.
This reverts commit 5e4ad816bf100b0325ed45ab1cfea18deb3ab3d1.
This reverts commit 1c3ac8dfa16c42a631968aadd0396cfe7f7778e0.
2023-07-05 14:32:25 +02:00
Sam McCall
5e4ad816bf [dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)
This properly frees the Value hierarchy from managing boolean formulas.

We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.

We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.

For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.

The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.

The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.

Differential Revision: https://reviews.llvm.org/D153469
2023-07-05 13:54:32 +02:00
Sam McCall
1c3ac8dfa1 Reland "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.

Test problems were due to unspecified order of function arg evaluation.
2023-07-05 13:35:16 +02:00
Tom Weaver
7a72ce9822 Revert "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit 2fd614efc1bb9c27f1bc6c3096c60a7fe121e274.

Commit caused failures on the following two build bots:
  http://45.33.8.238/win/80815/step_7.txt
  https://lab.llvm.org/buildbot/#/builders/139/builds/44269
2023-07-04 14:05:54 +01:00
Sam McCall
2fd614efc1 [dataflow] Add dedicated representation of boolean formulas
This is the first step in untangling the two current jobs of BoolValue.

=== Desired end-state: ===

- BoolValue will model C++ booleans e.g. held in StorageLocations.
  this includes describing uncertainty (e.g. "top" is a Value concern)
- Formula describes analysis-level assertions in terms of SAT atoms.

These can still be linked together: a BoolValue may have a corresponding
SAT atom which is constrained by formulas.

=== Done in this patch: ===

BoolValue is left intact, Formula is just the input type to the
SAT solver, and we build formulas as needed to invoke the solver.

=== Incidental changes to debug string printing: ===

- variables renamed from B0 etc to V0 etc
  B0 collides with the names of basic blocks, which is confusing when
  debugging flow conditions.
- debug printing of formulas (Formula and Atom) uses operator<<
  rather than debugString(), so works with gtest.
  Therefore moved out of DebugSupport.h
- Did the same to Solver::Result, and some helper changes to SolverTest,
  so that we get useful messages on unit test failures
- formulas are now printed as infix expressions on one line, rather than
  wrapped/indented S-exprs. My experience is that this is easier to scan
  FCs for small examples, and large ones are unreadable either way.
- most of the several debugString() functions for constraints/results
  are unused, so removed them rather than updating tests.
  Inlined the one that was actually used into its callsite.

Differential Revision: https://reviews.llvm.org/D153366
2023-07-04 12:19:44 +02:00
Sam McCall
d85c233e4b [dataflow] Make SAT solver deterministic
The SAT solver imported its constraints by iterating over an unordered DenseSet.
The path taken, and ultimately the runtime, the specific solution found, and
whether it would time out or complete could depend on the iteration order.

Instead, have the caller specify an ordered collection of constraints.
If this is built in a deterministic way, the system can have deterministic
behavior.
(The main alternative is to sort the constraints by value, but this option
is simpler today).

A lot of nondeterminism still appears to be remain in the framework, so today
the solver's inputs themselves are not deterministic yet.

Differential Revision: https://reviews.llvm.org/D153584
2023-06-26 21:16:25 +02:00
Sam McCall
d630134f1f [dataflow] handle missing case in value debug strings
Differential Revision: https://reviews.llvm.org/D146625
2023-03-25 04:20:06 +01:00
Fangrui Song
1e6adbadc7 [clang] llvm::Optional::value => operator*/operator->
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.

This makes `ninja check-clang` work in the absence of llvm::Optional::value.
2022-12-17 08:10:45 +00:00
Yitzhak Mandelbaum
39b9d4f188 [clang][dataflow] Add support for a Top value in boolean formulas.
Currently, our boolean formulas (`BoolValue`) don't form a lattice, since they
have no Top element. This patch adds such an element, thereby "completing" the
built-in model of bools to be a proper semi-lattice. It still has infinite
height, which is its own problem, but that can be solved separately, through
widening and the like.

Patch 1 for Issue #56931.

Differential Revision: https://reviews.llvm.org/D135397
2022-10-14 17:41:53 +00:00
Wei Yi Tee
dbb95c2a85 [clang][dataflow] Debug string for value kinds.
Differential Revision: https://reviews.llvm.org/D131891
2022-08-19 15:00:01 +00:00
Dmitri Gribenko
b5e3dac33d [clang][dataflow] Add explicit "AST" nodes for implications and iff
Previously we used to desugar implications and biconditionals into
equivalent CNF/DNF as soon as possible. However, this desugaring makes
debug output (Environment::dump()) less readable than it could be.
Therefore, it makes sense to keep the sugared representation of a
boolean formula, and desugar it in the solver.

Reviewed By: sgatev, xazax.hun, wyt

Differential Revision: https://reviews.llvm.org/D130519
2022-07-26 14:19:22 +02:00
Dmitri Gribenko
b5414b566a [clang][dataflow] Add DataflowEnvironment::dump()
Start by dumping the flow condition.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D130398
2022-07-23 01:31:53 +02:00
Dmitri Gribenko
ee6aba85aa [clang][dataflow] Expose stringification functions for SAT solver enums
Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D130399
2022-07-23 01:21:20 +02:00
Dmitri Gribenko
589ddd7fe8 [clang][dataflow] ArrayRef'ize debugString()
Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D130400
2022-07-23 01:16:31 +02:00
Fangrui Song
3c849d0aef Modernize Optional::{getValueOr,hasValue} 2022-07-15 01:20:39 -07:00
Kazu Hirata
cb2c8f694d [clang] Use value instead of getValue (NFC) 2022-07-13 23:39:33 -07:00
Wei Yi Tee
b8d83e8004 [clang][dataflow] Generate readable form of input and output of satisfiability checking.
Differential Revision: https://reviews.llvm.org/D129548
2022-07-13 11:58:51 +00:00
Wei Yi Tee
c9666d2339 [clang][dataflow] Generate readable form of boolean values.
Differential Revision: https://reviews.llvm.org/D129547
2022-07-13 10:35:17 +00:00