55 Commits

Author SHA1 Message Date
Yitzhak Mandelbaum
264976d98e [clang][dataflow] Unify TransferOptions and DataflowAnalysisContext::Options.
Merges `TransferOptions` into the newly-introduced
`DataflowAnalysisContext::Options` and removes explicit parameter for
`TransferOptions`, relying instead on the common options carried by the analysis
context. Given that there was no intent to allow different options between calls
to `transfer`, a common value for the options is preferable.

Differential Revision: https://reviews.llvm.org/D140703
2023-01-10 14:17:25 +00:00
Yitzhak Mandelbaum
01ccf7b3ce Revert "Revert "[clang][dataflow] Only model struct fields that are used in the function being analyzed.""
This reverts commit 2b1a517a92bfdfa3b692a660e19a2bb22513a567. It's a fix forward
with two memory errors fixed, one of which was the cause of the build breakage
in the buildbots.

Original message:

Previously, the model for structs modeled all fields in a struct when
`createValue` was called for that type. This patch adds a prepass on the
function under analysis to discover the fields referenced in the scope and then
limits modeling to only those fields. This reduces wasted memory usage
(modeling unused fields) which can be important for programs that use large
structs.

Note: This patch obviates the need for https://reviews.llvm.org/D123032.
2023-01-09 19:32:10 +00:00
Yitzhak Mandelbaum
2b1a517a92 Revert "[clang][dataflow] Only model struct fields that are used in the function being analyzed."
This reverts commit 5e8f597c2fedc740b71f07dfdb1ef3c2d348b193. It caused msan and ubsan breakages.
2023-01-06 01:07:28 +00:00
Yitzhak Mandelbaum
5e8f597c2f [clang][dataflow] Only model struct fields that are used in the function being analyzed.
Previously, the model for structs modeled all fields in a struct when
`createValue` was called for that type. This patch adds a prepass on the
function under analysis to discover the fields referenced in the scope and then
limits modeling to only those fields.  This reduces wasted memory usage
(modeling unused fields) which can be important for programss that use large
structs.

Note: This patch obviates the need for https://reviews.llvm.org/D123032.

Differential Revision: https://reviews.llvm.org/D140694
2023-01-05 21:46:39 +00:00
Dani Ferreira Franco Moura
d862f66221 [clang][dataflow] Treat unions as structs.
This is a straightfoward way to handle unions in dataflow analysis. Without this change, nullability verification crashes on files that contain unions.

Reviewed By: gribozavr2, ymandel

Differential Revision: https://reviews.llvm.org/D140696
2023-01-03 18:36:24 +00:00
Yitzhak Mandelbaum
38404df9d8 [clang][dataflow] Fix bug in handling of return statements.
The handling of return statements, added in support of context-sensitive
analysis, has a bug relating to functions that return reference
types. Specifically, interpretation of such functions can result in a crash from
a bad cast. This patch fixes the bug and guards all of that code with the
context-sensitive option, since there's no reason to execute at all when
context-sensitive analysis is off.

Differential Revision: https://reviews.llvm.org/D140430
2022-12-22 14:42:17 +00:00
Yitzhak Mandelbaum
ef4635452f [clang][dataflow] Add support for structured bindings of tuple-like types.
This patch adds interpretation of binding declarations resulting from a
structured binding (`DecompositionDecl`) to a tuple-like type. Currently, the
framework only supports binding to a struct.

Fixes issue #57252.

Differential Revision: https://reviews.llvm.org/D139544
2022-12-09 18:58:00 +00:00
Yitzhak Mandelbaum
7da087974f [clang][dataflow][NFC] Fix reachability warning.
Some compilers can't determine that all cases of the switch return (or are
unreachable) and warn about control reaching end of non-void
function. Explicitly mark with `llvm_unreachable`.

Differential Revision: https://reviews.llvm.org/D135978
2022-10-14 19:35:11 +00:00
Yitzhak Mandelbaum
39b9d4f188 [clang][dataflow] Add support for a Top value in boolean formulas.
Currently, our boolean formulas (`BoolValue`) don't form a lattice, since they
have no Top element. This patch adds such an element, thereby "completing" the
built-in model of bools to be a proper semi-lattice. It still has infinite
height, which is its own problem, but that can be solved separately, through
widening and the like.

Patch 1 for Issue #56931.

Differential Revision: https://reviews.llvm.org/D135397
2022-10-14 17:41:53 +00:00
Sam Estep
2efc8f8d65 [clang][dataflow] Add an option for context-sensitive depth
This patch adds a `Depth` field (default value 2) to `ContextSensitiveOptions`, allowing context-sensitive analysis of functions that call other functions. This also requires replacing the `DeclCtx` field on `Environment` with a `CallString` field that contains a vector of decl contexts, to ensure that the analysis doesn't try to analyze recursive or mutually recursive calls (which would result in a crash, due to the way we handle `StorageLocation`s).

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D131809
2022-08-15 19:58:40 +00:00
Sam Estep
b3f1a6bf10 [clang][dataflow] Encode options using llvm::Optional
This patch restructures `DataflowAnalysisOptions` and `TransferOptions` to use `llvm::Optional`, in preparation for adding more sub-options to the `ContextSensitiveOptions` struct introduced here.

Reviewed By: sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131779
2022-08-12 16:29:41 +00:00
Sam Estep
eb91fd5cbc [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-11 12:46:20 +00:00
Evgenii Stepanov
7587065043 Revert "[clang][dataflow] Analyze constructor bodies"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 000c8fef86abb7f056cbea2de99f21dca4b81bf8.
2022-08-10 14:21:56 -07:00
Sam Estep
000c8fef86 [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-10 14:01:45 +00:00
Sam Estep
0eaecbbc23 [clang][dataflow] Handle return statements
This patch adds a `ReturnLoc` field to the `Environment`, serving a similar to the `ThisPointeeLoc` field in the `DataflowAnalysisContext`. It then uses that (along with a new `VisitReturnStmt` method in `TransferVisitor`) to handle non-`void`-returning functions in context-sensitive analysis.

Reviewed By: ymandel, sgatev

Differential Revision: https://reviews.llvm.org/D130600
2022-08-04 17:42:19 +00:00
Yitzhak Mandelbaum
692e03039d [clang][dataflow] Add cache of ControlFlowContexts for function decls.
This patch modifies context-sensitive analysis of functions to use a cache,
rather than recreate the `ControlFlowContext` from a function decl on each
encounter. However, this is just step 1 (of N) in adding support for a
configurable map of "modeled" function decls (see issue #56879). The map will go
from the actual function decl to the `ControlFlowContext` used to model it. Only
functions pre-configured in the map will be modeled in a context-sensitive way.

We start with a cache because it introduces the desired map, while retaining the
current behavior. Here, functions are mapped to their actual implementations
(when available).

Differential Revision: https://reviews.llvm.org/D131039
2022-08-03 15:17:49 +00:00
Sam Estep
a6ddc68487 [clang][dataflow] Handle multiple context-sensitive calls to the same function
This patch enables context-sensitive analysis of multiple different calls to the same function (see the `ContextSensitiveSetBothTrueAndFalse` example in the `TransferTest` suite) by replacing the `Environment` copy-assignment with a call to the new `popCall` method, which  `std::move`s some fields but specifically does not move `DeclToLoc` and `ExprToLoc` from the callee back to the caller.

To enable this, the `StorageLocation` for a given parameter needs to be stable across different calls to the same function, so this patch also improves the modeling of parameter initialization, using `ReferenceValue` when necessary (for arguments passed by reference).

This approach explicitly does not work for recursive calls, because we currently only plan to use this context-sensitive machinery to support specialized analysis models we write, not analysis of arbitrary callees.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130726
2022-07-29 19:40:19 +00:00
Sam Estep
300fbf56f8 [clang][dataflow] Analyze calls to in-TU functions
This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `FIXME` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:54:27 +00:00
Sam Estep
cc9aa157a8 Revert "[clang][dataflow] Analyze calls to in-TU functions"
This reverts commit fa2b83d07ecab3b24b4c5ee2e7dc4b6bbc895317.
2022-07-26 17:30:09 +00:00
Sam Estep
fa2b83d07e [clang][dataflow] Analyze calls to in-TU functions
Depends On D130305

This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) first calls `initGlobalVars`, then maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `TODO` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:27:19 +00:00
Dmitri Gribenko
c0c9d717df [clang][dataflow] Rename iterators from IT to It
The latter way to abbreviate is a lot more common in the LLVM codebase.

Reviewed By: sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D130423
2022-07-25 20:28:47 +02:00
Wei Yi Tee
b611376e7e [clang][dataflow] Singleton pointer values for null pointers.
When a `nullptr` is assigned to a pointer variable, it is wrapped in a `ImplicitCastExpr` with cast kind `CK_NullTo(Member)Pointer`. This patch assigns singleton pointer values representing null to these expressions.

For each pointee type, a singleton null `PointerValue` is created and stored in the `NullPointerVals` map of the `DataflowAnalysisContext` class. The pointee type is retrieved from the implicit cast expression, and used to initialise the `PointeeLoc` field of the `PointerValue`. The `PointeeLoc` created is not mapped to any `Value`, reflecting the absence of value indicated by null pointers.

Reviewed By: gribozavr2, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D128056
2022-06-27 14:17:34 +02:00
Stanislav Gatev
ba53906cef [clang][dataflow] Add support for comma binary operator
Add support for comma binary operator.

Differential Revision: https://reviews.llvm.org/D128013

Reviewed-by: ymandel, xazax.hun
2022-06-17 17:48:21 +00:00
Stanislav Gatev
0e286b77cf [clang][dataflow] Add transfer functions for structured bindings
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Differential Revision: https://reviews.llvm.org/D120495

Reviewed-by: ymandel, xazax.hun
2022-06-02 08:02:26 +00:00
Eric Li
33b598a808 [clang][dataflow] Relax assert on existence of this pointee storage
Support for unions is incomplete (per 99f7d55e) and the `this` pointee
storage location is not set for unions. The assert in
`VisitCXXThisExpr` is then guaranteed to trigger when analyzing member
functions of a union.

This commit changes the assert to an early-return. Any expression may
be undefined, and so having a value for the `CXXThisExpr` is not a
postcondition of the transfer function.

Differential Revision: https://reviews.llvm.org/D126405
2022-05-25 20:58:02 +00:00
Eric Li
5bbef2e3ff [clang][dataflow] Fix double visitation of nested logical operators
Sub-expressions that are logical operators are not spelled out
separately in basic blocks, so we need to manually visit them when we
encounter them. We do this in both the `TerminatorVisitor`
(conditionally) and the `TransferVisitor` (unconditionally), which can
cause cause an expression to be visited twice when the binary
operators are nested 2+ times.

This changes the visit in `TransferVisitor` to check if it has been
evaluated before trying to visit the sub-expression.

Differential Revision: https://reviews.llvm.org/D125821
2022-05-17 20:28:48 +00:00
Eric Li
45643cfcc1 [clang][dataflow] Centralize expression skipping logic
A follow-up to 62b2a47 to centralize the logic that skips expressions
that the CFG does not emit. This allows client code to avoid
sprinkling this logic everywhere.

Add redirects in the transfer function to similarly skip such
expressions by forwarding the visit to the sub-expression.

Differential Revision: https://reviews.llvm.org/D124965
2022-05-05 20:28:11 +00:00
Eric Li
62b2a47a9f [clang][dataflow] Only skip ExprWithCleanups when visiting terminators
`IgnoreParenImpCasts` will remove implicit casts to bool
(e.g. `PointerToBoolean`), such that the resulting expression may not
be of the `bool` type. The `cast_or_null<BoolValue>` in
`extendFlowCondition` will then trigger an assert, as the pointer
expression will not have a `BoolValue`.

Instead, we only skip `ExprWithCleanups` and `ParenExpr` nodes, as the
CFG does not emit them.

Differential Revision: https://reviews.llvm.org/D124807
2022-05-04 15:31:49 +00:00
Yitzhak Mandelbaum
eb2131bdba [clang][dataflow] Do not crash on missing Value for struct-typed variable init.
Remove constraint that an initializing expression of struct type must have an
associated `Value`. This invariant is not and will not be guaranteed by the
framework, because of potentially uninitialized fields.

Differential Revision: https://reviews.llvm.org/D123961
2022-04-19 20:52:29 +00:00
Yitzhak Mandelbaum
d002495b94 [clang][dataflow] Support integral casts
Adds support for implicit casts `CK_IntegralCast` and `CK_IntegralToBoolean`.

Differential Revision: https://reviews.llvm.org/D123037
2022-04-05 13:55:32 +00:00
Yitzhak Mandelbaum
506ec85ba8 [clang][dataflow] Add support for clang's __builtin_expect.
This patch adds basic modeling of `__builtin_expect`, just to propagate the
(first) argument, making the call transparent.

Driveby: adds tests for proper handling of other builtins.

Differential Revision: https://reviews.llvm.org/D122908
2022-04-04 12:20:43 +00:00
Yitzhak Mandelbaum
ef1e1b3106 [clang][dataflow] Add support for (built-in) (in)equality operators
Adds logical interpretation of built-in equality operators, `==` and `!=`.s

Differential Revision: https://reviews.llvm.org/D122830
2022-04-01 17:13:21 +00:00
Stanislav Gatev
b000b7705a [clang][dataflow] Model the behavior of non-standard optional assignment
Model nullopt, value, and conversion assignment operators.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D121863
2022-03-17 17:11:12 +00:00
Stanislav Gatev
092a530ca1 [clang][dataflow] Model the behavior of non-standard optional constructors
Model nullopt, inplace, value, and conversion constructors.

Reviewed-by: ymandel, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D121602
2022-03-15 08:13:13 +00:00
Stanislav Gatev
cf63e9d4ca [clang][dataflow] Add support for nested composite bool expressions
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Differential Revision: https://reviews.llvm.org/D121455
2022-03-14 17:18:30 +00:00
Stanislav Gatev
1e5715857a [clang][dataflow] Extend flow conditions from block terminators
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120984
2022-03-07 17:50:44 +00:00
Stanislav Gatev
03dff12197 Revert "Revert "[clang][dataflow] Add support for global storage values""
This reverts commit 169e1aba55bed9f7ffa000f9f170ab2defbc40b2.

It also fixes an incorrect assumption in `initGlobalVars`.
2022-02-23 13:57:34 +00:00
Stanislav Gatev
169e1aba55 Revert "[clang][dataflow] Add support for global storage values"
This reverts commit 7ea103de140b59a64fc884fa90afd2213619384d.
2022-02-23 10:32:17 +00:00
Stanislav Gatev
7ea103de14 [clang][dataflow] Add support for global storage values
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120149
2022-02-23 08:27:58 +00:00
Stanislav Gatev
a480841566 Add missing break statement in switch. 2022-02-17 09:37:02 +00:00
Stanislav Gatev
dd4dde8d39 [clang][dataflow] Add transfer functions for logical and, or, not.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D119953
2022-02-17 09:09:59 +00:00
Stanislav Gatev
75c22b382f [clang][dataflow] Add a transfer function for CXXBoolLiteralExpr
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D118236
2022-01-26 15:33:00 +00:00
Stanislav Gatev
64ba462b6e [clang][dataflow] Add a transfer function for InitListExpr
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D118119
2022-01-25 16:28:15 +00:00
Stanislav Gatev
8e53ae3d37 [clang][dataflow] Add a transfer function for conditional operator
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D117667
2022-01-19 16:25:05 +00:00
Stanislav Gatev
acd4b03590 Revert "Revert "[clang][dataflow] Add a test to justify skipping past references in UO_Deref""
This reverts commit a0262043bb87fdef68c817722de320a5dd9eb9c9.

Add the -fno-delayed-template-parsing arg to fix the failing test on Windows.
2022-01-19 10:00:01 +00:00
Stanislav Gatev
a0262043bb Revert "[clang][dataflow] Add a test to justify skipping past references in UO_Deref"
This reverts commit 68226e572f41105446413b12ee95ab5540b2b6ac.
2022-01-19 06:46:37 +00:00
Stanislav Gatev
68226e572f [clang][dataflow] Add a test to justify skipping past references in UO_Deref
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D117567
2022-01-18 21:27:43 +00:00
Stanislav Gatev
59e031ff90 [clang][dataflow] Add transfer function for addrof
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D117496
2022-01-18 11:23:08 +00:00
Stanislav Gatev
782eced561 [clang][dataflow] Replace initValueInStorageLocation with createValue
Since Environment's setValue method already does part of the work that
initValueInStorageLocation does, we can factor out a new createValue
method to reduce the duplication.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D117493
2022-01-18 07:09:35 +00:00
Stanislav Gatev
37e6496c80 [clang][dataflow] Add transfer functions for bind temporary and static cast
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Differential Revision: https://reviews.llvm.org/D117339
2022-01-16 17:41:02 +00:00