`MeasureTokenLength` may return an unsigned 0 representing failure in
obtaining length of a token. The analysis now gives up on such cases.
Otherwise, there might be issues caused by unsigned integer "overflow".
Do not warn when a constant sized array is indexed with an expression
that contains bitwise and operation
involving constants and it always results in a bound safe access.
(rdar://136684050)
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
EvaluateAsConstExpr() can return an lvalue which is not compatible
with a subsequent getInt() call. Instead, use EvaluateAsInt() which
will use all techniques availble to get an int result compatible
with the subsequent getInt() call.
This reverts commit 7dd34baf5505d689161c3a8678322a394d7a2929.
Fixed the assertion violation reported by
7dd34baf5505d689161c3a8678322a394d7a2929
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Do not warn when constant sized array is indexed by expressions that
evaluate to a const value. For instance, sizeof(T) expression value can
be evaluated at compile time and if an array is indexed by such an
expression, it's bounds can be validated.
(rdar://140320289)
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Do not warn about unsafe buffer access, when multi-dimensional constant
arrays are accessed and their indices are within the bounds of the
buffer. Warning in such cases would be a false positive. Such a
suppression already exists for 1-d
arrays and it is now extended to multi-dimensional arrays.
(rdar://137926311)
(rdar://140320139)
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Do not warn when two parameter constructor receives pointer address from
a std::addressof method and the span size is set to 1.
(rdar://139298119)
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Do not warn when a string literal is indexed and the idex value is
within the bounds of the length of the string.
(rdar://139106996)
Co-authored-by: MalavikaSamak <malavika2@apple.com>
CXXCtorInitializers are not statements , but they point to an
initializer expression which is. When visiting a FunctionDecl, also
walk through any constructor initializers and run the warning
checks/matchers against their initializer expressions. This catches
warnings for initializing fields and calling other constructors, such
as:
struct C {
C(P* Ptr) : AnUnsafeCtor(Ptr) {}
}
Field initializers can be found by traversing CXXDefaultInitExprs. This
catches warnings in places such as:
struct C {
P* Ptr;
AnUnsafeCtor U{Ptr};
};
We add tests for explicit construction, for field initialization, base
class constructor calls, delegated constructor calls, and aggregate
initialization.
Note that aggregate initialization is not fully covered where a field
specifies an initializer and it's not overridden in the aggregate initialization,
such as in:
struct AggregateViaValueInit {
UnsafeMembers f1;
// FIXME: A construction of this class does initialize the field
// through this initializer, so it should warn. Ideally it should
// also point to where the site of the construction is in
// testAggregateViaValueInit().
UnsafeMembers f2{3};
};
void testAggregateViaValueInit() {
auto A = AggregateViaValueInit();
};
There are 3 tests for different types of aggregate initialization with
FIXMEs documenting this future work.
One attempt to fix this involved returning true from
MatchDescendantVisitor::shouldVisitImplicitCode(), however, it breaks expectations
for field in-class initializers by moving the SourceLocation, possibly
to inside the implicit ctor instead of on the line where the field
initialization happens.
struct C {
P* Ptr;
AnUnsafeCtor U{Ptr}; // expected-warning{{this is never seen then}}
};
Tests are also added for std::span(ptr, size) constructor being called
from a field initializer and a constructor initializer.
Issue #80482
Emit a warning when the raw pointer retrieved from std::vector and
std::array instances are cast to a larger type. Such a cast followed by
a field dereference to the resulting pointer could cause an OOB access.
This is similar to the existing span::data warning.
(rdar://136704278)
Co-authored-by: MalavikaSamak <malavika2@apple.com>
For `snprintf(a, sizeof a, ...)`, the first two arguments form a safe
pattern if `a` is a constant array. In such a case, this commit will
suppress the warning.
(rdar://117182250)
The commit d7dd2c468fecae871ba67e891a3519c758c94b63 crashes for such
an example:
```
void printf() { printf(); }
```
Because it assumes `printf` must have arguments. This commit fixes
this issue.
(rdar://117182250)
Revert commit 23457964392d00fc872fa6021763859024fb38da, and re-land
with a new flag "-Wunsafe-buffer-usage-in-libc-call" for the new
warning.
(rdar://117182250)
[-Wunsafe-buffer-usage] Add warn on unsafe calls to libc functions
Warning about calls to libc functions involving buffer access. Warned
functions are hardcoded by names.
(rdar://117182250)
`QualType::isConstantArrayType()` checks canonical type. So a following
cast should be applied to canonical type as well:
```
if (Ty->isConstantArrayType())
cast<ConstantArrayType>(Ty.getCanonicalType()); // cast<ConstantArrayType>(Ty) is incorrect
```
Extend the unsafe_buffer_usage attribute, so they can also be added to
struct fields. This will cause the compiler to warn about the unsafe
field at their access sites.
Co-authored-by: MalavikaSamak <malavika2@apple.com>
The -Wunsafe-buffer-usage warning should fire on any call to a function
annotated with [[clang::unsafe_buffer_usage]], however it omitted calls
to constructors, since the expression is a CXXConstructExpr which does
not subclass CallExpr. Thus the matcher on callExpr() does not find
these expressions.
Add a new WarningGadget that matches cxxConstructExpr that are calling a
CXXConstructDecl annotated by [[clang::unsafe_buffer_usage]] and fires
the warning. The new UnsafeBufferUsageCtorAttrGadget gadget explicitly
avoids matching against the std::span(ptr, size) constructor because
that is handled by SpanTwoParamConstructorGadget and we never want two
gadgets to match the same thing (and this is guarded by asserts).
The gadgets themselves do not report the warnings, instead each gadget's
Stmt is passed to the UnsafeBufferUsageHandler (implemented by
UnsafeBufferUsageReporter). The Reporter is previously hardcoded that a
CXXConstructExpr statement must be a match for std::span(ptr, size), but
that is no longer the case. We want the Reporter to generate different
warnings (in the -Wunsafe-buffer-usage-in-container subgroup) for the
span contructor. And we will want it to report more warnings for other
std-container-specific gadgets in the future. To handle this we allow
the gadget to control if the warning is general (it calls
handleUnsafeBufferUsage()) or is a std-container-specific warning (it
calls handleUnsafeOperationInContainer()).
Then the WarningGadget grows a virtual method to dispatch to the
appropriate path in the UnsafeBufferUsageHandler. By doing so, we no
longer need getBaseStmt in the Gadget interface. The only use of it for
FixableGadgets was to get the SourceLocation, so we make an explicit
virtual method for that on Gadget. Then the handleUnsafeOperation()
dispatcher can be a virtual method that is only in WarningGadget.
The SpanTwoParamConstructorGadget gadget dispatches to
handleUnsafeOperationInContainer() while the other WarningGadgets all
dispatch to the original handleUnsafeBufferUsage().
Tests are added for annotated constructors, conversion operattors, call
operators, fold expressions, and regular methods.
Issue #80482
UPCAddressofArraySubscriptGadget::getClaimedVarUseSites should skip
parentheses when accessing the DeclRefExpr, otherwise a crash happens
with parenthesized references.
In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.
This change refactors ConstantArrayType to store a 60-bit integer and
4-bits for the integer size in bytes. This replaces the APInt field
previously in the type but preserves enough information to recreate it
where needed.
To reduce the number of places where the APInt is re-constructed I've
also added some helper methods to the ConstantArrayType to allow some
common use cases that operate on either the stored small integer or the
APInt as appropriate.
Resolves#85124.
Without the fix gcc warned like
../../clang/lib/Analysis/UnsafeBufferUsage.cpp:2203:26: warning: unused variable 'CArrTy' [-Wunused-variable]
2203 | } else if (const auto *CArrTy = Ctx.getAsConstantArrayType(
| ^~~~~~
Example:
int * const my_var = my_initializer;
Currently when transforming my_var to std::span the fixits:
- replace "int * const my_var = " with "std::span<int> const my_var {"
- add ", SIZE}" after "my_initializer" where SIZE is either inferred or
a placeholder
This patch makes that behavior less intrusive by not modifying variable
cv-qualifiers and initialization syntax.
The new behavior is:
- replace "int *" with "std::span<int>"
- add "{" before "my_initializer"
- add ", SIZE}" after "my_initializer"
This is an improvement on its own - since we don't touch the identifier,
we automatically can handle macros in them.
It also simplifies future work on initializer fixits.
Example:
int arr[10];
int * ptr = arr;
If ptr is unsafe and we transform it to std::span then the fixit we'd
currently provide transforms the code to:
std::span<int> ptr{arr, 10};
That's suboptimal as that repeats the size of the array in the code.
The idiomatic transformation should rely on the span constructor
that takes just the array argument and relies on template
parameter autodeduction to set the span size.
The transformed code should look like:
std::span<int> ptr = arr;
Note that it just should not change the initializer at all and that also
works for other forms of initialization like:
int * ptr {arr};
becoming:
std::span<int> ptr{arr};
This patch changes the initializer handling to the desired (empty)
fixit.
Introducing CArrayToPtrAssignment gadget and implementing fixits for some cases
of array being assigned to pointer.
Key observations:
- const size array can be assigned to std::span and bounds are propagated
- const size array can't be on LHS of assignment
This means array to pointer assignment has no strategy implications.
Fixits are implemented for cases where one of the variables in the assignment is
safe. For assignment of a safe array to unsafe pointer we know that the RHS will
never be transformed since it's safe and can immediately emit the optimal fixit.
Similarly for assignment of unsafe array to safe pointer.
(Obviously this is not and can't be future-proof in regards to what
variables we consider unsafe and that is fine.)
Fixits for assignment from unsafe array to unsafe pointer (from Array to Span
strategy) are not implemented in this patch as that needs to be properly designed
first - we might possibly implement optimal fixits for partially transformed
cases, put both variables in a single fixit group or do something else.
[-Wunsafe-buffer-usage] Ignore safe array subscripts
Don't emit warnings for array subscripts on constant size arrays where the index is constant and within bounds.
Example:
int arr[10];
arr[5] = 0; //safe, no warning
This patch recognizes only array indices that are integer literals - it doesn't understand more complex expressions (arithmetic on constants, etc.).
-Warray-bounds implemented in Sema::CheckArrayAccess() already solves a similar
(opposite) problem, handles complex expressions and is battle-tested.
Adding -Wunsafe-buffer-usage diagnostics to Sema is a non-starter as we need to emit
both the warnings and fixits and the performance impact of the fixit machine is
unacceptable for Sema.
CheckArrayAccess() as is doesn't distinguish between "safe" and "unknown" array
accesses. It also mixes the analysis that decides if an index is out of bounds
with crafting the diagnostics.
A refactor of CheckArrayAccess() might serve both the original purpose
and help us avoid false-positive with -Wunsafe-buffer-usage on constant
size arrrays.
Currently we ignore calls on function pointers (unlike direct calls of
functions and class methods). This patch adds support for function pointers as
well.
The change is to simply replace use of forEachArgumentWithParam matcher in UPC
gadget with forEachArgumentWithParamType.
from the documentation of forEachArgumentWithParamType:
/// Matches all arguments and their respective types for a \c CallExpr or
/// \c CXXConstructExpr. It is very similar to \c forEachArgumentWithParam but
/// it works on calls through function pointers as well.
Currently the matcher also uses hasPointerType() which checks that the
canonical type of an argument is pointer and won't match on arrays decayed to
pointer. Replacing hasPointerType() with isAnyPointerType() which allows
implicit casts allows for the arrays to be matched as well and this way we get
fixits for array arguments to function pointer calls too.
Covers cases where DeclRefExpr referring to a const-size array decays to a
pointer and is used "as a pointer" (e. g. passed to a pointer type
parameter).
Since std::array<T, N> doesn't implicitly convert to pointer to its element
type T* the cast needs to be done explicitly as part of the fixit
when we retrofit std::array to code that previously worked with constant
size array. std::array::data() method is used for the explicit
cast.
In terms of the fixit machine this covers the UPC(DRE) case for Array fixit strategy.
The emitted fixit inserts call to std::array::data() method similarly to
analogous fixit for Span strategy.
Array subscript on a const size array is not bounds-checked. The idiomatic
replacement is std::array which is bounds-safe in hardened mode of libc++.
This commit extends the fixit-producing machine to consider std::array as a
transformation target type and teaches it to handle the array subscript on const
size arrays with a trivial (empty) fixit.
Debug notes for unclaimed DeclRefExpr should report any DRE of an unsafe
variable that is not covered by a Fixable (i. e. fixit for the
particular AST pattern isn't implemented for whatever reason). Currently
not all unclaimed DeclRefExpr-s are reported which is a bug. The debug
notes report only those DREs where the referred VarDecl has at least one
other DeclRefExpr which is claimed (covered by a fixit). If there is an
unsafe VarDecl that has exactly one DRE and the DRE isn't claimed then
the debug note about missing fixit won't be emitted. That is because the
debug note is emitted from within a loop over set of successfully
matched FixableGadgets which by-definition is missing those DRE that are
not matched at all.
The new code simply iterates over all unsafe VarDecls and all of their
unclaimed DREs.
Constructing `std::span` objects with the two parameter constructors
could introduce mismatched bounds information, which defeats the
purpose of using `std::span`. Therefore, we warn every use of such
constructors.
rdar://115817781
The patch fixes the crash introduced by the DataInvocation warning
gadget designed to warn against unsafe invocations of span::data method.
It also now considers the invocation of span::data method inside
parenthesis.
Radar: 121223051
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
…-Wunsafe-buffer-usage,
there maybe accidental re-introduction of new OutOfBound accesses into
the code bases. One such case is invoking span::data() method on a span
variable to retrieve a pointer, which is then cast to a larger type and
dereferenced. Such dereferences can introduce OutOfBound accesses.
To address this, a new WarningGadget is being introduced to warn against
such invocations.
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Reported by Static Analyzer tool:
In getSourceRangeToTokenEnd(clang::Decl const *, clang::SourceManager
const &, clang::LangOptions): A very large function call parameter
exceeding the high threshold is passed by value
pass_by_value: Passing parameter LangOpts of type clang::LangOptions
(size 1784 bytes) by value, which exceeds the high threshold of 512
bytes
- For a better understand of what the unsupported cases are, we add
more information to the debug note---a string of ancestor AST nodes
of the unclaimed DRE. For example, an unclaimed DRE p in an
expression `*(p++)` will result in a string starting with
`DRE ==> UnaryOperator(++) ==> Paren ==> UnaryOperator(*)`.
- To find out the most common patterns of those unsupported use cases,
we add a simple script to build a prefix tree over those strings and
count each prefix. The script reads input line by line, assumes a
line is a list of words separated by `==>`s, and builds a prefix tree
over those lists.
Reviewed by: t-rasmud (Rashmi Mudduluru), NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D158561
- Use Strategy to determine whether to fix a parameter
- Fix the `Strategy` construction so that only variables on the graph
are assigned the `std::span` strategy
Reviewed by: t-rasmud (Rashmi Mudduluru), NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D157441