Fixes https://github.com/llvm/llvm-project/issues/84463
Changes:
- Adapted MemRegion::getDescriptiveName
- Added unittest to check name for a given clang::ento::ElementRegion
- Some format changes due to clang-format
---------
Co-authored-by: Andreas Steinhausen <andreas.steinhausen@concenrio.io>
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Inside the ExprEngine when we process the initializers, we create a
PostInitializer program-point, which will refer to the field being
initialized, see `FieldLoc` inside `ExprEngine::ProcessInitializer`.
When a constructor (of which we evaluate the initializer-list) is
analyzed in top-level context, then the `this` pointer will be
represented by a `SymbolicRegion`, (as it should be).
This means that we will form a `FieldRegion{SymbolicRegion{.}}` as the
initialized region.
```c++
class Bear {
public:
void brum() const;
};
class Door {
public:
// PostInitializer would refer to "FieldRegion{SymRegion{this}}"
// whereas in the store and everywhere else it would be:
// "FieldRegion{ELementRegion{SymRegion{Ty*, this}, 0, Ty}".
Door() : ptr(nullptr) {
ptr->brum(); // Bug
}
private:
Bear* ptr;
};
```
We (as CSA folks) decided to avoid the creation of FieldRegions directly
of symbolic regions in the past:
f8643a9b31
---
In this patch, I propose to also canonicalize it as in the mentioned
patch, into this: `FieldRegion{ElementRegion{SymbolicRegion{Ty*, .}, 0,
Ty}`
This would mean that FieldRegions will/should never simply wrap a
SymbolicRegion directly, but rather an ElementRegion that is sitting in
between.
This patch should have practically no observable effects, as the store
(due to the mentioned patch) was made resilient to this issue, but we
use `PostInitializer::getLocationValue()` for an alternative reporting,
where we faced this issue.
Note that in really rare cases it suppresses now dereference bugs, as
demonstrated in the test. It is because in the past we failed to follow
the region of the PostInitializer inside the StoreSiteFinder visitor -
because it was using this code:
```c++
// If this is a post initializer expression, initializing the region, we
// should track the initializer expression.
if (std::optional<PostInitializer> PIP =
Pred->getLocationAs<PostInitializer>()) {
const MemRegion *FieldReg = (const MemRegion *)PIP->getLocationValue();
if (FieldReg == R) {
StoreSite = Pred;
InitE = PIP->getInitializer()->getInit();
}
}
```
Notice that the equality check didn't pass for the regions I'm
canonicalizing in this patch.
Given the nature of this change, we would rather upstream this patch.
CPP-4954
Make sure all float/double comparison intrinsics specify what happens
with a NaN input. Update some existing descriptions of comparison
results to make them all consistent.
Also replace "yields" with "returns" throughout.
When emitting the storage (or memory copy operations) for constant
initializers, the decision whether to split a constant structure or
array store into a sequence of field stores or to use `memcpy` is
based upon the optimization level and the size of the initializer.
In afe8b93ffdfef5d8879e1894b9d7dda40dee2b8d, we extended this by
allowing constants to be split when the array (or struct) type does
not match the type of data the address to the object (constant) is
expected to contain. This may happen when `emitStoresForConstant` is
called by `EmitAutoVarInit`, as the element type of the address gets
shrunk. When this occurs, let the initializer be split into a bunch
of stores only under `-ftrivial-auto-var-init=pattern`.
Fixes: https://github.com/llvm/llvm-project/issues/84178.
SizeInBytes of empty structure is 0 in C, while 1 in C++. And empty
structure argument of the function is ignored in X86_64 backend.As a
result, the value of variable arguments in C++ is incorrect. fix#77036
Co-authored-by: Longsheng Mou <moulongsheng@huawei.com>
The old logic expects the call to be the last thing we emitted, and
since it kicks in before we emit cleanups, and since `swiftasynccall`
functions always return void, that's likely to be true. "Likely" isn't
very reassuring when we're talking about slapping attributes on random
calls, though. And indeed, while I can't find any way to break the logic
directly in current main, our previous (ongoing?) experiments with
shortening argument temporary lifetimes definitely broke it wide open.
So while this commit is prophylactic for now, it's clearly the right
thing to do, and it can cherry-picked to other branches to fix problems.
This updates a few warnings that were diagnosing no arguments for a
`...` variadic macro parameter as a GNU extension when it actually is a
C++20/C23 extension now.
This fixes#84495.
…ion.
This was reverted because the resolver didn't look as expected in one of
the tests. I believe it had some interaction with #84146. I have now
regenerated it using -target-feature -fp-armv8.
Reverts llvm/llvm-project#84405
In between of passing the precommit tests on github and being merged
some change (perhaps in the AArch64 backend?) landed which resulted
in altering the generated resolver. I will regenerate the tests
perhaps using a less sensitive runline to such changes.
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
We would like the resolver to be generated eagerly, even if the
versioned function is not called from the current translation
unit. Fixes#81494. It further allows Multi Versioning to work
even if the default target version attribute is omitted from
function declarations.
MIPSr6 ISA requires normal load/store instructions support
misunaligned memory access, while it is not always do so
by hardware. On some microarchitectures or some corner cases
it may need support by OS.
Don't confuse with pre-R6's lwl/lwr famlily: MIPSr6 doesn't
support them, instead, r6 requires lw instruction support
misunaligned memory access. So, if -mstrict-align is used for
pre-R6, lwl/lwr won't be disabled.
If -mstrict-align is used for r6 and the access is not well
aligned, some lb/lh instructions will be used to replace lw.
This is useful for OS kernels.
To be back-compatible with GCC, -m(no-)unaligned-access are also
added as Neg-Alias of -m(no-)strict-align.
In `-fbounds-safety`, bounds annotations are considered type attributes
rather than declaration attributes. Constructing them as type attributes
allows us to extend the attribute to apply nested pointers, which is
essential to annotate functions that involve out parameters: `void
foo(int *__counted_by(*out_count) *out_buf, int *out_count)`.
We introduce a new sugar type to support bounds annotated types,
`CountAttributedType`. In order to maintain extra data (the bounds
expression and the dependent declaration information) that is not
trackable in `AttributedType` we create a new type dedicate to this
functionality.
This patch also extends the parsing logic to parse the `counted_by`
argument as an expression, which will allow us to extend the model to
support arguments beyond an identifier, e.g., `__counted_by(n + m)` in
the future as specified by `-fbounds-safety`.
This also adjusts `__bdos` and array-bounds sanitizer code that already
uses `CountedByAttr` to check `CountAttributedType` instead to get the
field referred to by the attribute.
This reverts commit b92d6dd704d789240685a336ad8b25a9f381b4cc. See
github.com/llvm/llvm-project/commit/b92d6dd704d7#commitcomment-139992444
We should use a tool like Visual Studio to clean up the headers.
* This completes support for verifying every declaration found in a
header is discovered in the dylib. Diagnostics are reported for each
class for differences that are representable in TBD files.
* This patch also now captures unavailable attributes that depend on
target triples. This is needed for proper tbd file generation.
Fix https://github.com/llvm/llvm-project/issues/85343
When build lambda expression in lambda instantiation, `ThisType` is
required in `Sema::CheckCXXThisCapture` to build `this` capture. Set
`this` type by import `Sema::CXXThisScopeRAII` and it will be used later
in lambda expression transformation.
Co-authored-by: huqizhi <836744285@qq.com>
Defines a subset of attributes and emits them to a section called
.hexagon.attributes.
The current attributes recorded are the attributes needed by
llvm-objdump to automatically determine target features and eliminate
the need to manually pass features.
Clang supported header searchpaths of the form `-I =/path`, relative to
the sysroot if one is passed, but did not implement that behavior for
`-iquote`, `-isystem`, or `-idirafter`.
This implements the `=` portion of the behavior implemented by GCC for
these flags described in
https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html.
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
Linux kernel uses -fwrapv to change signed integer overflows from
undefined behaviors to defined behaviors. However, the security folks
still want -fsanitize=signed-integer-overflow diagnostics. Their
intention can be expressed with -fwrapv
-fsanitize=signed-integer-overflow (#80089). This mode by default
reports recoverable errors while still making signed integer overflows
defined (most UBSan checks are recoverable by default: you get errors in
stderr, but the program is not halted).
-fsanitize=undefined -fwrapv users likely want to suppress
signed-integer-overflow, unless signed-integer-overflow is explicitly
enabled. Implement this suppression.
this implements part 1 of 2 for #83626
- `CGBuiltin.cpp` - modified to have seperate cases for signed and
unsigned integers.
- `SemaChecking.cpp` - modified to prevent the generation of a double
dot product intrinsic if the builtin were to be called directly.
- `IntrinsicsDirectX.td` creation of the signed and unsigned dot
intrinsics needed for instruction expansion.
- `DXILIntrinsicExpansion.cpp` - handle instruction expansion cases for
integer dot product.
Predefined macro FUNCTION in clang is not returning the same string than
MS for templated functions.
See https://godbolt.org/z/q3EKn5zq4
For the same test case MSVC is returning:
function: TestClass::TestClass
function: TestStruct::TestStruct
function: TestEnum::TestEnum
The initial work for this was in the reverted patch
(https://github.com/llvm/llvm-project/pull/66120). This patch solves the
issues raised in the reverted patch.
This is re-working of #74460, which adds a soft-float ABI for AArch64.
That was reverted because it causes errors when building the linux and
fuchsia kernels.
The problem is that GCC's implementation of the ABI compatibility checks
when using the hard-float ABI on a target without FP registers does it's
checks after optimisation. The previous version of this patch reported
errors for all uses of floating-point types, which is stricter than what
GCC does in practice.
This changes two things compared to the first version:
* Only check the types of function arguments and returns, not the types
of other values. This is more relaxed than GCC, while still guaranteeing
ABI compatibility.
* Move the check from Sema to CodeGen, so that inline functions are only
checked if they are actually used. There are some cases in the linux
kernel which depend on this behaviour of GCC.
This patch vastly simplifies the code handling terminators, without
changing any
behavior. Additionally, the simplification unblocks our ability to
address a
(simple) FIXME in the code to invoke `transferBranch`, even when builtin
options
are disabled.
The checker alpha.security.ArrayBoundV2 performs bounds checking in two
steps: first it checks for underflow, and if it isn't guaranteed then it
assumes that there is no underflow. After this, it checks for overflow,
and if that's guaranteed or the index is tainted then it reports it.
This meant that in situations where overflow and underflow are both
possible (but the index is either tainted or guaranteed to be invalid),
the checker was reporting just an overflow error.
This commit modifies the messages printed in these cases to mention the
possibility of an underflow.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Fixes https://github.com/llvm/llvm-project/issues/53520.
#### Description ####
Provide `intrin0.h` to be the minimal set of intrinsics that the MSVC
STL requires.
The `intrin0.h` header matches the latest header provided by MSVC 1939
which does include some extra intrinsics that the MSVC STL does not use.
Inside `BuiltinHeaders.def` I kept the header description as `intrin.h`.
If you want me to change those to `intrin0.h` for the moved intrinsics
let me know.
This should now allow `immintrin.h` to be used with function targets for
runtime cpu detection of simd instruction sets without worrying about
the compile-time overhead from MSVC STL including `intrin.h` on clang.
I still need to figure out how to best update MSVC STL to detect for the
presence of `intrin0.h` from clang and to use this header over
`intrin.h`.
#### Testing ####
Built clang locally and ran the test suite. I still need to do a pass
over the existing unit tests for the ms intrinsics to make sure there
aren't any gaps. Wanted to get this PR up for discussion first.
Modified latest MSVC STL from github to point to `intrin0.h` for clang.
Wrote some test files that included MSVC STL headers that rely on
intrinsics such as `atomic`, `bit` and `vector`. Built the unit tests
against x86, arm, aarch64, and x64.
#### Benchmarks ####
The following include times are based on the x64 target with the
modified headers in this PR.
These timings were done by using `clang-cl.exe -ftime-trace` and taking
the wall time for parsing `intrin.h` and `intrin0.h`.
`intrin.h` takes ~897ms to parse.
`intrin0.h` takes ~1ms to parse.
If there is anything required or a different approach is preferred let
me know. I would very much like to move this over the finish line so we
can use function targets with clang-cl.
`llvm::MaybeAlign` does this, for example.
It's not an option to simply ignore these derived classes because they
get cast
back to the optional classes (for example, simply when calling the
optional
member functions), and our transfer functions will then run on those
optional
classes and therefore require them to be properly initialized.
Polymorphic entity lowering status is good. The main remaining TODO is
to allow lowering of vector subscripted polymorphic entity, but this
does not deserve blocking all application using polymorphism.
Remove experimental option and enable lowering of polymorphic entity by
default.
This is a relatively rare case, but
- It's still nice to get this right,
- We can remove the special case for this in
`VisitCXXOperatorCallExpr()` (that
simply bails out), and
- With this in place, I can avoid having to add a similar special case
in an
upcoming patch.