388 Commits

Author SHA1 Message Date
Nikolas Klauser
5f909c0ab2
[libc++][NFC] Remove some boilerplate from <string> after #76756 (#108952)
A few functions are now unnecessary, since we can access the members
directly instread now.
2024-10-02 15:12:30 +02:00
Louis Dionne
d95597dc06
[libc++][string] Remove potential non-trailing 0-length array (#108867)
It is a violation of the standard to use 0 length arrays, especially
when not at the end of a structure (not a FAM GNU extension). Compiler
generally accept it, but it's probably better to have a conforming
implementation.

This is a re-application of #105865 which was reverted in 72cfc74
because it broke the data formatters. A LLDB patch has since been landed
that should make this a non-issue.

Co-authored-by: serge-sans-paille <sguelton@mozilla.com>
2024-09-16 16:43:39 -04:00
Nikolas Klauser
27c83382d8
[libc++] Replace __compressed_pair with [[no_unique_address]] (#76756)
This significantly simplifies the code, improves compile times and
improves the object layout of types using `__compressed_pair` in the
unstable ABI. The only downside is that this is extremely ABI sensitive
and pedantically breaks the ABI for empty final types, since the address
of the subobject may change. The ABI of the whole object should not be
affected.

Fixes #91266
Fixes #93069
2024-09-16 11:08:57 +02:00
Nikolas Klauser
17e0686ab1
[libc++][NFC] Use [[__nodiscard__]] unconditionally (#80454)
`__has_cpp_attribute(__nodiscard__)` is always true now, so we might as
well replace `_LIBCPP_NODISCARD`. It's one less macro that can result in
bad diagnostics.
2024-09-12 21:18:43 +02:00
Nikolas Klauser
748023dc32
[libc++][NFC] Replace _LIBCPP_NORETURN and TEST_NORETURN with [[noreturn]] (#80455)
`[[__noreturn__]]` is now always available, so we can simply use the
attribute directly instead of through a macro.
2024-09-11 08:59:46 +02:00
Daniel Thornburgh
d8a8eae6da
Revert "[libc++][string] Remove potential non-trailing 0-length array" (#108091)
Reverts llvm/llvm-project#105865

This breaks a pair of LLDB tests in CI.
2024-09-10 14:08:27 -07:00
serge-sans-paille
ed0da00825
[libc++][string] Remove potential non-trailing 0-length array (#105865)
It is a violation of the standard to use 0 length arrays, especially
when not at the end of a structure (not a FAM GNU extension). Compiler
generally accept it, but it's probably better to have a conforming
implementation.

---------

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2024-09-10 06:24:02 +00:00
Louis Dionne
e4fdbcc28f [libc++] Add miscellaneous missing includes 2024-09-05 08:54:41 -04:00
Louis Dionne
c1a8283fcc
[libc++][modules] Move __noexcept_move_assign_container out of __type_traits (#107140)
That header depends on allocator traits, which is fundamentally tied to
`<memory>`, not to `<type_traits>`. This breaks a cycle betweeen
__type_traits and __memory.
2024-09-04 11:18:30 -04:00
Louis Dionne
348e74139a [libc++][NFC] Run clang-format on libcxx/include
This re-formats a few headers that had become out-of-sync with respect
to formatting since we ran clang-format on the whole codebase. There's
surprisingly few instances of it.
2024-08-30 12:09:36 -04:00
Nikolas Klauser
025f03f01e
[libc++][NFC] Remove unused struct in <string> (#106527) 2024-08-29 17:05:02 +02:00
Louis Dionne
85561dd594 [libc++] Fix bounded iterator hardening mode in C++03 mode 2024-08-26 17:17:27 -04:00
Christopher Di Bella
e04d124a96
[libc++] Call basic_string_view's assume-valid constructor from basic_string operations (#105863)
`basic_string` frequently calls `basic_string_view(data(), size())`,
which accounts for ~15% of the observed overhead when hardening is
enabled. This commit removes unnecessary checks when `basic_string` is
known to already have valid data, by bypassing the public constructor,
so that we eliminate that overhead.
2024-08-26 11:58:35 -04:00
Mark de Wever
4dee6411e0
[libc++] Implements LWG3130. (#101889)
This adds addressof at the required places in [input.output]. Some of
the new tests failed since string used operator& internally. These have
been fixed too.

Note the new fstream tests perform output to a basic_string instead of a
double. Using a double requires num_get specialization

num_get<CharT, istreambuf_iterator<CharT,
char_traits_operator_hijacker<CharT>>

This facet is not present in the locale database so the conversion would
fail due to a missing locale facet. Using basic_string avoids using the
locale.

As a drive-by fixes several bugs in the ofstream.cons tests. These
tested ifstream instead of ofstream with an open mode.

Implements:
- LWG3130 [input.output] needs many addressof

Closes #100246.
2024-08-06 19:47:56 +02:00
Nikolas Klauser
3af26be42e
[libc++] Improve code gen for string's operator== (#100926)
If the string is too long for a short string, we can simply check for
the long bit. If that's false we can do an early return. This improves
the code gen slightly.
2024-08-01 23:04:51 +02:00
Mark de Wever
d0ca9f23e8
[libc++][string] Fixes shrink_to_fit. (#97961)
This ensures that shrink_to_fit does not increase the allocated size.

Partly addresses #95161
2024-07-23 12:13:22 -04:00
David Benjamin
bcf9fb9802
[libc++][hardening] Use bounded iterators in std::vector and std::string (#78929)
~~NB: This PR depends on #78876. Ignore the first commit when reviewing,
and don't merge it until #78876 is resolved. When/if #78876 lands, I'll
clean this up.~~

This partially restores parity with the old, since removed debug build.
We now can re-enable a bunch of the disabled tests. Some things of note:

- `bounded_iter`'s converting constructor has never worked. It needs a
friend declaration to access the other `bound_iter` instantiation's
private fields.

- The old debug iterators also checked that callers did not try to
compare iterators from different objects. `bounded_iter` does not
currently do this, so I've left those disabled. However, I think we
probably should add those. See
https://github.com/llvm/llvm-project/issues/78771#issuecomment-1902999181

- The `std::vector` iterators are bounded up to capacity, not size. This
makes for a weaker safety check. This is because the STL promises not to
invalidate iterators when appending up to the capacity. Since we cannot
retroactively update all the iterators on `push_back()`, I've instead
sized it to the capacity. This is not as good, but at least will stop
the iterator from going off the end of the buffer.

There was also no test for this, so I've added one in the `std`
directory.

- `std::string` has two ambiguities to deal with. First, I opted not to
size it against the capacity. https://eel.is/c++draft/string.require#4
says iterators are invalidated on an non-const operation. Second,
whether the iterator can reach the NUL terminator. The previous debug
tests and the special-case in https://eel.is/c++draft/string.access#2
suggest no. If either of these causes widespread problems, I figure we
can revisit.

- `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator
supported `s.begin().base()`, but I see no promise of this in the
standard. GCC also doesn't support this. I fixed the test to use
`std::to_address`.

- `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer.
(It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`,
and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate
enough to notice. I've added a few more bits to satisfy it.

Fixes #78805
2024-07-22 22:44:25 -07:00
Tacet
9d03275550
[ASan][libc++] Turn off SSO annotations for Apple platforms (#96269)
This commit disables short string AddressSanitizer annotations on Apple
platforms as a temporary solution to the problem reported in #96099.

For more information on Apple's block implementation, please refer to
clang/docs/Block-ABI-Apple.rst [1]. The core issue lies in the fact
that blocks are unaware of their content, causing AddressSanitizer
errors when blocks are moved using `memmove`.

I believe - and I'm not alone - that the issue should ideally be
addressed within the block moving logic. However, this patch provides
a temporary fix until a proper resolution exists in the Blocks runtime.

[1]: https://github.com/llvm/llvm-project/blob/main/clang/docs/Block-ABI-Apple.rst
2024-07-19 17:24:10 -04:00
Nikolas Klauser
2bf91db0c7
[libc++] Use char_traits::copy while inserting when possible (#97201)
This reduces the number of asm lines from 707 to 519 for this snippet:
```c++
auto test(std::string& str, const char* begin, const char* end) {
  str.insert(str.begin(), begin, end);
}
```
While that's not a performance metric, I've never seen a use of `memcpy`
result in a performance regression for any realistic usage.
2024-07-18 18:05:06 +02:00
Hristo Hristov
4a19be5d45
[libc++][strings] P2591R5: Concatenation of strings and string views (#88389)
Implemented: https://wg21.link/P2591R5
- https://eel.is/c++draft/string.syn
- https://eel.is/c++draft/string.op.plus

---------

Co-authored-by: Hristo Hristov <zingam@outlook.com>
2024-07-18 13:26:37 +03:00
Hui
79e8a59523
[libc++] Move allocator assertion into allocator_traits (#94750)
There is code duplication in all containers that static_assert the
allocator matches the allocator requirements in the spec. This check can
be moved into a more centralised place.
2024-06-25 10:13:48 -05:00
Nikolas Klauser
1f98ac095e
[libc++][NFC] Replace _NOEXCEPT and _LIBCPP_CONSTEXPR macros with the keywords in C++11 code (#96387) 2024-06-23 22:03:41 +02:00
Louis Dionne
e2c2ffbe7a
[libc++][NFC] Run clang-format on libcxx/include again (#95874)
As time went by, a few files have become mis-formatted w.r.t.
clang-format. This was made worse by the fact that formatting was not
being enforced in extensionless headers. This commit simply brings all
of libcxx/include in-line with clang-format again.

We might have to do this from time to time as we update our clang-format
version, but frankly this is really low effort now that we've formatted
everything once.
2024-06-18 09:13:45 -04:00
Nikolas Klauser
cb41740187
[libc++] Refactor<__type_traits/is_swappable.h> (#86822)
This changes the `is_swappable` implementation to use variable templates
first and basing the class templates on that. This avoids instantiating
them when the `_v` versions are used, which are generally less resource
intensive.
2024-06-18 11:01:43 +02:00
Nikolas Klauser
6b4b29f859
[libc++][NFC] Remove unnecessary parens in static_asserts (#95605)
These were required a long time ago due to `static_assert` not actually
being available in C++03. Now `static_assert` is simply mapped to
`_Static_assert` in C++03, making the additional parens unnecessary.
2024-06-18 10:45:30 +02:00
Nikolas Klauser
cce1feb7b1
[libc++][NFC] Remove some dead code in string (#94893)
It looks like the last references got removed in c747bd0e2339.
It removed a __zero() function, which was probably created at
some point in the ancient past to optimize copying the string 
representation. The __zero() function got simplified to an 
assignment as part of making string constexpr, rendering this 
code unnecessary.
2024-06-11 16:58:00 -04:00
Louis Dionne
6faae130e4
[libc++] Simplify the definition of string::operator== (#95000)
Instead of hardcoding a loop for small strings, always call
char_traits::compare which ends up desugaring to __builtin_memcmp.

Note that the original code dates back 11 years, when we didn't lower to
intrinsics in `char_traits::compare`.

Fixes #94222
2024-06-11 16:55:56 -04:00
Tacet
6c8356579b
[libc++][ASan] Fix std::basic_string trait type (#91590)
Addresses the comment:
https://github.com/llvm/llvm-project/pull/79536#discussion_r1593652240

Changes the type to `void` instead of `false_type`.

The value is used here:


6f1013a5b3/libcxx/include/__type_traits/is_trivially_relocatable.h (L35-L38)
2024-05-09 18:40:15 +02:00
Tacet
1a96179596
[ASan][libc++] Turn on ASan annotations for short strings (#79536)
This pull request is the third iteration aiming to integrate short
string annotations. This commit includes:
- Enabling basic_string annotations for short strings.
- Setting a value of `__trivially_relocatable` in `std::basic_string` to
`false_type` when compiling with ASan (nothing changes when compiling
without ASan). Short string annotations make `std::basic_string` to not
be trivially relocatable, because memory has to be unpoisoned.
- Adding a `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS` modifier to two
functions.
- Creating a macro `_LIBCPP_ASAN_VOLATILE_WRAPPER` to prevent
problematic stack optimizations (the macro modifies code behavior only
when compiling with ASan).

Previously we had issues with compiler optimization, which we understand
thanks to @vitalybuka. This commit also addresses smaller changes in
short string, since previous upstream attempts.

Problematic optimization was loading two values in code similar to:
```
__is_long() ? __get_long_size() : __get_short_size();
```
We aim to resolve it with the volatile wrapper.

This commit is built on top of two previous attempts which descriptions
are below.

Additionally, in the meantime, annotations were updated (but it
shouldn't have any impact on anything):
- https://github.com/llvm/llvm-project/pull/79292

---

Previous PR: https://github.com/llvm/llvm-project/pull/79049
Reverted:
a16f81f5e3

Previous description:

Originally merged here: https://github.com/llvm/llvm-project/pull/75882
Reverted here: https://github.com/llvm/llvm-project/pull/78627

Reverted due to failing buildbots. The problem was not caused by the
annotations code, but by code in the `UniqueFunctionBase` class and in
the `JSON.h` file. That code caused the program to write to memory that
was already being used by string objects, which resulted in an ASan
error.

Fixes are implemented in:
- https://github.com/llvm/llvm-project/pull/79065
- https://github.com/llvm/llvm-project/pull/79066

Problematic code from `UniqueFunctionBase` for example:
```cpp
    // In debug builds, we also scribble across the rest of the storage.
    memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
```

---

Original description:

This commit turns on ASan annotations in `std::basic_string` for short
stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here:
https://github.com/llvm/llvm-project/pull/72677

Requires to pass CI without fails:
- https://github.com/llvm/llvm-project/pull/75845
- https://github.com/llvm/llvm-project/pull/75858

Annotating `std::basic_string` with default allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677 but annotations for
short strings (SSO - Short String Optimization) are turned off there.
This commit turns them on. This also removes
`_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to
support turning on and off short string annotations.

Support in ASan API exists since
dd1b7b797a.
You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.

This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.

The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.

If you have any questions, please email:

- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
2024-05-07 18:35:25 +02:00
Vitaly Buka
d129ea8d2f
[libcxx] Align __recommend() + 1 by __endian_factor (#90292)
This is detected by asan after #83774

Allocation size will be divided by `__endian_factor` before storing. If
it's not aligned,
we will not be able to recover allocation size to pass into
`__alloc_traits::deallocate`.

we have code like this 
```
 auto __allocation = std::__allocate_at_least(__alloc(), __recommend(__sz) + 1);
    __p               = __allocation.ptr;
    __set_long_cap(__allocation.count);

void __set_long_cap(size_type __s) _NOEXCEPT {
    __r_.first().__l.__cap_     = __s / __endian_factor;
    __r_.first().__l.__is_long_ = true;
  }

size_type __get_long_cap() const _NOEXCEPT {
    return __r_.first().__l.__cap_ * __endian_factor;
  }

inline ~basic_string() {
    __annotate_delete();
    if (__is_long())
      __alloc_traits::deallocate(__alloc(), __get_long_pointer(), __get_long_cap());
  }
```
1. __recommend() -> even size
2. `std::__allocate_at_least(__alloc(), __recommend(__sz) + 1)` - > not
even size
3. ` __set_long_cap() `- > lose one bit of size for __endian_factor == 2
(see `/ __endian_factor`)
4. `__alloc_traits::deallocate(__alloc(), __get_long_pointer(),
__get_long_cap())` -> uses even size (see `__get_long_cap`)
2024-05-02 14:26:38 -07:00
Nikolas Klauser
83bc7b5771
[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests (#87094)
This also adds a few tests that were missing.
2024-04-22 22:13:58 +02:00
Nikolas Klauser
61f1f13002
[libc++][NFC] Move basic ASan annotation functions into a utility header (#87220) 2024-04-13 18:24:12 +02:00
Nikolas Klauser
580f60484e
[libc++][NFC] Merge is{,_nothrow,_trivially}{,_copy,_move,_default}{_assignable,_constructible} (#85308)
These headers have become very small by using compiler builtins, often
containing only two declarations. This merges these headers, since
there doesn't seem to be much of a benefit keeping them separate.

Specifically, `is_{,_nothrow,_trivially}{assignable,constructible}` are
kept and the `copy`, `move` and `default` versions of these type traits
are moved in to the respective headers.
2024-03-18 08:29:44 +01:00
Louis Dionne
37dca605c9
[libc++] Clean up includes of <__assert> (#80091)
Originally, we used __libcpp_verbose_abort to handle assertion failures.
That function was declared from all public headers. Since we don't use
that mechanism anymore, we don't need to declare __libcpp_verbose_abort
from all public headers, and we can clean up a lot of unnecessary
includes.

This patch also moves the definition of the various assertion categories
to the <__assert> header, since we now rely on regular IWYU for these
assertion macros.

rdar://105510916
2024-02-29 10:12:22 -05:00
Louis Dionne
3e33b6f5de [libc++][NFC] Reformat a few files that had gotten mis-formatted
Those appear to be oversights when committing patches
in the last few months.
2024-02-08 10:12:42 -05:00
Nikolas Klauser
4e112e5c1c
Reapply "[libc++] Optimize vector growing of trivially relocatable types" (#80558)
This reapplies #76657. Non-trivial elements didn't get destroyed
previously. This fixes the bug and adds tests for all the vector
insertion functions.
2024-02-04 00:28:29 +01:00
Kirill Stoimenov
2352fdd202 Revert "[libc++] Optimize vector growing of trivially relocatable types (#76657)"
Broke sanitizer bots: https://lab.llvm.org/buildbot/#/builders/5/builds/40641

This reverts commit 67eee4a029797c09129889c3655416d1be487cfe.
2024-02-02 20:43:51 +00:00
Nikolas Klauser
67eee4a029
[libc++] Optimize vector growing of trivially relocatable types (#76657)
This patch introduces a new trait to represent whether a type is
trivially
relocatable, and uses that trait to optimize the growth of a std::vector
of trivially relocatable objects.

```
--------------------------------------------------
Benchmark                           old        new
--------------------------------------------------
bm_grow<int>                    1354 ns    1301 ns
bm_grow<std::string>            5584 ns    3370 ns
bm_grow<std::unique_ptr<int>>   3506 ns    1994 ns
bm_grow<std::deque<int>>       27114 ns   27209 ns
```

This also changes to order of moving and destroying the objects when
growing the vector. This should not affect our conformance.
2024-02-02 17:13:55 +01:00
Tacet
4017f04e31
Remove unnecessary _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS (#79574)
This macro is unnecessary with `basic_string& operator=(value_type
__c)`.
2024-01-30 14:43:51 +01:00
Louis Dionne
f89d707e5f
[libc++] Accept __VA_ARGS__ in conditional _NOEXCEPT_ macro (#79877)
This prevents having to use double parentheses in common cases.
2024-01-30 08:33:48 -05:00
Tacet
a315fb1c57
[ASan][libc++] Correct (explicit) annotation size (#79292)
A quick examination suggests that the current code in the codebase does
not lead to incorrect annotations. However, the intention is for the
object after the function to be annotated in a way that only its
contents are unpoisoned and the rest is poisoned. This commit makes it
explicit and avoids potential issues in future.

In addition, I have implemented a few tests for a function that helped
me identify the specific argument value.

Notice: there is no known scenario where old code results in incorrect
annotation.
2024-01-25 20:41:38 +01:00
Eric
04ce0baf01
Unconditionally lower std::string's alignment requirement from 16 to 8. (#68925)
Unconditionally change std::string's alignment to 8.

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR #68807, which enabled the change behind
an ABI flag.
2024-01-24 13:52:46 -06:00
Louis Dionne
382f70a877
[libc++][NFC] Rewrite function call on two lines for clarity (#79141)
Previously, there was a ternary conditional with a less-than comparison
appearing inside a template argument, which was really confusing because
of the <...> of the function template. This patch rewrites the same
statement on two lines for clarity.
2024-01-24 09:28:58 -05:00
Thurston Dang
a16f81f5e3 Revert "[ASan][libc++] Turn on ASan annotations for short strings (#79049)"
This reverts commit cb528ec5e6331ce207c7b835d7ab963bd5e13af7.

Reason: buildbot breakage (https://lab.llvm.org/buildbot/#/builders/5/builds/40364):
SUMMARY: AddressSanitizer: container-overflow /b/sanitizer-x86_64-linux-fast/build/libcxx_build_asan_ubsan/include/c++/v1/string:1870:29 in __get_long_pointer
2024-01-23 22:56:22 +00:00
Tacet
cb528ec5e6
[ASan][libc++] Turn on ASan annotations for short strings (#79049)
Originally merged here: https://github.com/llvm/llvm-project/pull/75882
Reverted here: https://github.com/llvm/llvm-project/pull/78627

Reverted due to failing buildbots. The problem was not caused by the
annotations code, but by code in the `UniqueFunctionBase` class and in
the `JSON.h` file. That code caused the program to write to memory that
was already being used by string objects, which resulted in an ASan
error.

Fixes are implemented in:
- https://github.com/llvm/llvm-project/pull/79065
- https://github.com/llvm/llvm-project/pull/79066

Problematic code from `UniqueFunctionBase` for example:
```cpp
#ifndef NDEBUG
    // In debug builds, we also scribble across the rest of the storage.
    memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
#endif
```

---

Original description:

This commit turns on ASan annotations in `std::basic_string` for short
stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here:
https://github.com/llvm/llvm-project/pull/72677

Requires to pass CI without fails:
- https://github.com/llvm/llvm-project/pull/75845
- https://github.com/llvm/llvm-project/pull/75858

Annotating `std::basic_string` with default allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677 but annotations for
short strings (SSO - Short String Optimization) are turned off there.
This commit turns them on. This also removes
`_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to
support turning on and off short string annotations.

Support in ASan API exists since
dd1b7b797a.
You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.

This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.

The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.

If you have any questions, please email:

    advenam.tacet@trailofbits.com
    disconnect3d@trailofbits.com
2024-01-23 19:18:53 +01:00
Vitaly Buka
2a54098de1
Revert "[ASan][libc++] Turn on ASan annotations for short strings" (#78627)
Reverts llvm/llvm-project#75882

To recover build bots :
https://lab.llvm.org/buildbot/#/builders/239/builds/5361
https://lab.llvm.org/buildbot/#/builders/168/builds/18126
2024-01-18 16:33:31 -08:00
Tacet
d06fb0b29c
[ASan][libc++] Turn on ASan annotations for short strings (#75882)
This commit turns on ASan annotations in `std::basic_string` for short
stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here:
https://github.com/llvm/llvm-project/pull/72677

Requires to pass CI without fails:
- https://github.com/llvm/llvm-project/pull/75845
- https://github.com/llvm/llvm-project/pull/75858

Annotating `std::basic_string` with default allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677 but annotations for
short strings (SSO - Short String Optimization) are turned off there.
This commit turns them on. This also removes
`_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to
support turning on and off short string annotations.

Support in ASan API exists since
dd1b7b797a.
You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.

This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.

The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.

If you have any questions, please email:

    advenam.tacet@trailofbits.com
    disconnect3d@trailofbits.com
2024-01-18 05:55:34 +01:00
Tacet
60ac394dc9
[ASan][libc++] Annotating std::basic_string with all allocators (#75845)
This commit turns on ASan annotations in `std::basic_string` for all
allocators by default.

Originally suggested here: https://reviews.llvm.org/D146214

String annotations added here:
https://github.com/llvm/llvm-project/pull/72677

This commit is part of our efforts to support container annotations with
(almost) every allocator. Annotating `std::basic_string` with default
allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677.

Additionally it removes `__begin != nullptr` because `data()` should
never return a nullptr.

Support in ASan API exists since
1c5ad6d2c0.
This patch removes the check in std::basic_string annotation member
function (__annotate_contiguous_container) to support different
allocators.

You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.

The motivation for a research and those changes was a bug, found by
Trail of Bits, in a real code where an out-of-bounds read could happen
as two strings were compared via a call to `std::equal` that took
`iter1_begin`, `iter1_end`, `iter2_begin` iterators (with a custom
comparison function). When object `iter1` was longer than `iter2`, read
out-of-bounds on `iter2` could happen. Container sanitization would
detect it.

If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
2024-01-13 18:11:53 +01:00
Tacet
75efddba0f
[ASan][libc++] Initialize __r_ variable with lambda (#77394)
This commit is a refactor (increases readability) and optimization fix.

This is a fixed commit of
https://github.com/llvm/llvm-project/pull/76200 First reverthed here:
1ea7a56057

Please, check original PR for details.

The difference is a return type of the lambda.

Original description:

This commit addresses optimization and instrumentation challenges
encountered within comma constructors.
1) _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS does not work in comma
constructors.
2) Code inside comma constructors is not always correctly optimized.
Problematic code examples:
- `: __r_(((__str.__is_long() ? 0 : (__str.__annotate_delete(), 0)),
std::move(__str.__r_))) {`
- `: __r_(__r_([&](){ if(!__s.__is_long()) __s.__annotate_delete();
return std::move(__s.__r_);}())) {`

However, lambda with argument seems to be correctly optimized. This
patch uses that fact.

Use of lambda based on idea from @ldionne.
2024-01-11 21:40:00 +01:00
Advenam Tacet
1ea7a56057 Revert "[ASan][libc++] String annotations optimizations fix with lambda (#76200)"
This reverts commit c68a9d25e99a096f6862fc4b57dd380a21245d31.
2024-01-09 00:45:34 +01:00