This is technically not necessary in most cases to prevent issues with ADL,
but let's be consistent. This allows us to remove the libcpp-qualify-declval
clang-tidy check, which is now enforced by the robust-against-adl clang-tidy check.
There's no reason not to, and it's easy enough to do using enable_if. As
a drive-by change, also add a missing _LIBCPP_NO_CFI attribute on
__add_alignment_assumption.
Changes:
- Carve out sized but input-only ranges for C++23.
- Call `std::move` for related functions when the iterator is possibly input-only.
Fixes#115727
In https://reviews.llvm.org/D136765 / https://reviews.llvm.org/D144155,
the asan annotations for `std::vector` were modified to unpoison freed
backing memory on destruction, instead of leaving it poisoned. However,
calling `__clear()` instead of `clear()` skips informing the asan runtime
of this decrease in the accessible container size, which breaks the
invariant that the value of `old_mid` should match the value of `new_mid`
from the previous call to `__sanitizer_annotate_contiguous_container`, which
can trip the sanity checks for the partial poison between [d1, d2) and the
container redzone between [d2, c), if enabled. To fix this, ensure that
`clear()` is called instead, as is already done by `__vdeallocate()`.
Also remove `__clear()`, since it is no longer called.
Missing information about begin and end pointers of std::vector can lead
to missed optimizations in LLVM.
This patch adds alignment assumptions at the point where the begin and
end pointers are loaded. If the pointers would not have the same
alignment, end might never get hit when incrementing begin.
See https://github.com/llvm/llvm-project/issues/101372 for a discussion
of missed range check optimizations in hardened mode.
Once https://github.com/llvm/llvm-project/pull/108958 lands, the created
`llvm.assume` calls for the alignment should be folded into the `load`
instructions, resulting in no extra instructions after InstCombine.
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
As a follow-up to #113852, this PR optimizes the performance of the
`insert(const_iterator pos, InputIt first, InputIt last)` function for
`input_iterator`-pair inputs in `std::vector` for cases where
reallocation occurs during insertion. Additionally, this optimization
enhances exception safety by replacing the traditional `try-catch`
mechanism with a modern exception guard for the `insert` function.
The optimization targets cases where insertion trigger reallocation. In
scenarios without reallocation, the implementation remains unchanged.
Previous implementation
-----------------------
The previous implementation of `insert` is inefficient in reallocation
scenarios because it performs the following steps separately:
- `reserve()`: This leads to the first round of relocating old
elements to new memory;
- `rotate()`: This leads to the second round of reorganizing the
existing elements;
- Move-forward: Moves the elements after the insertion position to
their final positions.
- Insert: performs the actual insertion.
This approach results in a lot of redundant operations, requiring the
elements to undergo three rounds of relocations/reorganizations to be
placed in their final positions.
Proposed implementation
-----------------------
The proposed implementation jointly optimize the above 4 steps in the
previous implementation such that each element is placed in its final
position in just one round of relocation. Specifically, this
optimization reduces the total cost from 2 relocations + 1 std::rotate
call to just 1 relocation, without needing to call `std::rotate`,
thereby significantly improving overall performance.
This PR optimizes the input iterator overload of `assign(_InputIterator,
_InputIterator)` in `std::vector<_Tp, _Allocator>` by directly assigning
to already initialized memory, rather than first destroying existing
elements and then constructing new ones. By eliminating unnecessary
destruction and construction, the proposed algorithm enhances the
performance by up to 2x for trivial element types (e.g.,
`std::vector<int>`), up to 2.6x for non-trivial element types like
`std::vector<std::string>`, and up to 3.4x for more complex non-trivial
types (e.g., `std::vector<std::vector<int>>`).
### Google Benchmarks
Benchmark tests (`libcxx/test/benchmarks/vector_operations.bench.cpp`)
were conducted for the `assign()` implementations before and after this
patch. The tests focused on trivial element types like
`std::vector<int>`, and non-trivial element types such as
`std::vector<std::string>` and `std::vector<std::vector<int>>`.
#### Before
```
-------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------------------------------
BM_AssignInputIterIter/vector_int/1024/1024 1157 ns 1169 ns 608188
BM_AssignInputIterIter<32>/vector_string/1024/1024 14559 ns 14710 ns 47277
BM_AssignInputIterIter<32>/vector_vector_int/1024/1024 26846 ns 27129 ns 25925
```
#### After
```
-------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------------------------------
BM_AssignInputIterIter/vector_int/1024/1024 561 ns 566 ns 1242251
BM_AssignInputIterIter<32>/vector_string/1024/1024 5604 ns 5664 ns 128365
BM_AssignInputIterIter<32>/vector_vector_int/1024/1024 7927 ns 8012 ns 88579
```
This PR simplifies the implementation of std::vector's move constructor
with an alternative allocator by invoking __init_with_size() instead of
calling assign(), which ultimately calls __assign_with_size(). The
advantage of using __init_with_size() lies in its internal use of
an exception guard, which simplifies the code. Furthermore, from a
semantic standpoint, it is more intuitive for a constructor to call
an initialization function than an assignment function.
Now that we don't use __compressed_pair anymore inside std::vector, we
can remove some unnecessary accessors. This is a mechanical replacement
of the __alloc() and __end_cap() accessors, and similar for
std::vector<bool>.
Note that I consistently used this->__alloc_ instead of just __alloc_
since most of the code in <vector> uses that pattern to access members.
I don't think this is necessary anymore (and I'm even less certain I
like this), but I went for consistency with surrounding code. If we want
to change that, we can do a follow-up mechanical change.
Related to PR #114423, this PR proposes to unify the naming of the
internal pointer members in `std::vector` and `std::__split_buffer` for
consistency and clarity.
Both `std::vector` and `std::__split_buffer` originally used a
`__compressed_pair<pointer, allocator_type>` member named `__end_cap_`
to store an internal capacity pointer and an allocator. However,
inconsistent naming changes have been made in both classes:
- `std::vector` now uses `__cap_` and `__alloc_` for its internal
pointer and allocator members.
- In contrast, `std::__split_buffer` retains the name `__end_cap_` for
the capacity pointer, along with `__alloc_`.
This inconsistency between the names `__cap_` and `__end_cap_` has
caused confusions (especially to myself when I was working on both
classes). I suggest unifying these names by renaming `__end_cap_` to
`__cap_` in `std::__split_buffer`.
`std::copy` doesn't use the `_AlgPolicy` for anything other than calling
itself with it, so we can just remove the argument. This also removes
the need in a few other algorithms which had an `_AlgPolicy` argument
only to call `copy`.
This helps with readability since the reader doesn't need to jump around
a bunch of times only to read functions that are implemented in 1 or 2
lines of code.
This PR refactors the `__split_buffer` class to eliminate code
duplication in the `push_back` and `push_front` member functions.
**Motivation:**
The lvalue and rvalue reference overloads of `push_back` share identical
logic, which coincides with that of `emplace_back`. Similarly, the
overloads of `push_front` also share identical logic but lack an
`emplace_front` member function, leading to an inconsistency. These
identical internal logics lead to significant code duplication, making
future maintenance more difficult.
**Summary of Refactor:**
This PR reduces code redundancy by:
1. Modifying both overloads of `push_back` to call `emplace_back`.
2. Introducing a new `emplace_front` member function that encapsulates
the logic of `push_front`, allowing both overloads of `push_front` to
call it (The addition of `emplace_front` also avoids the inconsistency
regarding the absence of `emplace_front`).
The refactoring results in reduced code duplication, improved
maintainability, and enhanced readability.
Added exception guard to the `vector(n, x, a)` constructor to enhance
exception safety. This change ensures that the `vector(n, x, a)`
constructor is consistent with other constructors, such as `vector(n)`,
`vector(n, x)`, `vector(n, a)`, in terms of exception safety.