397 Commits

Author SHA1 Message Date
Дмитрий Изволов
51b8c66b08
[libc++] Extend the scope of radix sorting inside std::stable_sort to floating-point types (#129452)
These changes speed up `std::stable_sort` in the case of sorting
floating-point types.
This applies only to IEEE 754 floats.
The speedup is similar to that achieved for integers in PR #104683 (see
benchmarks below).

Why does this worth doing?
Previously, `std::stable_sort` had almost no chance of beating
`std::sort`.
Now there are cases when `std::stable_sort` is preferrable, and the
difference is significant.
```
---------------------------------------------------------------------------
Benchmark             |  std::stable_sort  |   std::sort | std::stable_sort
                      | without radix_sort |             | with radix_sort 
---------------------------------------------------------------------------
float_Random_1        |       1.62 ns      |     2.15 ns |          1.61 ns
float_Random_4        |       18.0 ns      |     2.71 ns |          16.3 ns
float_Random_16       |        118 ns      |      113 ns |           112 ns
float_Random_64       |        751 ns      |      647 ns |           730 ns
float_Random_256      |       4715 ns      |     2937 ns |          4669 ns
float_Random_1024     |      25713 ns      |    13172 ns |          5959 ns <--
float_Random_4096     |     131307 ns      |    56870 ns |         19294 ns <--
float_Random_16384    |     624996 ns      |   242953 ns |         64264 ns <--
float_Random_65536    |    2895661 ns      |  1027279 ns |        288553 ns <--
float_Random_262144   |   13285372 ns      |  4342593 ns |       3022377 ns <--
float_Random_1048576  |   60595871 ns      | 19087591 ns |      18690457 ns <--
float_Random_2097152  |  131336117 ns      | 38800396 ns |      52325016 ns
float_Random_4194304  |  270043042 ns      | 79978019 ns |     102907726 ns
double_Random_1       |       1.60 ns      |     2.15 ns |          1.61 ns
double_Random_4       |       15.2 ns      |     2.70 ns |          16.9 ns
double_Random_16      |        104 ns      |      112 ns |           119 ns
double_Random_64      |        712 ns      |      614 ns |           755 ns
double_Random_256     |       4496 ns      |     2966 ns |          4820 ns
double_Random_1024    |      24722 ns      |    12679 ns |          6189 ns <--
double_Random_4096    |     126075 ns      |    54484 ns |         20999 ns <--
double_Random_16384   |     613782 ns      |   232557 ns |        110276 ns <--
double_Random_65536   |    2894972 ns      |   988531 ns |        774302 ns <--
double_Random_262144  |   13460273 ns      |  4278059 ns |       5115123 ns
double_Random_1048576 |   61119996 ns      | 18408462 ns |      27166574 ns
double_Random_2097152 |  132511525 ns      | 37986158 ns |      54423869 ns
double_Random_4194304 |  272949862 ns      | 77912616 ns |     147670834 ns
```

Comparison for only `std::stable_sort`:
```
Benchmark                                                         Time      Time Old      Time New
--------------------------------------------------------------------------------------------------
BM_StableSort_float_Random_1024                                -0.7997         25438          5096
BM_StableSort_float_Random_4096                                -0.8731        128157         16260
BM_StableSort_float_Random_16384                               -0.9024        621271         60623
BM_StableSort_float_Random_65536                               -0.9081       2922413        268619
BM_StableSort_float_Random_262144                              -0.7766      13386345       2990408
BM_StableSort_float_Random_1048576                             -0.6954      60673010      18481751
BM_StableSort_float_Random_2097152                             -0.6026     130977358      52052182
BM_StableSort_float_Random_4194304                             -0.6252     271556583     101770500
BM_StableSort_float_Ascending_1024                             -0.6430          6711          2396
BM_StableSort_float_Ascending_4096                             -0.7979         38460          7773
BM_StableSort_float_Ascending_16384                            -0.8471        191069         29222
BM_StableSort_float_Ascending_65536                            -0.8683        882321        116194
BM_StableSort_float_Ascending_262144                           -0.8346       3868552        639937
BM_StableSort_float_Ascending_1048576                          -0.7460      16521233       4195953
BM_StableSort_float_Ascending_2097152                          -0.5439      21757532       9922776
BM_StableSort_float_Ascending_4194304                          -0.7525      67847496      16791582
BM_StableSort_float_Descending_1024                            -0.6359         15038          5475
BM_StableSort_float_Descending_4096                            -0.7090         62810         18278
BM_StableSort_float_Descending_16384                           -0.7763        311844         69750
BM_StableSort_float_Descending_65536                           -0.7228       1270513        352202
BM_StableSort_float_Descending_262144                          -0.6785       5484173       1763045
BM_StableSort_float_Descending_1048576                         -0.5084      20223149       9941852
BM_StableSort_float_Descending_2097152                         -0.7646      60523254      14247014
BM_StableSort_float_Descending_4194304                         -0.5638      95706839      41748858
BM_StableSort_float_SingleElement_1024                         +0.3715          1732          2375
BM_StableSort_float_SingleElement_4096                         -0.1685          9357          7781
BM_StableSort_float_SingleElement_16384                        -0.3793         47307         29362
BM_StableSort_float_SingleElement_65536                        -0.4925        227666        115536
BM_StableSort_float_SingleElement_262144                       -0.4271       1075853        616387
BM_StableSort_float_SingleElement_1048576                      -0.3736       5097599       3193279
BM_StableSort_float_SingleElement_2097152                      -0.2470       9854161       7420158
BM_StableSort_float_SingleElement_4194304                      -0.3384      22175964      14670720
BM_StableSort_float_PipeOrgan_1024                             -0.4885         10664          5455
BM_StableSort_float_PipeOrgan_4096                             -0.6340         50095         18337
BM_StableSort_float_PipeOrgan_16384                            -0.7078        238700         69739
BM_StableSort_float_PipeOrgan_65536                            -0.6740       1102419        359378
BM_StableSort_float_PipeOrgan_262144                           -0.7460       4698739       1193511
BM_StableSort_float_PipeOrgan_1048576                          -0.5657      18493972       8032392
BM_StableSort_float_PipeOrgan_2097152                          -0.7116      41089206      11850349
BM_StableSort_float_PipeOrgan_4194304                          -0.6650      83445011      27955737
BM_StableSort_float_QuickSortAdversary_1024                    -0.6863         17402          5460
BM_StableSort_float_QuickSortAdversary_4096                    -0.7715         79864         18247
BM_StableSort_float_QuickSortAdversary_16384                   -0.7800        317480         69839
BM_StableSort_float_QuickSortAdversary_65536                   -0.7400       1357601        352967
BM_StableSort_float_QuickSortAdversary_262144                  -0.6450       5662094       2009769
BM_StableSort_float_QuickSortAdversary_1048576                 -0.5092      21173627      10392107
BM_StableSort_float_QuickSortAdversary_2097152                 -0.7333      61748178      16469993
BM_StableSort_float_QuickSortAdversary_4194304                 -0.5607      98459863      43250182
BM_StableSort_double_Random_1024                               -0.7657         24769          5802
BM_StableSort_double_Random_4096                               -0.8441        126449         19717
BM_StableSort_double_Random_16384                              -0.8269        614910        106447
BM_StableSort_double_Random_65536                              -0.7413       2905000        751427
BM_StableSort_double_Random_262144                             -0.6287      13449514       4994348
BM_StableSort_double_Random_1048576                            -0.5635      60863246      26568349
BM_StableSort_double_Random_2097152                            -0.5959     130293892      52654532
BM_StableSort_double_Random_4194304                            -0.4772     272616445     142526267
BM_StableSort_double_Ascending_1024                            -0.4870          6757          3466
BM_StableSort_double_Ascending_4096                            -0.7360         37592          9923
BM_StableSort_double_Ascending_16384                           -0.7971        183967         37324
BM_StableSort_double_Ascending_65536                           -0.7465        897116        227398
BM_StableSort_double_Ascending_262144                          -0.6764       4020980       1301033
BM_StableSort_double_Ascending_1048576                         -0.6407      16421799       5900751
BM_StableSort_double_Ascending_2097152                         -0.6380      29347139      10622419
BM_StableSort_double_Ascending_4194304                         -0.5934      70439925      28644185
BM_StableSort_double_Descending_1024                           -0.5988         15216          6105
BM_StableSort_double_Descending_4096                           -0.6857         65069         20449
BM_StableSort_double_Descending_16384                          -0.6922        329321        101381
BM_StableSort_double_Descending_65536                          -0.7038       1367970        405242
BM_StableSort_double_Descending_262144                         -0.6472       5361644       1891429
BM_StableSort_double_Descending_1048576                        -0.6656      22031404       7366459
BM_StableSort_double_Descending_2097152                        -0.7593      68922467      16591242
BM_StableSort_double_Descending_4194304                        -0.6392      96283643      34743223
BM_StableSort_double_SingleElement_1024                        +0.9128          1895          3625
BM_StableSort_double_SingleElement_4096                        +0.1475         10013         11490
BM_StableSort_double_SingleElement_16384                       -0.1901         52382         42424
BM_StableSort_double_SingleElement_65536                       -0.2096        254698        201313
BM_StableSort_double_SingleElement_262144                      -0.1833       1248478       1019648
BM_StableSort_double_SingleElement_1048576                     -0.1741       5703397       4710603
BM_StableSort_double_SingleElement_2097152                     -0.1751      10922197       9009835
BM_StableSort_double_SingleElement_4194304                     -0.1538      26571923      22485137
BM_StableSort_double_PipeOrgan_1024                            -0.4406         10752          6014
BM_StableSort_double_PipeOrgan_4096                            -0.5917         49456         20195
BM_StableSort_double_PipeOrgan_16384                           -0.6258        270515        101221
BM_StableSort_double_PipeOrgan_65536                           -0.7098       1159462        336457
BM_StableSort_double_PipeOrgan_262144                          -0.6591       4735711       1614433
BM_StableSort_double_PipeOrgan_1048576                         -0.6620      19353110       6541172
BM_StableSort_double_PipeOrgan_2097152                         -0.7288      49131812      13323391
BM_StableSort_double_PipeOrgan_4194304                         -0.5988      81958974      32878171
BM_StableSort_double_QuickSortAdversary_1024                   -0.6516         17948          6254
BM_StableSort_double_QuickSortAdversary_4096                   -0.7527         82359         20363
BM_StableSort_double_QuickSortAdversary_16384                  -0.7009        340410        101811
BM_StableSort_double_QuickSortAdversary_65536                  -0.6854       1487480        467928
BM_StableSort_double_QuickSortAdversary_262144                 -0.6386       5648460       2041377
BM_StableSort_double_QuickSortAdversary_1048576                -0.6127      22859142       8852587
BM_StableSort_double_QuickSortAdversary_2097152                -0.7161      68693975      19499381
BM_StableSort_double_QuickSortAdversary_4194304                -0.5909      95532179      39077491
OVERALL_GEOMEAN                                                -0.6472             0             0
```
2025-04-16 17:34:11 +08:00
A. Jiang
cfa322fa9a
[libc++][NFC] Reuse __bit_log2 for sort (#135303)
The `__log2i` function template in `<algorithm/sort.h>` is basically
equivalent to `__bit_log2` in `<__bit/bit_log2.h>`. It seems better to
avoid duplication.
2025-04-13 09:45:18 +08:00
James E T Smith
475cbf0ad6
[libc++] Implement ranges::iota (#68494)
# Overview

As a disclaimer, this is my first PR to LLVM and while I've tried to
ensure I've followed the LLVM and libc++ contributing guidelines,
there's probably a good chance I missed something. If I have, just let
me know and I'll try to correct it as soon as I can.

This PR implements `std::ranges::iota` and
`std::ranges::out_value_result` outlined in
[P2440r1](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2440r1.html).

As outlined in the paper above, I've:
- Implemented `out_value_result` and added to `<algorithm>`
- Added `out_value_result`, `iota_result`, and two overloads of `iota`
to `std::ranges` in `<numeric>`
- Updated the version macro `__cpp_lib_ranges_iota` in `<version>`

I've also added tests for `ranges::iota` and `ranges::out_value_result`.
Lastly, I added those structs to the appropriate module files.

Partially implements #105184

EDIT: Forgot to mention in the original post, thanks to @hawkinsw for
taking a look at a preliminary version of this PR!

# TODOs
- [x] Updating the range [status
doc](https://github.com/jamesETsmith/llvm-project/blob/main/libcxx/docs/Status/RangesMajorFeatures.csv)
- [x] Ensure all comments from https://reviews.llvm.org/D121436 are
addressed here
- [X] EDIT (I'll do this in a separate PR). ~~I'm open to implementing
the rest of P2440r1 (`ranges::shift_left` and `ranges::shift_right`) if
that's ok, I just wanted to get feedback on `ranges::iota` first~~
- [x] I've been having trouble building the modules locally and want to
make sure that's working properly

Closes: #134060
2025-04-05 13:46:11 +02:00
Peng Liu
8fdfe3f2a7
[libc++] Refactor ranges::{min, max, min_element, max_element} to use std::__min_element (#132418)
Previously, ranges::min_element delegated to ranges::__min_element_impl, which
duplicated the definition of std::__min_element. This patch updates
ranges::min_element to directly call std::__min_element, which allows
removing the redundant code in ranges::__min_element_impl.

Upon removal of ranges::__min_element_impl, the other ranges algorithms
ranges::{min,max,max_element}, which previously delegated to ranges::__min_element_impl,
have been updated to call std::__min_element instead.

This refactoring unifies the implementation across these algorithms,
ensuring that future optimizations or maintenance work only need to be
applied in one place.
2025-03-27 09:05:37 -04:00
Louis Dionne
b0668d859b
[libc++] Make sure that __desugars_to isn't tripped up by reference_wrapper, const and ref qualifiers (#132092)
Previously, const and ref qualification on an operation would cause
__desugars_to to report false, which would lead to unnecessary
pessimizations. The same holds for reference_wrapper.

In practice, const and ref qualifications on the operation itself are
not relevant to determining whether an operation desugars to something
else or not, so can be ignored.

We are not stripping volatile qualifiers from operations in this patch
because we feel that this requires additional discussion.

Fixes #129312
2025-03-25 14:29:06 -04:00
Louis Dionne
aa80388cf9
[libc++] Ensure that we vectorize algorithms on all Clang-based compilers (#132090)
Otherwise, we wouldn't vectorize on compilers like AppleClang when in
reality we know perfectly well how to do it.
2025-03-24 13:22:40 -04:00
Nikolas Klauser
fb44f00641
[libc++] Add [[gnu::nodebug]] on type traits (#128502) 2025-03-23 21:01:25 +01:00
A. Jiang
053a714add
[libc++] Implement part of P2562R1: constexpr ranges::inplace_merge (#131947)
Drive-by changes:
- Consistently mark `std::__inplace_merge::__inplace_merge_impl`
`_LIBCPP_CONSTEXPR_SINCE_CXX26`.
- This function template is only called by other functions that becomes
constexpr since C++26, and it itself calls `std::__inplace_merge` that
is constexpr since C++26.
- Unblock related test coverage in constant evaluation for
`stable_partition`, `ranges::stable_sort`, `std::stable_sort`,
`std::stable_partition`, and `std::inplace_merge`.
2025-03-20 01:59:32 +08:00
Peng Liu
029cb8aff1
[libc++] Fix copy_backward for vector<bool> with small storage types (#131560)
This patch fixes an issue in libc++ where `std::copy_backward` and
`ranges::copy_backward` incorrectly copy `std::vector<bool>` with small
storage types (e.g., `uint8_t`, `uint16_t`). The problem arises from
flawed bit mask computations involving integral promotions and sign-bit
extension, leading to unintended zeroing of bits. This patch corrects
the bitwise operations to ensure correct bit-level copying.

Fixes #131718.
2025-03-19 11:58:02 -04:00
Peng Liu
a5b3d3a03f
[libc++] Fix {std, ranges}::copy for vector<bool> with small storage types (#131545)
The current implementation of `{std, ranges}::copy` fails to copy
`vector<bool>` correctly when the underlying storage type
(`__storage_type`) is smaller than `int`, such as `unsigned char`,
`unsigned short`, `uint8_t` and `uint16_t`. The root cause is that the
unsigned small storage type undergoes integer promotion to (signed)
`int`, which is then left and right shifted, leading to UB (before
C++20) and sign-bit extension (since C++20) respectively. As a result,
the underlying bit mask evaluations become incorrect, causing erroneous
copying behavior.

This patch resolves the issue by correcting the internal bitwise
operations, ensuring that `{std, ranges}::copy` operates correctly for
`vector<bool>` with any custom (unsigned) storage types.

Fixes #131692.
2025-03-19 11:55:51 -04:00
Peng Liu
c5195ae2d0
[libc++] Fix {std, ranges}::equal for vector<bool> with small storage types (#130394)
The current implementation of `{std, ranges}::equal` fails to correctly
compare `vector<bool>`s when the underlying storage type is smaller than
`int` (e.g., `unsigned char`, `unsigned short`, `uint8_t` and
`uint16_t`). See [demo](https://godbolt.org/z/j4s87s6b3)). The problem
arises due to integral promotions on the intermediate bitwise
operations, leading to incorrect final equality comparison results. This
patch fixes the issue by ensuring that `{std, ranges}::equal` operate
properly for both aligned and unaligned bits.
 
Fixes #126369.
2025-03-19 11:51:21 -04:00
Peng Liu
e53bea5182
[libc++] Fix ambiguous call in {ranges, std}::count (#122529)
This PR fixes an ambiguous call encountered while using the
`std::ranges::count` and `std::count` algorithms with `vector<bool>`
with small `size_type`s.

The ambiguity arises from integral promotions during the internal
bitwise arithmetic of the `count` algorithms for small integral types.
This results in multiple viable candidates:
`__libcpp_popcount(unsigned)`,` __libcpp_popcount(unsigned long)`, and
`__libcpp_popcount(unsigned long long)`, leading to an ambiguous call
error. To resolve this ambiguity, we introduce a dispatcher function,
`__popcount`, which directs calls to the appropriate overloads of
`__libcpp_popcount`. This closes #122528.
2025-03-19 11:36:29 -04:00
A. Jiang
854a4f2bbb
[libc++] Implement part of P2562R1: constexpr std::inplace_merge (#129008)
Drive-by:
- Adds `constexpr_random.h` for pseudo-randomizing or shuffling in tests
for constant evaluation.
2025-03-19 07:42:23 +08:00
Peng Liu
3bd71cbec7
[libc++] Fix ambiguous call in {ranges, std}::find (#122641)
This PR fixes an ambiguous call encountered when using the `std::ranges::find` or `std::find`
algorithms with `vector<bool>` with small `allocator_traits::size_type`s, an issue reported
in #122528. The ambiguity arises from integral promotions during the internal bitwise
arithmetic of the `find` algorithms when applied to `vector<bool>` with small integral
`size_type`s. This leads to multiple viable candidates for small integral types:
__libcpp_ctz(unsigned), __libcpp_ctz(unsigned long), and __libcpp_ctz(unsigned long long),
none of which represent a single best viable match, resulting in an ambiguous call error.

To resolve this, we propose invoking an internal function __countr_zero as a dispatcher
that directs the call to the appropriate overload of __libcpp_ctz. Necessary amendments
have also been made to __countr_zero.
2025-03-13 14:15:03 -04:00
Peng Liu
4baf1c03fa
[libc++] Optimize ranges::rotate for vector<bool>::iterator (#121168)
This PR optimizes the performance of `std::ranges::rotate` for
`vector<bool>::iterator`. The optimization yields a performance
improvement of up to 2096x.

Closes #64038.
2025-03-13 14:07:23 -04:00
A. Jiang
ba9aeedf8e
[libc++] Implement part of P2562R1: constexpr ranges::stable_sort (#128860)
Drive-by: Enables test coverage for `ranges::stable_sort` with proxy
iterators, and changes "constexpr in" to "constexpr since" in comments
in `<algorithm>`.
2025-03-07 01:27:48 +08:00
A. Jiang
c28c508962
[libc++] Implement part of P2562R1: constexpr ranges::stable_partition (#129839) 2025-03-06 09:23:55 +08:00
Peng Liu
a12744ff05
[libc++] Optimize ranges::swap_ranges for vector<bool>::iterator (#121150)
This PR optimizes the performance of `std::ranges::swap_ranges` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to **611x** for
aligned range swap and **78x** for unaligned range swap comparison.
Additionally, comprehensive tests covering up to 4 storage words (256
bytes) with odd and even bit sizes are provided, which validate the
proposed optimizations in this patch.
2025-03-04 17:15:36 -05:00
A. Jiang
bf9bf291a3
[libc++] Implement part of P2562R1: constexpr std::stable_partition (#128868)
Drive-by changes:
- Enables no-memory case for Clang.
- Enables `robust_re_difference_type.compile.pass.cpp` and
`robust_against_proxy_iterators_lifetime_bugs.pass.cpp` test coverage
for `std::stable_sort` in constant evaluation since C++26. The changes
were missing in the PR making `std::stable_sort` `constexpr`.
2025-03-04 09:23:29 +08:00
Nikolas Klauser
1a6f9fd87f
[libc++] Enable algorithm vectorization on arm neon (#128873)
Previously the wrong detection macro has been used to check whether arm
NEON is available. This fixes it, and removes a few unnecessary includes
from `__algorithm/simd_utils.h` as a drive-by.
2025-02-28 13:38:52 +01:00
Peng Liu
7717a549e9
[libc++] Optimize ranges::equal for vector<bool>::iterator (#121084)
This PR optimizes the performance of `std::ranges::equal` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to 188x for
aligned equality comparison and 82x for unaligned equality
comparison. Moreover, comprehensive tests covering up to 4 storage words
(256 bytes) with odd and even bit sizes are provided, which validate the
proposed optimizations in this patch.
2025-02-26 12:18:25 -05:00
Louis Dionne
5e26fb1699
[libc++] Qualify calls to nullary functions like __throw_foo (#122465)
This is technically not necessary in most cases to prevent issues with ADL,
but let's be consistent. This allows us to remove the libcpp-qualify-declval
clang-tidy check, which is now enforced by the robust-against-adl clang-tidy check.
2025-02-21 07:59:46 -05:00
Peng Liu
ab3d793982
[libc++] Optimize ranges::move{,_backward} for vector<bool>::iterator (#121109)
As a follow-up to #121013 (which optimized `ranges::copy`) and #121026
(which optimized `ranges::copy_backward`), this PR enhances the
performance of `std::ranges::{move, move_backward}` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.

The optimizations bring performance improvements analogous to those
achieved for the `{copy, copy_backward}` algorithms: up to 2000x for
aligned moves and 60x for unaligned moves. Moreover, comprehensive
tests covering up to 4 storage words (256 bytes) with odd and even bit
sizes are provided, which validate the proposed optimizations in this
patch.
2025-02-19 11:36:45 -05:00
Дмитрий Изволов
6e402f5121
[libc++] Support constexpr for std::stable_sort in radix sort branch (#125284)
`std::stable_sort` is `constexpr` since PR
https://github.com/llvm/llvm-project/pull/110320
But `radix_sort` branch is still non-`constexpr`.
This PR fixes it.

#119394
#105360
2025-02-06 09:07:25 +01:00
Peng Liu
cf9806eb4d
[libc++] Fix UB in bitwise logic of {std, ranges}::{fill, fill_n} algorithms (#122410)
This PR addresses an undefined behavior that arises when using the
`std::fill` and `std::fill_n` algorithms, as well as their ranges
counterparts `ranges::fill` and `ranges::fill_n`, with `vector<bool, Alloc>`
that utilizes a custom-sized allocator with small integral types.
2025-02-05 11:39:49 -05:00
Peng Liu
edc3dc6abd
[libc++] Optimize ranges::copy_backward for vector<bool>::iterator (#121026)
As a follow-up to #121013 (which focused on `std::ranges::copy`), this
PR optimizes the performance of `std::ranges::copy_backward` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to 2000x for
aligned copies and 60x for unaligned copies.
2025-01-30 14:55:05 -05:00
Peng Liu
5b65896ad6
[libc++] Optimize ranges::copy{, _n} for vector<bool>::iterator (#121013)
This PR optimizes the performance of `std::ranges::copy` and
`std::ranges::copy_n` specifically for `vector<bool>::iterator`,
addressing a subtask outlined in issue #64038. The optimizations yield
performance improvements of up to **2000x** for aligned copies and
**60x** for unaligned copies. Additionally, new tests have been added to
validate these enhancements.


- Aligned source-destination bits

ranges::copy
```
--------------------------------------------------------------------------
Benchmark                                Before        After   Improvement
--------------------------------------------------------------------------
bm_ranges_copy_vb_aligned/8              10.8 ns      1.42 ns           8x
bm_ranges_copy_vb_aligned/64             88.5 ns      2.28 ns          39x
bm_ranges_copy_vb_aligned/512             709 ns      1.95 ns         364x
bm_ranges_copy_vb_aligned/4096           5568 ns      5.01 ns        1111x
bm_ranges_copy_vb_aligned/32768         44754 ns      38.7 ns        1156x
bm_ranges_copy_vb_aligned/65536         91092 ns      73.2 ns        1244x
bm_ranges_copy_vb_aligned/102400       139473 ns       127 ns        1098x
bm_ranges_copy_vb_aligned/106496       189004 ns      81.5 ns        2319x
bm_ranges_copy_vb_aligned/110592       153647 ns      71.1 ns        2161x
bm_ranges_copy_vb_aligned/114688       159261 ns      70.2 ns        2269x
bm_ranges_copy_vb_aligned/118784       181910 ns      73.5 ns        2475x
bm_ranges_copy_vb_aligned/122880       174117 ns      76.5 ns        2276x
bm_ranges_copy_vb_aligned/126976       176020 ns      82.0 ns        2147x
bm_ranges_copy_vb_aligned/131072       180757 ns       137 ns        1319x
bm_ranges_copy_vb_aligned/135168       190342 ns       158 ns        1205x
bm_ranges_copy_vb_aligned/139264       192831 ns       103 ns        1872x
bm_ranges_copy_vb_aligned/143360       199627 ns      89.4 ns        2233x
bm_ranges_copy_vb_aligned/147456       203881 ns      88.6 ns        2301x
bm_ranges_copy_vb_aligned/151552       213345 ns      88.4 ns        2413x
bm_ranges_copy_vb_aligned/155648       216892 ns      92.9 ns        2335x
bm_ranges_copy_vb_aligned/159744       222751 ns      96.4 ns        2311x
bm_ranges_copy_vb_aligned/163840       225995 ns       173 ns        1306x
bm_ranges_copy_vb_aligned/167936       235230 ns       202 ns        1165x
bm_ranges_copy_vb_aligned/172032       244093 ns       131 ns        1863x
bm_ranges_copy_vb_aligned/176128       244434 ns       111 ns        2202x
bm_ranges_copy_vb_aligned/180224       249570 ns       108 ns        2311x
bm_ranges_copy_vb_aligned/184320       254538 ns       108 ns        2357x
bm_ranges_copy_vb_aligned/188416       261817 ns       113 ns        2317x
bm_ranges_copy_vb_aligned/192512       269923 ns       125 ns        2159x
bm_ranges_copy_vb_aligned/196608       273494 ns       210 ns        1302x
bm_ranges_copy_vb_aligned/200704       280035 ns       269 ns        1041x
bm_ranges_copy_vb_aligned/204800       293102 ns       231 ns        1269x
```

ranges::copy_n
```
--------------------------------------------------------------------------
Benchmark                                Before        After   Improvement
--------------------------------------------------------------------------
bm_ranges_copy_n_vb_aligned/8            11.8 ns       0.89 ns         13x
bm_ranges_copy_n_vb_aligned/64           91.6 ns       2.06 ns         44x
bm_ranges_copy_n_vb_aligned/512           718 ns       2.45 ns        293x
bm_ranges_copy_n_vb_aligned/4096         5750 ns       5.02 ns       1145x
bm_ranges_copy_n_vb_aligned/32768       45824 ns       40.9 ns       1120x
bm_ranges_copy_n_vb_aligned/65536       92267 ns       73.8 ns       1250x
bm_ranges_copy_n_vb_aligned/102400     143267 ns       125 ns        1146x
bm_ranges_copy_n_vb_aligned/106496     148625 ns      82.4 ns        1804x
bm_ranges_copy_n_vb_aligned/110592     154817 ns      72.0 ns        2150x
bm_ranges_copy_n_vb_aligned/114688     157953 ns      70.4 ns        2244x
bm_ranges_copy_n_vb_aligned/118784     162374 ns      71.5 ns        2270x
bm_ranges_copy_n_vb_aligned/122880     168638 ns      72.9 ns        2313x
bm_ranges_copy_n_vb_aligned/126976     175596 ns      76.6 ns        2292x
bm_ranges_copy_n_vb_aligned/131072     181164 ns       135 ns        1342x
bm_ranges_copy_n_vb_aligned/135168     184697 ns       157 ns        1176x
bm_ranges_copy_n_vb_aligned/139264     191395 ns       104 ns        1840x
bm_ranges_copy_n_vb_aligned/143360     194954 ns      88.3 ns        2208x
bm_ranges_copy_n_vb_aligned/147456     208917 ns      86.1 ns        2426x
bm_ranges_copy_n_vb_aligned/151552     211101 ns      87.2 ns        2421x
bm_ranges_copy_n_vb_aligned/155648     213175 ns      89.0 ns        2395x
bm_ranges_copy_n_vb_aligned/159744     218988 ns      86.7 ns        2526x
bm_ranges_copy_n_vb_aligned/163840     225263 ns       156 ns        1444x
bm_ranges_copy_n_vb_aligned/167936     230725 ns       184 ns        1254x
bm_ranges_copy_n_vb_aligned/172032     235795 ns       119 ns        1981x
bm_ranges_copy_n_vb_aligned/176128     241145 ns       101 ns        2388x
bm_ranges_copy_n_vb_aligned/180224     250680 ns      99.5 ns        2519x
bm_ranges_copy_n_vb_aligned/184320     262954 ns      99.7 ns        2637x
bm_ranges_copy_n_vb_aligned/188416     258584 ns       103 ns        2510x
bm_ranges_copy_n_vb_aligned/192512     267190 ns       125 ns        2138x
bm_ranges_copy_n_vb_aligned/196608     270821 ns       213 ns        1271x
bm_ranges_copy_n_vb_aligned/200704     279532 ns       262 ns        1067x
bm_ranges_copy_n_vb_aligned/204800     283412 ns       222 ns        1277x
```

- Unaligned source-destination bits
```
--------------------------------------------------------------------------------
Benchmark                                    Before           After  Improvement
--------------------------------------------------------------------------------
bm_ranges_copy_vb_unaligned/8               12.8 ns         8.59 ns         1.5x
bm_ranges_copy_vb_unaligned/64              98.2 ns         8.24 ns          12x
bm_ranges_copy_vb_unaligned/512              755 ns         18.1 ns          42x
bm_ranges_copy_vb_unaligned/4096            6027 ns          102 ns          59x
bm_ranges_copy_vb_unaligned/32768          47663 ns          774 ns          62x
bm_ranges_copy_vb_unaligned/262144        378981 ns         6455 ns          59x
bm_ranges_copy_vb_unaligned/1048576      1520486 ns        25942 ns          59x
bm_ranges_copy_n_vb_unaligned/8             11.3 ns         8.22 ns         1.4x
bm_ranges_copy_n_vb_unaligned/64            97.3 ns         7.89 ns          12x
bm_ranges_copy_n_vb_unaligned/512            747 ns         18.1 ns          41x
bm_ranges_copy_n_vb_unaligned/4096          5932 ns         99.0 ns          60x
bm_ranges_copy_n_vb_unaligned/32768        47776 ns         749 ns           64x
bm_ranges_copy_n_vb_unaligned/262144      378802 ns        6576 ns           58x
bm_ranges_copy_n_vb_unaligned/1048576    1547234 ns       26229 ns           59x
```
2025-01-30 17:26:26 +01:00
Nikolas Klauser
24e70e3930
[libc++] Switch experimental library macros to 0/1 macros (#124030)
This is a continuation of what's been started in #89178.

As a drive-by, this also changes the PSTL macro to say `EXPERIMENTAL`
instead of `INCOMPLETE`.
2025-01-24 09:34:42 +01:00
Nikolas Klauser
0fa05456a8
[libc++] Define an internal API for std::invoke and friends (#116637)
Currently we're using quite different internal names for the
`std::invoke` family of type traits. This adds a layer around the
current implementation to make it easier to understand when it is used
and makes it easier to define multiple implementations of it.
2025-01-20 18:00:15 +01:00
Peng Liu
493c066a3d
[libc++] Fix ambiguity due to non-uglified member typedefs (#121664)
This PR fixes the ambiguities in name lookup caused by non-standard
member typedefs `size_type` and `difference_type` in `std::bitset`.

Follows up #121620.
Closes #121618.
2025-01-14 10:30:30 -05:00
PaulXiCao
438e2ccd4a
[libc++] Make std::stable_sort constexpr friendly (#110320)
Implementing `constexpr std::stable_sort`. This is part of P2562R1,
tracked via issue #105360.

Closes #119394

Co-authored-by: A. Jiang <de34@live.cn>
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2025-01-14 10:24:35 -05:00
Louis Dionne
c492a228e9 [libc++] Add missing _LIBCPP_NODEBUG on internal aliases 2025-01-09 16:03:39 -05:00
Дмитрий Изволов
69b54c1a05
[libcxx][algorithm] Optimize std::stable_sort via radix sort algorithm (#104683)
The radix sort (LSD) algorithm allows to speed up std::stable_sort
dramatically in case we sort integers.
The speed up varies from a relatively small to x10 times, depending on
type of sorted elements and the initial state of the sorted array.

```
Running ./libcxx/test/benchmarks/stable_sort.bench.out
Run on (12 X 2600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB
  L1 Instruction 32 KiB
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB
Load Average: 3.48, 3.38, 3.08
---------------------------------------------------------------------------
Benchmark                                               After        Before 
---------------------------------------------------------------------------
BM_StableSort_int8_Random_1                           3.39 ns       3.58 ns 
BM_StableSort_int8_Random_4                           21.1 ns       21.9 ns 
BM_StableSort_int8_Random_16                           142 ns        147 ns 
BM_StableSort_int8_Random_64                           893 ns        903 ns 
BM_StableSort_int8_Random_256                          409 ns       5810 ns 
BM_StableSort_int8_Random_1024                        1235 ns      29973 ns 
BM_StableSort_int8_Random_4096                        4410 ns     141880 ns 
BM_StableSort_int8_Random_16384                      18044 ns     620540 ns 
BM_StableSort_int8_Random_65536                     144030 ns    2592013 ns 
BM_StableSort_int8_Random_262144                    858350 ns   10935814 ns 
BM_StableSort_int8_Random_524288                   2929988 ns   27060729 ns 
BM_StableSort_int8_Random_1048576                  6058292 ns   49622720 ns 
BM_StableSort_int8_Ascending_1                        3.42 ns       3.92 ns 
BM_StableSort_int8_Ascending_4                        5.86 ns       8.08 ns 
BM_StableSort_int8_Ascending_16                       10.6 ns       12.0 ns 
BM_StableSort_int8_Ascending_64                       28.9 ns       30.6 ns 
BM_StableSort_int8_Ascending_256                       415 ns        391 ns 
BM_StableSort_int8_Ascending_1024                     1666 ns       2309 ns 
BM_StableSort_int8_Ascending_4096                     7748 ns      12269 ns 
BM_StableSort_int8_Ascending_16384                   40588 ns      60181 ns 
BM_StableSort_int8_Ascending_65536                  178843 ns     298221 ns 
BM_StableSort_int8_Ascending_262144                 919959 ns    1402692 ns 
BM_StableSort_int8_Ascending_524288                2397397 ns    3036984 ns 
BM_StableSort_int8_Ascending_1048576               5080043 ns    7218581 ns 
BM_StableSort_int8_Descending_1                       3.44 ns       3.53 ns 
BM_StableSort_int8_Descending_4                       7.94 ns       8.29 ns 
BM_StableSort_int8_Descending_16                      59.6 ns       57.7 ns 
BM_StableSort_int8_Descending_64                      1051 ns       1027 ns 
BM_StableSort_int8_Descending_256                      422 ns       4718 ns 
BM_StableSort_int8_Descending_1024                    1676 ns      21044 ns 
BM_StableSort_int8_Descending_4096                    7766 ns      64827 ns 
BM_StableSort_int8_Descending_16384                  40230 ns      93981 ns 
BM_StableSort_int8_Descending_65536                 190978 ns     421151 ns 
BM_StableSort_int8_Descending_262144               1055141 ns    1918927 ns 
BM_StableSort_int8_Descending_524288               2875115 ns    3809153 ns 
BM_StableSort_int8_Descending_1048576              5854135 ns    8713690 ns 
BM_StableSort_int8_SingleElement_1                    3.52 ns       3.46 ns 
BM_StableSort_int8_SingleElement_4                    6.25 ns       5.79 ns 
BM_StableSort_int8_SingleElement_16                   10.7 ns       11.4 ns 
BM_StableSort_int8_SingleElement_64                   29.3 ns       30.3 ns 
BM_StableSort_int8_SingleElement_256                   858 ns        380 ns 
BM_StableSort_int8_SingleElement_1024                 3036 ns       2231 ns 
BM_StableSort_int8_SingleElement_4096                11580 ns      11866 ns 
BM_StableSort_int8_SingleElement_16384               44956 ns      59621 ns 
BM_StableSort_int8_SingleElement_65536              182006 ns     297853 ns 
BM_StableSort_int8_SingleElement_262144             962181 ns    1432857 ns 
BM_StableSort_int8_SingleElement_524288            2256687 ns    2975707 ns 
BM_StableSort_int8_SingleElement_1048576           4522556 ns    6949948 ns 
BM_StableSort_int8_PipeOrgan_1                        3.26 ns       3.64 ns 
BM_StableSort_int8_PipeOrgan_4                        6.21 ns       6.58 ns 
BM_StableSort_int8_PipeOrgan_16                       23.7 ns       25.4 ns 
BM_StableSort_int8_PipeOrgan_64                        250 ns        248 ns 
BM_StableSort_int8_PipeOrgan_256                       414 ns       2498 ns 
BM_StableSort_int8_PipeOrgan_1024                     1697 ns      10946 ns 
BM_StableSort_int8_PipeOrgan_4096                     7840 ns      37238 ns 
BM_StableSort_int8_PipeOrgan_16384                   41402 ns      74805 ns 
BM_StableSort_int8_PipeOrgan_65536                  180107 ns     357891 ns 
BM_StableSort_int8_PipeOrgan_262144                 988273 ns    1647296 ns 
BM_StableSort_int8_PipeOrgan_524288                2547374 ns    3245991 ns 
BM_StableSort_int8_PipeOrgan_1048576               5128783 ns    7342444 ns 
BM_StableSort_int8_QuickSortAdversary_1               3.14 ns       4.01 ns 
BM_StableSort_int8_QuickSortAdversary_4               6.05 ns       7.02 ns 
BM_StableSort_int8_QuickSortAdversary_16              10.5 ns       11.9 ns 
BM_StableSort_int8_QuickSortAdversary_64               520 ns        516 ns 
BM_StableSort_int8_QuickSortAdversary_256              920 ns        386 ns 
BM_StableSort_int8_QuickSortAdversary_1024            3083 ns       2299 ns 
BM_StableSort_int8_QuickSortAdversary_4096           11659 ns      12295 ns 
BM_StableSort_int8_QuickSortAdversary_16384          45721 ns      60931 ns 
BM_StableSort_int8_QuickSortAdversary_65536         186334 ns     295423 ns 
BM_StableSort_int8_QuickSortAdversary_262144        946262 ns    1399973 ns 
BM_StableSort_int8_QuickSortAdversary_524288       2282004 ns    2832266 ns 
BM_StableSort_int8_QuickSortAdversary_1048576      4691123 ns    6963253 ns 
BM_StableSort_uint8_Random_1                          3.11 ns       3.44 ns 
BM_StableSort_uint8_Random_4                          21.9 ns       23.1 ns 
BM_StableSort_uint8_Random_16                          154 ns        171 ns 
BM_StableSort_uint8_Random_64                         1000 ns       1051 ns 
BM_StableSort_uint8_Random_256                         402 ns       6498 ns 
BM_StableSort_uint8_Random_1024                       1176 ns      35310 ns 
BM_StableSort_uint8_Random_4096                       4415 ns     164087 ns 
BM_StableSort_uint8_Random_16384                     17849 ns     686769 ns 
BM_StableSort_uint8_Random_65536                    146109 ns    2932051 ns 
BM_StableSort_uint8_Random_262144                   876710 ns   12163988 ns 
BM_StableSort_uint8_Random_524288                  2858089 ns   26458830 ns 
BM_StableSort_uint8_Random_1048576                 5766942 ns   54836214 ns 
BM_StableSort_uint8_Ascending_1                       3.11 ns       3.43 ns 
BM_StableSort_uint8_Ascending_4                       6.18 ns       7.24 ns 
BM_StableSort_uint8_Ascending_16                      14.5 ns       17.0 ns 
BM_StableSort_uint8_Ascending_64                      50.7 ns       59.2 ns 
BM_StableSort_uint8_Ascending_256                      395 ns        536 ns 
BM_StableSort_uint8_Ascending_1024                    1752 ns       2956 ns 
BM_StableSort_uint8_Ascending_4096                    7785 ns      15146 ns 
BM_StableSort_uint8_Ascending_16384                  41442 ns      74136 ns 
BM_StableSort_uint8_Ascending_65536                 180879 ns     354261 ns 
BM_StableSort_uint8_Ascending_262144                945880 ns    1674256 ns 
BM_StableSort_uint8_Ascending_524288               2287832 ns    3138581 ns 
BM_StableSort_uint8_Ascending_1048576              4630290 ns    7296278 ns 
BM_StableSort_uint8_Descending_1                      3.19 ns       3.63 ns 
BM_StableSort_uint8_Descending_4                      9.60 ns       11.5 ns 
BM_StableSort_uint8_Descending_16                     78.3 ns       86.0 ns 
BM_StableSort_uint8_Descending_64                     1265 ns       1308 ns 
BM_StableSort_uint8_Descending_256                     395 ns       6556 ns 
BM_StableSort_uint8_Descending_1024                   1712 ns      24669 ns 
BM_StableSort_uint8_Descending_4096                   7748 ns      83407 ns 
BM_StableSort_uint8_Descending_16384                 40779 ns     104043 ns 
BM_StableSort_uint8_Descending_65536                181560 ns     467680 ns 
BM_StableSort_uint8_Descending_262144              1146627 ns    2102769 ns 
BM_StableSort_uint8_Descending_524288              2874096 ns    4572229 ns 
BM_StableSort_uint8_Descending_1048576             5873195 ns   10170663 ns 
BM_StableSort_uint8_SingleElement_1                   3.28 ns       3.58 ns 
BM_StableSort_uint8_SingleElement_4                   6.44 ns       7.40 ns 
BM_StableSort_uint8_SingleElement_16                  14.9 ns       16.4 ns 
BM_StableSort_uint8_SingleElement_64                  51.2 ns       52.9 ns 
BM_StableSort_uint8_SingleElement_256                  876 ns        490 ns 
BM_StableSort_uint8_SingleElement_1024                3041 ns       2750 ns 
BM_StableSort_uint8_SingleElement_4096               11947 ns      14326 ns 
BM_StableSort_uint8_SingleElement_16384              46669 ns      69984 ns 
BM_StableSort_uint8_SingleElement_65536             197903 ns     328961 ns 
BM_StableSort_uint8_SingleElement_262144           1031466 ns    1551436 ns 
BM_StableSort_uint8_SingleElement_524288           2447672 ns    3049553 ns 
BM_StableSort_uint8_SingleElement_1048576          4793087 ns    7615245 ns 
BM_StableSort_uint8_PipeOrgan_1                       3.38 ns       3.56 ns 
BM_StableSort_uint8_PipeOrgan_4                       7.16 ns       8.70 ns 
BM_StableSort_uint8_PipeOrgan_16                      31.7 ns       35.3 ns 
BM_StableSort_uint8_PipeOrgan_64                       326 ns        366 ns 
BM_StableSort_uint8_PipeOrgan_256                      409 ns       2942 ns 
BM_StableSort_uint8_PipeOrgan_1024                    1994 ns      12571 ns 
BM_StableSort_uint8_PipeOrgan_4096                    8086 ns      46278 ns 
BM_StableSort_uint8_PipeOrgan_16384                  41749 ns      79813 ns 
BM_StableSort_uint8_PipeOrgan_65536                 180697 ns     375120 ns 
BM_StableSort_uint8_PipeOrgan_262144               1004899 ns    1676143 ns 
BM_StableSort_uint8_PipeOrgan_524288               2456081 ns    3333949 ns 
BM_StableSort_uint8_PipeOrgan_1048576              5030857 ns    7591303 ns 
BM_StableSort_uint8_QuickSortAdversary_1              3.12 ns       3.46 ns 
BM_StableSort_uint8_QuickSortAdversary_4              7.25 ns       6.83 ns 
BM_StableSort_uint8_QuickSortAdversary_16             14.6 ns       16.2 ns 
BM_StableSort_uint8_QuickSortAdversary_64              650 ns        665 ns 
BM_StableSort_uint8_QuickSortAdversary_256             395 ns       2982 ns 
BM_StableSort_uint8_QuickSortAdversary_1024           3125 ns       2583 ns 
BM_StableSort_uint8_QuickSortAdversary_4096          11797 ns      13929 ns 
BM_StableSort_uint8_QuickSortAdversary_16384         45803 ns      66513 ns 
BM_StableSort_uint8_QuickSortAdversary_65536        190745 ns     313467 ns 
BM_StableSort_uint8_QuickSortAdversary_262144       974646 ns    1469014 ns 
BM_StableSort_uint8_QuickSortAdversary_524288      2317553 ns    3022065 ns 
BM_StableSort_uint8_QuickSortAdversary_1048576     4898703 ns    6854079 ns 
BM_StableSort_int16_Random_1                          3.94 ns       3.49 ns 
BM_StableSort_int16_Random_4                          20.8 ns       23.2 ns 
BM_StableSort_int16_Random_16                          133 ns        163 ns 
BM_StableSort_int16_Random_64                          903 ns        953 ns 
BM_StableSort_int16_Random_256                        5638 ns       6258 ns 
BM_StableSort_int16_Random_1024                       3056 ns      34587 ns 
BM_StableSort_int16_Random_4096                      10596 ns     168397 ns 
BM_StableSort_int16_Random_16384                     49908 ns     753031 ns 
BM_StableSort_int16_Random_65536                    444605 ns    3838368 ns 
BM_StableSort_int16_Random_262144                  2419345 ns   15657285 ns 
BM_StableSort_int16_Random_524288                  7984040 ns   32726933 ns 
BM_StableSort_int16_Random_1048576                16092424 ns   67999766 ns 
BM_StableSort_int16_Ascending_1                       3.40 ns       3.43 ns 
BM_StableSort_int16_Ascending_4                       5.45 ns       5.79 ns 
BM_StableSort_int16_Ascending_16                      12.0 ns       15.3 ns 
BM_StableSort_int16_Ascending_64                      39.6 ns       52.6 ns 
BM_StableSort_int16_Ascending_256                      470 ns        550 ns 
BM_StableSort_int16_Ascending_1024                    1686 ns       2707 ns 
BM_StableSort_int16_Ascending_4096                    5676 ns      14165 ns 
BM_StableSort_int16_Ascending_16384                  21413 ns      69483 ns 
BM_StableSort_int16_Ascending_65536                  88010 ns     334466 ns 
BM_StableSort_int16_Ascending_262144                567239 ns    1570620 ns 
BM_StableSort_int16_Ascending_524288               1553063 ns    3424666 ns 
BM_StableSort_int16_Ascending_1048576              3145577 ns    8499649 ns 
BM_StableSort_int16_Descending_1                      3.22 ns       3.54 ns 
BM_StableSort_int16_Descending_4                      6.85 ns       10.2 ns 
BM_StableSort_int16_Descending_16                     62.7 ns       62.2 ns 
BM_StableSort_int16_Descending_64                     1138 ns       1036 ns 
BM_StableSort_int16_Descending_256                    5541 ns       4696 ns 
BM_StableSort_int16_Descending_1024                   3046 ns      19577 ns 
BM_StableSort_int16_Descending_4096                  10962 ns      79149 ns 
BM_StableSort_int16_Descending_16384                 58182 ns     327709 ns 
BM_StableSort_int16_Descending_65536                447025 ns    1424896 ns 
BM_StableSort_int16_Descending_262144              1104973 ns    5921903 ns 
BM_StableSort_int16_Descending_524288              2547840 ns   17956789 ns 
BM_StableSort_int16_Descending_1048576             5093555 ns   17044318 ns 
BM_StableSort_int16_SingleElement_1                   3.56 ns       3.96 ns 
BM_StableSort_int16_SingleElement_4                   5.75 ns       6.72 ns 
BM_StableSort_int16_SingleElement_16                  12.4 ns       16.1 ns 
BM_StableSort_int16_SingleElement_64                  36.9 ns       54.4 ns 
BM_StableSort_int16_SingleElement_256                  473 ns        557 ns 
BM_StableSort_int16_SingleElement_1024                1828 ns       2826 ns 
BM_StableSort_int16_SingleElement_4096                6239 ns      14252 ns 
BM_StableSort_int16_SingleElement_16384              23695 ns      70369 ns 
BM_StableSort_int16_SingleElement_65536              93281 ns     361641 ns 
BM_StableSort_int16_SingleElement_262144            599078 ns    1640216 ns 
BM_StableSort_int16_SingleElement_524288           1659678 ns    3343087 ns 
BM_StableSort_int16_SingleElement_1048576          3184033 ns    7770271 ns 
BM_StableSort_int16_PipeOrgan_1                       3.75 ns       3.76 ns 
BM_StableSort_int16_PipeOrgan_4                       5.94 ns       7.74 ns 
BM_StableSort_int16_PipeOrgan_16                      26.7 ns       25.9 ns 
BM_StableSort_int16_PipeOrgan_64                       300 ns        263 ns 
BM_StableSort_int16_PipeOrgan_256                     2769 ns       2760 ns 
BM_StableSort_int16_PipeOrgan_1024                    2996 ns      10544 ns 
BM_StableSort_int16_PipeOrgan_4096                   11641 ns      44750 ns 
BM_StableSort_int16_PipeOrgan_16384                  57224 ns     200464 ns 
BM_StableSort_int16_PipeOrgan_65536                 416873 ns     887631 ns 
BM_StableSort_int16_PipeOrgan_262144                843264 ns    3588669 ns 
BM_StableSort_int16_PipeOrgan_524288               2027741 ns   11056924 ns 
BM_StableSort_int16_PipeOrgan_1048576              4223773 ns   13261276 ns 
BM_StableSort_int16_QuickSortAdversary_1              3.83 ns       3.68 ns 
BM_StableSort_int16_QuickSortAdversary_4              5.55 ns       6.93 ns 
BM_StableSort_int16_QuickSortAdversary_16             12.3 ns       15.2 ns 
BM_StableSort_int16_QuickSortAdversary_64              646 ns        632 ns 
BM_StableSort_int16_QuickSortAdversary_256            2751 ns       2542 ns 
BM_StableSort_int16_QuickSortAdversary_1024           3028 ns      16901 ns 
BM_StableSort_int16_QuickSortAdversary_4096          10862 ns      80222 ns 
BM_StableSort_int16_QuickSortAdversary_16384         57753 ns     317281 ns 
BM_StableSort_int16_QuickSortAdversary_65536         94064 ns     328502 ns 
BM_StableSort_int16_QuickSortAdversary_262144       557796 ns    1613208 ns 
BM_StableSort_int16_QuickSortAdversary_524288      1518451 ns    3479740 ns 
BM_StableSort_int16_QuickSortAdversary_1048576     3165129 ns    7655880 ns 
BM_StableSort_uint16_Random_1                         3.26 ns       3.44 ns 
BM_StableSort_uint16_Random_4                         21.1 ns       22.2 ns 
BM_StableSort_uint16_Random_16                         157 ns        156 ns 
BM_StableSort_uint16_Random_64                         955 ns        947 ns 
BM_StableSort_uint16_Random_256                       5886 ns       6097 ns 
BM_StableSort_uint16_Random_1024                      2787 ns      30776 ns 
BM_StableSort_uint16_Random_4096                      9973 ns     155652 ns 
BM_StableSort_uint16_Random_16384                    48628 ns     741072 ns 
BM_StableSort_uint16_Random_65536                   439609 ns    3478966 ns 
BM_StableSort_uint16_Random_262144                 2336983 ns   15197642 ns 
BM_StableSort_uint16_Random_524288                 7888701 ns   34234254 ns 
BM_StableSort_uint16_Random_1048576               14865180 ns   68516386 ns 
BM_StableSort_uint16_Ascending_1                      3.33 ns       4.00 ns 
BM_StableSort_uint16_Ascending_4                      5.79 ns       6.64 ns 
BM_StableSort_uint16_Ascending_16                     14.9 ns       15.5 ns 
BM_StableSort_uint16_Ascending_64                     50.2 ns       52.5 ns 
BM_StableSort_uint16_Ascending_256                     538 ns        546 ns 
BM_StableSort_uint16_Ascending_1024                   1645 ns       2652 ns 
BM_StableSort_uint16_Ascending_4096                   5559 ns      14517 ns 
BM_StableSort_uint16_Ascending_16384                 22803 ns      70275 ns 
BM_StableSort_uint16_Ascending_65536                 83109 ns     333446 ns 
BM_StableSort_uint16_Ascending_262144               562667 ns    1568670 ns 
BM_StableSort_uint16_Ascending_524288              1564646 ns    3059839 ns 
BM_StableSort_uint16_Ascending_1048576             3178826 ns    7048327 ns 
BM_StableSort_uint16_Descending_1                     3.34 ns       3.93 ns 
BM_StableSort_uint16_Descending_4                     8.75 ns       9.73 ns 
BM_StableSort_uint16_Descending_16                    55.9 ns       55.5 ns 
BM_StableSort_uint16_Descending_64                    1021 ns       1035 ns 
BM_StableSort_uint16_Descending_256                   4752 ns       4931 ns 
BM_StableSort_uint16_Descending_1024                  2982 ns      19727 ns 
BM_StableSort_uint16_Descending_4096                 10432 ns      83165 ns 
BM_StableSort_uint16_Descending_16384                56593 ns     326131 ns 
BM_StableSort_uint16_Descending_65536               439134 ns    1371346 ns 
BM_StableSort_uint16_Descending_262144             1220925 ns    5735665 ns 
BM_StableSort_uint16_Descending_524288             2767234 ns   16758330 ns 
BM_StableSort_uint16_Descending_1048576            5673769 ns   17541715 ns 
BM_StableSort_uint16_SingleElement_1                  3.53 ns       3.73 ns 
BM_StableSort_uint16_SingleElement_4                  6.27 ns       5.81 ns 
BM_StableSort_uint16_SingleElement_16                 14.8 ns       15.1 ns 
BM_StableSort_uint16_SingleElement_64                 51.5 ns       50.9 ns 
BM_StableSort_uint16_SingleElement_256                 536 ns        540 ns 
BM_StableSort_uint16_SingleElement_1024               1669 ns       2690 ns 
BM_StableSort_uint16_SingleElement_4096               5840 ns      14230 ns 
BM_StableSort_uint16_SingleElement_16384             22468 ns      68524 ns 
BM_StableSort_uint16_SingleElement_65536             89845 ns     332187 ns 
BM_StableSort_uint16_SingleElement_262144           590736 ns    1550868 ns 
BM_StableSort_uint16_SingleElement_524288          1573677 ns    3095703 ns 
BM_StableSort_uint16_SingleElement_1048576         3183421 ns    8251180 ns 
BM_StableSort_uint16_PipeOrgan_1                      3.70 ns       3.64 ns 
BM_StableSort_uint16_PipeOrgan_4                      7.01 ns       6.81 ns 
BM_StableSort_uint16_PipeOrgan_16                     25.7 ns       26.4 ns 
BM_StableSort_uint16_PipeOrgan_64                      283 ns        277 ns 
BM_StableSort_uint16_PipeOrgan_256                    2562 ns       2852 ns 
BM_StableSort_uint16_PipeOrgan_1024                   2863 ns      10892 ns 
BM_StableSort_uint16_PipeOrgan_4096                  10585 ns      45668 ns 
BM_StableSort_uint16_PipeOrgan_16384                 59151 ns     194358 ns 
BM_StableSort_uint16_PipeOrgan_65536                508579 ns     854692 ns 
BM_StableSort_uint16_PipeOrgan_262144               901294 ns    3606346 ns 
BM_StableSort_uint16_PipeOrgan_524288              2192498 ns   10449279 ns 
BM_StableSort_uint16_PipeOrgan_1048576             4204368 ns   11956606 ns 
BM_StableSort_uint16_QuickSortAdversary_1             3.20 ns       3.63 ns 
BM_StableSort_uint16_QuickSortAdversary_4             5.30 ns       6.38 ns 
BM_StableSort_uint16_QuickSortAdversary_16            14.5 ns       15.3 ns 
BM_StableSort_uint16_QuickSortAdversary_64             575 ns        611 ns 
BM_StableSort_uint16_QuickSortAdversary_256           2423 ns       2577 ns 
BM_StableSort_uint16_QuickSortAdversary_1024          2794 ns      16854 ns 
BM_StableSort_uint16_QuickSortAdversary_4096         10511 ns      75952 ns 
BM_StableSort_uint16_QuickSortAdversary_16384        56214 ns     333824 ns 
BM_StableSort_uint16_QuickSortAdversary_65536       422512 ns    1354867 ns 
BM_StableSort_uint16_QuickSortAdversary_262144      583301 ns    1564443 ns 
BM_StableSort_uint16_QuickSortAdversary_524288     1584319 ns    3265575 ns 
BM_StableSort_uint16_QuickSortAdversary_1048576    3197732 ns    7945245 ns 
BM_StableSort_int32_Random_1                          3.81 ns       3.70 ns 
BM_StableSort_int32_Random_4                          20.8 ns       23.4 ns 
BM_StableSort_int32_Random_16                          134 ns        161 ns 
BM_StableSort_int32_Random_64                          895 ns        984 ns 
BM_StableSort_int32_Random_256                        5640 ns       5897 ns 
BM_StableSort_int32_Random_1024                       6994 ns      32118 ns 
BM_StableSort_int32_Random_4096                      27367 ns     168960 ns 
BM_StableSort_int32_Random_16384                    183261 ns     843240 ns 
BM_StableSort_int32_Random_65536                    950914 ns    3953588 ns 
BM_StableSort_int32_Random_262144                  3673311 ns   16790171 ns 
BM_StableSort_int32_Random_524288                 11515700 ns   36023098 ns 
BM_StableSort_int32_Random_1048576                24492515 ns   78116028 ns 
BM_StableSort_int32_Ascending_1                       3.31 ns       4.48 ns 
BM_StableSort_int32_Ascending_4                       5.96 ns       6.99 ns 
BM_StableSort_int32_Ascending_16                      13.0 ns       16.0 ns 
BM_StableSort_int32_Ascending_64                      36.7 ns       53.0 ns 
BM_StableSort_int32_Ascending_256                      391 ns        471 ns 
BM_StableSort_int32_Ascending_1024                    2705 ns       2682 ns 
BM_StableSort_int32_Ascending_4096                    8773 ns      14231 ns 
BM_StableSort_int32_Ascending_16384                  34709 ns      70625 ns 
BM_StableSort_int32_Ascending_65536                 142907 ns     344482 ns 
BM_StableSort_int32_Ascending_262144                745483 ns    1591418 ns 
BM_StableSort_int32_Ascending_524288               1873701 ns    3190305 ns 
BM_StableSort_int32_Ascending_1048576              3851590 ns    7570095 ns 
BM_StableSort_int32_Descending_1                      3.22 ns       4.23 ns 
BM_StableSort_int32_Descending_4                      7.58 ns       11.2 ns 
BM_StableSort_int32_Descending_16                     63.9 ns       58.6 ns 
BM_StableSort_int32_Descending_64                     1133 ns       1017 ns 
BM_StableSort_int32_Descending_256                    4850 ns       4464 ns 
BM_StableSort_int32_Descending_1024                   7023 ns      18954 ns 
BM_StableSort_int32_Descending_4096                  28550 ns      75163 ns 
BM_StableSort_int32_Descending_16384                200880 ns     341104 ns 
BM_StableSort_int32_Descending_65536               1095910 ns    1398021 ns 
BM_StableSort_int32_Descending_262144              3818864 ns    5695486 ns 
BM_StableSort_int32_Descending_524288              5606779 ns   17593982 ns 
BM_StableSort_int32_Descending_1048576            16416366 ns   26649503 ns 
BM_StableSort_int32_SingleElement_1                   3.81 ns       3.71 ns 
BM_StableSort_int32_SingleElement_4                   6.57 ns       6.61 ns 
BM_StableSort_int32_SingleElement_16                  14.0 ns       15.8 ns 
BM_StableSort_int32_SingleElement_64                  38.7 ns       53.5 ns 
BM_StableSort_int32_SingleElement_256                  386 ns        554 ns 
BM_StableSort_int32_SingleElement_1024                2761 ns       3046 ns 
BM_StableSort_int32_SingleElement_4096                9179 ns      15188 ns 
BM_StableSort_int32_SingleElement_16384              34794 ns      70119 ns 
BM_StableSort_int32_SingleElement_65536             135190 ns     354755 ns 
BM_StableSort_int32_SingleElement_262144            760995 ns    1644072 ns 
BM_StableSort_int32_SingleElement_524288           1969575 ns    3343419 ns 
BM_StableSort_int32_SingleElement_1048576          4423816 ns    8346971 ns 
BM_StableSort_int32_PipeOrgan_1                       3.79 ns       3.63 ns 
BM_StableSort_int32_PipeOrgan_4                       6.21 ns       6.73 ns 
BM_StableSort_int32_PipeOrgan_16                      27.5 ns       26.0 ns 
BM_StableSort_int32_PipeOrgan_64                       291 ns        265 ns 
BM_StableSort_int32_PipeOrgan_256                     2557 ns       2518 ns 
BM_StableSort_int32_PipeOrgan_1024                    6765 ns      10976 ns 
BM_StableSort_int32_PipeOrgan_4096                   26373 ns      44537 ns 
BM_StableSort_int32_PipeOrgan_16384                 201466 ns     188582 ns 
BM_StableSort_int32_PipeOrgan_65536                1148533 ns     802368 ns 
BM_StableSort_int32_PipeOrgan_262144               2255177 ns    3477829 ns 
BM_StableSort_int32_PipeOrgan_524288               3947015 ns   10356637 ns 
BM_StableSort_int32_PipeOrgan_1048576             10274312 ns   16405366 ns 
BM_StableSort_int32_QuickSortAdversary_1              3.32 ns       4.36 ns 
BM_StableSort_int32_QuickSortAdversary_4              5.98 ns       7.44 ns 
BM_StableSort_int32_QuickSortAdversary_16             13.0 ns       16.3 ns 
BM_StableSort_int32_QuickSortAdversary_64              657 ns        616 ns 
BM_StableSort_int32_QuickSortAdversary_256            2569 ns       2483 ns 
BM_StableSort_int32_QuickSortAdversary_1024           6898 ns      19635 ns 
BM_StableSort_int32_QuickSortAdversary_4096          27092 ns      75108 ns 
BM_StableSort_int32_QuickSortAdversary_16384        190379 ns     316463 ns 
BM_StableSort_int32_QuickSortAdversary_65536       1109040 ns    1319018 ns 
BM_StableSort_int32_QuickSortAdversary_262144      4361925 ns    5472779 ns 
BM_StableSort_int32_QuickSortAdversary_524288      6528215 ns   17538983 ns 
BM_StableSort_int32_QuickSortAdversary_1048576    18345325 ns   27223926 ns 
BM_StableSort_uint32_Random_1                         3.67 ns       3.82 ns 
BM_StableSort_uint32_Random_4                         22.3 ns       21.8 ns 
BM_StableSort_uint32_Random_16                         155 ns        153 ns 
BM_StableSort_uint32_Random_64                         946 ns        976 ns 
BM_StableSort_uint32_Random_256                       5824 ns       6019 ns 
BM_StableSort_uint32_Random_1024                      4525 ns      32764 ns 
BM_StableSort_uint32_Random_4096                     17223 ns     158608 ns 
BM_StableSort_uint32_Random_16384                   134821 ns     748525 ns 
BM_StableSort_uint32_Random_65536                   716644 ns    3453325 ns 
BM_StableSort_uint32_Random_262144                 3628062 ns   16065414 ns 
BM_StableSort_uint32_Random_524288                10971334 ns   36567712 ns 
BM_StableSort_uint32_Random_1048576               22688377 ns   77533497 ns 
BM_StableSort_uint32_Ascending_1                      3.57 ns       3.44 ns 
BM_StableSort_uint32_Ascending_4                      5.73 ns       5.33 ns 
BM_StableSort_uint32_Ascending_16                     14.5 ns       14.0 ns 
BM_StableSort_uint32_Ascending_64                     50.3 ns       51.3 ns 
BM_StableSort_uint32_Ascending_256                     465 ns        467 ns 
BM_StableSort_uint32_Ascending_1024                   3042 ns       2530 ns 
BM_StableSort_uint32_Ascending_4096                   9842 ns      12207 ns 
BM_StableSort_uint32_Ascending_16384                 37994 ns      61726 ns 
BM_StableSort_uint32_Ascending_65536                148890 ns     294385 ns 
BM_StableSort_uint32_Ascending_262144               855080 ns    1422167 ns 
BM_StableSort_uint32_Ascending_524288              2154903 ns    3203018 ns 
BM_StableSort_uint32_Ascending_1048576             5002518 ns    7563817 ns 
BM_StableSort_uint32_Descending_1                     3.51 ns       3.40 ns 
BM_StableSort_uint32_Descending_4                     9.09 ns       7.95 ns 
BM_StableSort_uint32_Descending_16                    54.8 ns       74.4 ns 
BM_StableSort_uint32_Descending_64                    1003 ns       1305 ns 
BM_StableSort_uint32_Descending_256                   4545 ns       5300 ns 
BM_StableSort_uint32_Descending_1024                  4361 ns      21884 ns 
BM_StableSort_uint32_Descending_4096                 16018 ns      90534 ns 
BM_StableSort_uint32_Descending_16384               146274 ns     381943 ns 
BM_StableSort_uint32_Descending_65536               938248 ns    1536806 ns 
BM_StableSort_uint32_Descending_262144             3899300 ns    6387843 ns 
BM_StableSort_uint32_Descending_524288             5808157 ns   21959858 ns 
BM_StableSort_uint32_Descending_1048576           17520047 ns   26351912 ns 
BM_StableSort_uint32_SingleElement_1                  4.03 ns       3.97 ns 
BM_StableSort_uint32_SingleElement_4                  6.55 ns       6.41 ns 
BM_StableSort_uint32_SingleElement_16                 15.6 ns       15.8 ns 
BM_StableSort_uint32_SingleElement_64                 52.3 ns       58.7 ns 
BM_StableSort_uint32_SingleElement_256                 473 ns        485 ns 
BM_StableSort_uint32_SingleElement_1024               3020 ns       2407 ns 
BM_StableSort_uint32_SingleElement_4096               9998 ns      12527 ns 
BM_StableSort_uint32_SingleElement_16384             38072 ns      62228 ns 
BM_StableSort_uint32_SingleElement_65536            153706 ns     295662 ns 
BM_StableSort_uint32_SingleElement_262144           836532 ns    1477099 ns 
BM_StableSort_uint32_SingleElement_524288          2144900 ns    3157204 ns 
BM_StableSort_uint32_SingleElement_1048576         4995525 ns    7617233 ns 
BM_StableSort_uint32_PipeOrgan_1                      4.02 ns       3.99 ns 
BM_StableSort_uint32_PipeOrgan_4                      6.97 ns       6.84 ns 
BM_StableSort_uint32_PipeOrgan_16                     26.1 ns       29.7 ns 
BM_StableSort_uint32_PipeOrgan_64                      266 ns        333 ns 
BM_StableSort_uint32_PipeOrgan_256                    2462 ns       2892 ns 
BM_StableSort_uint32_PipeOrgan_1024                   4291 ns      12431 ns 
BM_StableSort_uint32_PipeOrgan_4096                  15638 ns      51449 ns 
BM_StableSort_uint32_PipeOrgan_16384                154563 ns     217460 ns 
BM_StableSort_uint32_PipeOrgan_65536                907724 ns     925873 ns 
BM_StableSort_uint32_PipeOrgan_262144              2394580 ns    4103575 ns 
BM_StableSort_uint32_PipeOrgan_524288              4177145 ns   13947158 ns 
BM_StableSort_uint32_PipeOrgan_1048576            11848224 ns   18807297 ns 
BM_StableSort_uint32_QuickSortAdversary_1             3.50 ns       3.43 ns 
BM_StableSort_uint32_QuickSortAdversary_4             5.88 ns       4.96 ns 
BM_StableSort_uint32_QuickSortAdversary_16            14.6 ns       14.0 ns 
BM_StableSort_uint32_QuickSortAdversary_64             576 ns        715 ns 
BM_StableSort_uint32_QuickSortAdversary_256           2353 ns       2797 ns 
BM_StableSort_uint32_QuickSortAdversary_1024          4176 ns      21775 ns 
BM_StableSort_uint32_QuickSortAdversary_4096         15565 ns      96188 ns 
BM_StableSort_uint32_QuickSortAdversary_16384       149092 ns     398332 ns 
BM_StableSort_uint32_QuickSortAdversary_65536       902488 ns    1552393 ns 
BM_StableSort_uint32_QuickSortAdversary_262144     3946517 ns    6560414 ns 
BM_StableSort_uint32_QuickSortAdversary_524288     6247114 ns   22420977 ns 
BM_StableSort_uint32_QuickSortAdversary_1048576   19892446 ns   26529576 ns 
BM_StableSort_int64_Random_1                          3.83 ns       3.98 ns 
BM_StableSort_int64_Random_4                          21.1 ns       24.0 ns 
BM_StableSort_int64_Random_16                          129 ns        136 ns 
BM_StableSort_int64_Random_64                          890 ns        906 ns 
BM_StableSort_int64_Random_256                        5542 ns       5901 ns 
BM_StableSort_int64_Random_1024                      16085 ns      33112 ns 
BM_StableSort_int64_Random_4096                      63895 ns     162181 ns 
BM_StableSort_int64_Random_16384                    348827 ns     790045 ns 
BM_StableSort_int64_Random_65536                   1488237 ns    3557506 ns 
BM_StableSort_int64_Random_262144                  8195713 ns   16315808 ns 
BM_StableSort_int64_Random_524288                 16586833 ns   38274075 ns 
BM_StableSort_int64_Random_1048576                40346644 ns   79182089 ns 
BM_StableSort_int64_Ascending_1                       3.76 ns       3.55 ns 
BM_StableSort_int64_Ascending_4                       5.82 ns       6.19 ns 
BM_StableSort_int64_Ascending_16                      11.7 ns       11.8 ns 
BM_StableSort_int64_Ascending_64                      32.9 ns       36.8 ns 
BM_StableSort_int64_Ascending_256                      415 ns        550 ns 
BM_StableSort_int64_Ascending_1024                    5352 ns       3347 ns 
BM_StableSort_int64_Ascending_4096                   17516 ns      19134 ns 
BM_StableSort_int64_Ascending_16384                  64147 ns      91099 ns 
BM_StableSort_int64_Ascending_65536                 322126 ns     434009 ns 
BM_StableSort_int64_Ascending_262144               1554669 ns    2057056 ns 
BM_StableSort_int64_Ascending_524288               3656527 ns    5016650 ns 
BM_StableSort_int64_Ascending_1048576             10469979 ns   12908613 ns 
BM_StableSort_int64_Descending_1                      4.09 ns       3.35 ns 
BM_StableSort_int64_Descending_4                      9.13 ns       8.01 ns 
BM_StableSort_int64_Descending_16                     76.8 ns       92.9 ns 
BM_StableSort_int64_Descending_64                     1336 ns       1417 ns 
BM_StableSort_int64_Descending_256                    5525 ns       5674 ns 
BM_StableSort_int64_Descending_1024                  17461 ns      22558 ns 
BM_StableSort_int64_Descending_4096                  64285 ns     102360 ns 
BM_StableSort_int64_Descending_16384                336946 ns     388940 ns 
BM_StableSort_int64_Descending_65536                837912 ns    1662169 ns 
BM_StableSort_int64_Descending_262144              3680806 ns    7494323 ns 
BM_StableSort_int64_Descending_524288             11023784 ns   24935033 ns 
BM_StableSort_int64_Descending_1048576            20023568 ns   33220712 ns 
BM_StableSort_int64_SingleElement_1                   3.37 ns       3.98 ns 
BM_StableSort_int64_SingleElement_4                   5.32 ns       6.92 ns 
BM_StableSort_int64_SingleElement_16                  10.9 ns       13.3 ns 
BM_StableSort_int64_SingleElement_64                  32.1 ns       43.8 ns 
BM_StableSort_int64_SingleElement_256                  420 ns        541 ns 
BM_StableSort_int64_SingleElement_1024                5689 ns       3381 ns 
BM_StableSort_int64_SingleElement_4096               19199 ns      17989 ns 
BM_StableSort_int64_SingleElement_16384              75754 ns      91963 ns 
BM_StableSort_int64_SingleElement_65536             357106 ns     500326 ns 
BM_StableSort_int64_SingleElement_262144           1672975 ns    2417734 ns 
BM_StableSort_int64_SingleElement_524288           3642891 ns    5200878 ns 
BM_StableSort_int64_SingleElement_1048576         11172007 ns   13729511 ns 
BM_StableSort_int64_PipeOrgan_1                       3.38 ns       3.94 ns 
BM_StableSort_int64_PipeOrgan_4                       5.73 ns       6.44 ns 
BM_StableSort_int64_PipeOrgan_16                      27.5 ns       29.0 ns 
BM_StableSort_int64_PipeOrgan_64                       310 ns        321 ns 
BM_StableSort_int64_PipeOrgan_256                     2761 ns       2918 ns 
BM_StableSort_int64_PipeOrgan_1024                   16105 ns      12525 ns 
BM_StableSort_int64_PipeOrgan_4096                   65289 ns      59990 ns 
BM_StableSort_int64_PipeOrgan_16384                 341757 ns     270636 ns 
BM_StableSort_int64_PipeOrgan_65536                 587452 ns    1126132 ns 
BM_StableSort_int64_PipeOrgan_262144               2837955 ns    5034180 ns 
BM_StableSort_int64_PipeOrgan_524288               6617313 ns   15267354 ns 
BM_StableSort_int64_PipeOrgan_1048576             15208796 ns   23162989 ns 
BM_StableSort_int64_QuickSortAdversary_1              3.77 ns       3.45 ns 
BM_StableSort_int64_QuickSortAdversary_4              5.55 ns       5.20 ns 
BM_StableSort_int64_QuickSortAdversary_16             12.5 ns       11.5 ns 
BM_StableSort_int64_QuickSortAdversary_64              646 ns        750 ns 
BM_StableSort_int64_QuickSortAdversary_256            2655 ns       3539 ns 
BM_StableSort_int64_QuickSortAdversary_1024          16373 ns      22349 ns 
BM_StableSort_int64_QuickSortAdversary_4096          62306 ns      97248 ns 
BM_StableSort_int64_QuickSortAdversary_16384        321755 ns     388084 ns 
BM_StableSort_int64_QuickSortAdversary_65536       1374694 ns    1596091 ns 
BM_StableSort_int64_QuickSortAdversary_262144      4374661 ns    6894139 ns 
BM_StableSort_int64_QuickSortAdversary_524288     12736074 ns   23932229 ns 
BM_StableSort_int64_QuickSortAdversary_1048576    22615219 ns   33355629 ns 
BM_StableSort_uint64_Random_1                         3.82 ns       3.49 ns 
BM_StableSort_uint64_Random_4                         22.4 ns       23.4 ns 
BM_StableSort_uint64_Random_16                         154 ns        146 ns 
BM_StableSort_uint64_Random_64                         924 ns        926 ns 
BM_StableSort_uint64_Random_256                       5864 ns       5913 ns 
BM_StableSort_uint64_Random_1024                      7168 ns      31746 ns 
BM_StableSort_uint64_Random_4096                     27668 ns     154224 ns 
BM_StableSort_uint64_Random_16384                   219526 ns     755205 ns 
BM_StableSort_uint64_Random_65536                   965251 ns    3490165 ns 
BM_StableSort_uint64_Random_262144                 6262162 ns   15889589 ns 
BM_StableSort_uint64_Random_524288                12530078 ns   36458581 ns 
BM_StableSort_uint64_Random_1048576               38462191 ns   75168445 ns 
BM_StableSort_uint64_Ascending_1                      3.30 ns       3.35 ns 
BM_StableSort_uint64_Ascending_4                      5.65 ns       5.84 ns 
BM_StableSort_uint64_Ascending_16                     14.7 ns       12.6 ns 
BM_StableSort_uint64_Ascending_64                     55.3 ns       34.6 ns 
BM_StableSort_uint64_Ascending_256                     513 ns        533 ns 
BM_StableSort_uint64_Ascending_1024                   5541 ns       3189 ns 
BM_StableSort_uint64_Ascending_4096                  17706 ns      20326 ns 
BM_StableSort_uint64_Ascending_16384                 66420 ns      93757 ns 
BM_StableSort_uint64_Ascending_65536                341425 ns     435016 ns 
BM_StableSort_uint64_Ascending_262144              1595691 ns    2088317 ns 
BM_StableSort_uint64_Ascending_524288              3808703 ns    5092832 ns 
BM_StableSort_uint64_Ascending_1048576            11060417 ns   13023250 ns 
BM_StableSort_uint64_Descending_1                     3.29 ns       3.35 ns 
BM_StableSort_uint64_Descending_4                     8.65 ns       7.92 ns 
BM_StableSort_uint64_Descending_16                    54.7 ns       80.2 ns 
BM_StableSort_uint64_Descending_64                    1028 ns       1307 ns 
BM_StableSort_uint64_Descending_256                   4521 ns       5635 ns 
BM_StableSort_uint64_Descending_1024                  7122 ns      23323 ns 
BM_StableSort_uint64_Descending_4096                 30538 ns      95892 ns 
BM_StableSort_uint64_Descending_16384               195565 ns     392721 ns 
BM_StableSort_uint64_Descending_65536               852002 ns    1720358 ns 
BM_StableSort_uint64_Descending_262144             3737884 ns    7484130 ns 
BM_StableSort_uint64_Descending_524288            11159345 ns   25690770 ns 
BM_StableSort_uint64_Descending_1048576           20648864 ns   33057383 ns 
BM_StableSort_uint64_SingleElement_1                  3.62 ns       4.10 ns 
BM_StableSort_uint64_SingleElement_4                  6.73 ns       6.64 ns 
BM_StableSort_uint64_SingleElement_16                 14.9 ns       11.3 ns 
BM_StableSort_uint64_SingleElement_64                 52.0 ns       33.0 ns 
BM_StableSort_uint64_SingleElement_256                 511 ns        582 ns 
BM_StableSort_uint64_SingleElement_1024               6499 ns       3287 ns 
BM_StableSort_uint64_SingleElement_4096              22190 ns      17616 ns 
BM_StableSort_uint64_SingleElement_16384             84378 ns      86885 ns 
BM_StableSort_uint64_SingleElement_65536            466257 ns     457144 ns 
BM_StableSort_uint64_SingleElement_262144          1993687 ns    2361999 ns 
BM_StableSort_uint64_SingleElement_524288          4759565 ns    5096771 ns 
BM_StableSort_uint64_SingleElement_1048576        12426111 ns   13468453 ns 
BM_StableSort_uint64_PipeOrgan_1                      3.73 ns       3.94 ns 
BM_StableSort_uint64_PipeOrgan_4                      7.18 ns       7.54 ns 
BM_StableSort_uint64_PipeOrgan_16                     25.2 ns       29.1 ns 
BM_StableSort_uint64_PipeOrgan_64                      260 ns        321 ns 
BM_StableSort_uint64_PipeOrgan_256                    2468 ns       2970 ns 
BM_StableSort_uint64_PipeOrgan_1024                   7025 ns      12912 ns 
BM_StableSort_uint64_PipeOrgan_4096                  28968 ns      53379 ns 
BM_StableSort_uint64_PipeOrgan_16384                194156 ns     239790 ns 
BM_StableSort_uint64_PipeOrgan_65536                599491 ns     993800 ns 
BM_StableSort_uint64_PipeOrgan_262144              2648585 ns    4689680 ns 
BM_StableSort_uint64_PipeOrgan_524288              7621109 ns   15401808 ns 
BM_StableSort_uint64_PipeOrgan_1048576            15608814 ns   23484821 ns 
BM_StableSort_uint64_QuickSortAdversary_1             3.38 ns       3.54 ns 
BM_StableSort_uint64_QuickSortAdversary_4             5.50 ns       6.03 ns 
BM_StableSort_uint64_QuickSortAdversary_16            14.2 ns       11.0 ns 
BM_StableSort_uint64_QuickSortAdversary_64             597 ns        688 ns 
BM_StableSort_uint64_QuickSortAdversary_256           2446 ns       2818 ns 
BM_StableSort_uint64_QuickSortAdversary_1024          7266 ns      20319 ns 
BM_StableSort_uint64_QuickSortAdversary_4096         31155 ns      89112 ns 
BM_StableSort_uint64_QuickSortAdversary_16384       201033 ns     390574 ns 
BM_StableSort_uint64_QuickSortAdversary_65536       871014 ns    1685639 ns 
BM_StableSort_uint64_QuickSortAdversary_262144     3978535 ns    7265830 ns 
BM_StableSort_uint64_QuickSortAdversary_524288    10279721 ns   25350004 ns 
BM_StableSort_uint64_QuickSortAdversary_1048576   20256585 ns   33054393 ns 
```
2025-01-09 19:02:35 +01:00
Vitaly Buka
570f03096a
Revert "Reapply "[libc++] Explicitly convert to masks in SIMD code (#107983)"" (#122022)
Reverts llvm/llvm-project#121352

Triggers "vector type should not be a bool!" on:
```
  bool a[100];
  bool b[100];
  auto t = std::mismatch(std::begin(a), std::end(a), std::begin(b), std::end(b));
```

https://godbolt.org/z/Y73s3sdef
2025-01-08 18:14:39 +01:00
Nikolas Klauser
f69585235e
[libc++] Put _LIBCPP_NODEBUG on all internal aliases (#118710)
This significantly reduces the amount of debug information generated
for codebases using libc++, without hurting the debugging experience.
2025-01-08 11:12:59 -05:00
Vitaly Buka
ed572f2003
Reapply "[libc++] Explicitly convert to masks in SIMD code (#107983)" (#121352)
This reverts commit 0ea40bf02138c02e7680ce6fa8169502f2a8bd42.

Passes with https://github.com/llvm/llvm-project/issues/121365 fix: 
https://lab.llvm.org/buildbot/#/builders/55/builds/4930
2025-01-01 11:30:35 +01:00
Nikolas Klauser
b905bcc509
[libc++] Remove some unused includes (#120219) 2024-12-18 21:10:27 +01:00
Nikolas Klauser
59890c1334
[libc++] Granularize <new> includes (#119964) 2024-12-17 11:29:16 +01:00
Nikolas Klauser
a2042521a0
[libc++] Remove _AlgPolicy from std::copy and algorithms using std::copy (#115887)
`std::copy` doesn't use the `_AlgPolicy` for anything other than calling
itself with it, so we can just remove the argument. This also removes
the need in a few other algorithms which had an `_AlgPolicy` argument
only to call `copy`.
2024-11-12 23:03:52 +01:00
Nikolas Klauser
5b67372aec [libc++] Remove a few unused includes from <__algorithm/find_end.h> 2024-11-12 22:11:15 +01:00
Nikolas Klauser
eab7be5d42
[libc++] Forward more algorithms to the classic algorithms (#114674)
This partially addresses #105687.
2024-11-06 12:10:06 +01:00
Nikolas Klauser
c6f3b7bcd0
[libc++] Refactor the configuration macros to being always defined (#112094)
This is a follow-up to #89178. This updates the `<__config_site>`
macros.
2024-11-06 10:39:19 +01:00
Nikolas Klauser
e99c4906e4
[libc++] Granularize <cstddef> includes (#108696) 2024-10-31 02:20:10 +01:00
Nikolas Klauser
0fb76bae6b
Reapply "[libc++] Simplify the implementation of std::sort a bit (#104902)" (#114023)
This reverts commit ef44e4659878f2. The patch was originally reverted
because it was
deemed to introduce a performance regression for small inputs, however
it also fixed
a previous performance regression for larger inputs. So overall, this
patch is desirable.
2024-10-30 11:51:55 +01:00
Louis Dionne
8e6bba230e [libc++][NFC] Rename fold.h to ranges_fold.h (#109696)
This follows the pattern we use consistently for ranges algorithms.

This is a re-application of 24bc3244d4e which had been reverted in
f11abac65 due to unrelated failures.
2024-09-30 08:30:16 -04:00
Chris B
f11abac652
Revert "[libc++][modules] Rewrite the modulemap to have fewer top-level modules (#107638)" (#110384)
This reverts 3 commits:
45a09d1811d5d6597385ef02ecf2d4b7320c37c5
24bc3244d4e221f4e6740f45e2bf15a1441a3076
bc6bd3bc1e99c7ec9e22dff23b4f4373fa02cae3

The GitHub pre-merge CI has been broken since this PR went in. This
change reverts it to see if I can get the pre-merge CI working again.
2024-09-28 21:47:09 -05:00
Louis Dionne
24bc3244d4
[libc++][NFC] Rename fold.h to ranges_fold.h (#109696)
This follows the pattern we use consistently for ranges algorithms.
2024-09-27 01:02:21 -04:00
Louis Dionne
ef44e46598 Revert "[libc++] Simplify the implementation of std::sort a bit (#104902)"
This reverts commit d4ffccfce103b01401b8a9222e373f2d404f8439, which
caused a performance regression that needs to be investigated further.
2024-09-20 09:48:58 -04:00
Thurston Dang
0ea40bf021 Revert "[libc++] Explicitly convert to masks in SIMD code (#107983)"
This reverts commit 1603f99a37c5b179a21dbb8000c39a471a950927.

Reason: buildbot breakage e.g., https://lab.llvm.org/buildbot/#/builders/55/builds/2061
  llvm-libc++-shared.cfg.in :: std/algorithms/alg.nonmodifying/alg.starts_with/ranges.starts_with.pass.cpp
  llvm-libc++-shared.cfg.in :: std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp
  llvm-libc++-shared.cfg.in :: std/algorithms/alg.nonmodifying/mismatch/ranges_mismatch.pass.cpp
  ...

(Buildbot re-run passed with the previous revision, 1fc288bf481726393c73133eef9aa73c0f78312e)
2024-09-17 21:52:33 +00:00
Nikolas Klauser
1603f99a37
[libc++] Explicitly convert to masks in SIMD code (#107983)
This makes it clearer when we use masks and avoids MSan complaining.
2024-09-17 12:04:54 +02:00