251 Commits

Author SHA1 Message Date
Mark de Wever
7f65377880
[libc++][format][2/3] Optimizes c-string arguments. (#101805)
The formatter specializations for _CharT* and const _CharT* typically
write all elements in a loop. However, the format library's internal
functions are optimized for larger writes.

Instead of writing one element at a time, convert the range to a
basic_string_view and write that instead.

For a C string of 6 characters this is a bit slower (53 vs. 55 in
BM_format), but for 60 characters it's more than twice as fast (169 vs.
74). The improvements for back_inserter<std::list<_CharT>> are not as
great as the others; it only reaches the speed of
basic_string_view<_CharT>.
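
The difference between the two strategies can be sketched as follows.
This is a minimal illustration and not libc++'s implementation:
`counting_sink`, `write_per_char`, and `write_as_view` are hypothetical
names, and the sink only counts its calls to stand in for the fixed
overhead of each write.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sink: appends a chunk and counts how many writes were issued.
struct counting_sink {
  std::string data;
  std::size_t calls = 0;

  void write(std::string_view chunk) {
    data.append(chunk);
    ++calls;
  }
};

// One write per element, as in a character-by-character loop.
void write_per_char(counting_sink& out, const char* str) {
  for (; *str != '\0'; ++str)
    out.write(std::string_view(str, 1));
}

// Measure the length once, then issue a single bulk write.
void write_as_view(counting_sink& out, const char* str) {
  out.write(std::string_view(str));
}

// Both strategies produce the same output with a different number of writes.
void example() {
  counting_sink per_char, bulk;
  write_per_char(per_char, "libc++");
  write_as_view(bulk, "libc++");
  assert(per_char.data == bulk.data);
  assert(per_char.calls == 6);
  assert(bulk.calls == 1);
}
```

The bulk variant still has to find the terminating null character first,
which may be why very short strings do not benefit.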

Comparing libcxx/test/benchmarks/write_string_comparison.bench.out-before to libcxx/test/benchmarks/write_string_comparison.bench.out-after
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_sprintf/C_string_len_6                                                         -0.0015         +0.0013             5             5             5             5
BM_format/C_string_len_6                                                          +0.0390         +0.0416            53            55            53            55
BM_format_to_back_inserter<std::string>/C_string_len_6                            +0.0381         +0.0408            53            55            53            55
BM_format_to_back_inserter<std::vector<char>>/C_string_len_6                      +0.0287         +0.0315            69            71            69            71
BM_format_to_back_inserter<std::deque<char>>/C_string_len_6                       +0.0503         +0.0530           123           129           123           129
BM_format_to_back_inserter<std::list<char>>/C_string_len_6                        -0.0241         -0.0213           133           130           133           130
BM_format_to_iterator/<std::array> C_string_len_6                                 -0.0075         -0.0049            45            45            45            45
BM_format_to_iterator/<std::string> C_string_len_6                                +0.0311         +0.0340            44            46            44            46
BM_format_to_iterator/<std::vector> C_string_len_6                                +0.0380         +0.0409            43            45            43            45
BM_format_to_iterator/<std::deque> C_string_len_6                                 +0.0366         +0.0392            48            50            48            50
BM_format/string_len_6                                                            -0.0010         -0.0007            56            55            55            55
BM_format_to_back_inserter<std::string>/string_len_6                              +0.0044         +0.0041            55            56            55            55
BM_format_to_back_inserter<std::vector<char>>/string_len_6                        +0.0128         +0.0128            70            71            70            71
BM_format_to_back_inserter<std::deque<char>>/string_len_6                         +0.0151         +0.0151           126           128           126           128
BM_format_to_back_inserter<std::list<char>>/string_len_6                          -0.0719         -0.0718           140           130           139           129
BM_format_to_iterator/<std::array> string_len_6                                   -0.0323         -0.0324            47            46            47            46
BM_format_to_iterator/<std::string> string_len_6                                  -0.0011         -0.0010            45            44            44            44
BM_format_to_iterator/<std::vector> string_len_6                                  -0.0002         -0.0001            45            45            44            44
BM_format_to_iterator/<std::deque> string_len_6                                   +0.0046         +0.0047            51            51            51            51
BM_format/string_view_len_6                                                       +0.0031         +0.0031            54            54            54            54
BM_format_to_back_inserter<std::string>/string_view_len_6                         +0.0041         +0.0040            54            54            54            54
BM_format_to_back_inserter<std::vector<char>>/string_view_len_6                   +0.0022         +0.0022            70            70            70            70
BM_format_to_back_inserter<std::deque<char>>/string_view_len_6                    +0.0392         +0.0391           124           129           124           129
BM_format_to_back_inserter<std::list<char>>/string_view_len_6                     -0.0680         -0.0680           139           129           138           129
BM_format_to_iterator/<std::array> string_view_len_6                              -0.0321         -0.0320            47            46            47            46
BM_format_to_iterator/<std::string> string_view_len_6                             -0.0013         -0.0011            45            44            44            44
BM_format_to_iterator/<std::vector> string_view_len_6                             -0.0024         -0.0023            45            44            44            44
BM_format_to_iterator/<std::deque> string_view_len_6                              +0.0057         +0.0057            51            51            51            51
BM_sprintf/C_string_len_60                                                        -0.0035         -0.0035             4             4             4             4
BM_format/C_string_len_60                                                         -0.5627         -0.5627           169            74           169            74
BM_format_to_back_inserter<std::string>/C_string_len_60                           -0.5642         -0.5641           170            74           169            74
BM_format_to_back_inserter<std::vector<char>>/C_string_len_60                     -0.5300         -0.5299           178            84           178            84
BM_format_to_back_inserter<std::deque<char>>/C_string_len_60                      -0.2548         -0.2548           356           265           355           264
BM_format_to_back_inserter<std::list<char>>/C_string_len_60                       -0.1013         -0.1013          1325          1191          1322          1188
BM_format_to_iterator/<std::array> C_string_len_60                                -0.6790         -0.6791           141            45           141            45
BM_format_to_iterator/<std::string> C_string_len_60                               -0.6738         -0.6740           143            47           142            46
BM_format_to_iterator/<std::vector> C_string_len_60                               -0.6807         -0.6808           142            45           142            45
BM_format_to_iterator/<std::deque> C_string_len_60                                -0.6488         -0.6486           144            51           144            51
BM_format/string_len_60                                                           +0.0118         +0.0117            73            74            73            73
BM_format_to_back_inserter<std::string>/string_len_60                             +0.0089         +0.0088            73            73            73            73
BM_format_to_back_inserter<std::vector<char>>/string_len_60                       +0.0080         +0.0081            83            84            83            83
BM_format_to_back_inserter<std::deque<char>>/string_len_60                        +0.0005         +0.0002           262           263           262           262
BM_format_to_back_inserter<std::list<char>>/string_len_60                         -0.0384         -0.0380          1236          1188          1232          1186
BM_format_to_iterator/<std::array> string_len_60                                  -0.0288         -0.0288            47            46            47            46
BM_format_to_iterator/<std::string> string_len_60                                 +0.0213         +0.0210            44            45            44            45
BM_format_to_iterator/<std::vector> string_len_60                                 +0.0202         +0.0205            45            45            44            45
BM_format_to_iterator/<std::deque> string_len_60                                  +0.0124         +0.0124            50            51            50            51
BM_format/string_view_len_60                                                      +0.0093         +0.0093            73            73            73            73
BM_format_to_back_inserter<std::string>/string_view_len_60                        +0.0055         +0.0055            73            73            73            73
BM_format_to_back_inserter<std::vector<char>>/string_view_len_60                  +0.0165         +0.0166            81            83            81            83
BM_format_to_back_inserter<std::deque<char>>/string_view_len_60                   +0.0138         +0.0140           260           263           259           263
BM_format_to_back_inserter<std::list<char>>/string_view_len_60                    -0.0334         -0.0335          1228          1187          1225          1184
BM_format_to_iterator/<std::array> string_view_len_60                             -0.0257         -0.0259            48            46            47            46
BM_format_to_iterator/<std::string> string_view_len_60                            +0.0324         +0.0323            45            46            44            46
BM_format_to_iterator/<std::vector> string_view_len_60                            +0.0174         +0.0177            45            45            44            45
BM_format_to_iterator/<std::deque> string_view_len_60                             +0.0076         +0.0076            50            51            50            51
BM_sprintf/C_string_len_6000                                                      +0.4922         +0.4921            77           115            77           114
BM_format/C_string_len_6000                                                       -0.9239         -0.9239         11780           897         11750           894
BM_format_to_back_inserter<std::string>/C_string_len_6000                         -0.9239         -0.9239         11792           898         11763           895
BM_format_to_back_inserter<std::vector<char>>/C_string_len_6000                   -0.9257         -0.9257         11709           870         11679           868
BM_format_to_back_inserter<std::deque<char>>/C_string_len_6000                    -0.4057         -0.4057         25616         15225         25553         15187
BM_format_to_back_inserter<std::list<char>>/C_string_len_6000                     -0.0832         -0.0833        127144        116569        126823        116265
BM_format_to_iterator/<std::array> C_string_len_6000                              -0.9853         -0.9853         10869           160         10843           160
BM_format_to_iterator/<std::string> C_string_len_6000                             -0.9864         -0.9864         10870           148         10841           148
BM_format_to_iterator/<std::vector> C_string_len_6000                             -0.9863         -0.9863         10874           149         10846           148
BM_format_to_iterator/<std::deque> C_string_len_6000                              -0.9629         -0.9629         11239           417         11212           416
BM_format/string_len_6000                                                         -0.0012         -0.0013           846           845           844           842
BM_format_to_back_inserter<std::string>/string_len_6000                           -0.0029         -0.0034           845           843           843           840
BM_format_to_back_inserter<std::vector<char>>/string_len_6000                     -0.0129         -0.0125           832           821           830           819
BM_format_to_back_inserter<std::deque<char>>/string_len_6000                      +0.0048         +0.0048         15042         15114         15004         15076
BM_format_to_back_inserter<std::list<char>>/string_len_6000                       -0.0017         -0.0017        116266        116072        115967        115768
BM_format_to_iterator/<std::array> string_len_6000                                -0.0257         -0.0256           120           117           120           117
BM_format_to_iterator/<std::string> string_len_6000                               -0.0025         -0.0029           117           117           117           117
BM_format_to_iterator/<std::vector> string_len_6000                               -0.0089         -0.0087           118           116           117           116
BM_format_to_iterator/<std::deque> string_len_6000                                -0.0478         -0.0477           379           361           378           360
BM_format/string_view_len_6000                                                    -0.0092         -0.0091           842           835           840           833
BM_format_to_back_inserter<std::string>/string_view_len_6000                      -0.0081         -0.0083           841           835           839           832
BM_format_to_back_inserter<std::vector<char>>/string_view_len_6000                +0.0089         +0.0088           808           815           806           813
BM_format_to_back_inserter<std::deque<char>>/string_view_len_6000                 +0.0068         +0.0068         15030         15131         14992         15093
BM_format_to_back_inserter<std::list<char>>/string_view_len_6000                  +0.0012         +0.0010        116099        116243        115813        115934
BM_format_to_iterator/<std::array> string_view_len_6000                           -0.0122         -0.0121           118           117           118           116
BM_format_to_iterator/<std::string> string_view_len_6000                          +0.0010         +0.0010           106           107           106           106
BM_format_to_iterator/<std::vector> string_view_len_6000                          -0.0008         -0.0006           106           106           106           106
BM_format_to_iterator/<std::deque> string_view_len_6000                           -0.0549         -0.0548           370           349           369           349
OVERALL_GEOMEAN
2024-10-06 21:20:22 +02:00
serge-sans-paille
0eb26021d2
[libc++] Remove potential 0-sized array in __compressed_pair_padding (#109028) 2024-10-01 11:58:25 -04:00
Louis Dionne
41145feb77
[libc++][modules] Rewrite the modulemap to have fewer top-level modules (#110501)
This is a re-application of bc6bd3bc1e9 which was reverted in
f11abac6524 because it broke the Clang pre-commit CI.

Original commit message:

This patch rewrites the modulemap to have fewer top-level modules.
Previously, our modulemap had one top level module for each header in
the library, including private headers. This had the well-known problem
of making compilation times terrible, in addition to being somewhat
against the design principles of Clang modules.

This patch provides almost an order of magnitude compilation time
improvement when building modularized code (certainly subject to
variations). For example, including <ccomplex> without a module cache
went from 22.4 seconds to 1.6 seconds, a 14x improvement.

To achieve this, one might be tempted to simply put all the headers in a
single top-level module. Unfortunately, this doesn't work because libc++
provides C compatibility headers (e.g. stdlib.h) which create cycles
when the C Standard Library headers are modularized too. This is
especially tricky since base systems are usually not modularized: as far
as I know, only Xcode 16 beta contains a modularized SDK that makes this
issue visible. To understand it, imagine we have the following setup:

   // in libc++'s include/c++/v1/module.modulemap
   module std {
      header "stddef.h"
      header "stdlib.h"
   }

   // in the C library's include/module.modulemap
   module clib {
      header "stddef.h"
      header "stdlib.h"
   }

Now, imagine that the C library's <stdlib.h> includes <stddef.h>,
perhaps as an implementation detail. When building the `std` module,
libc++'s <stdlib.h> header does `#include_next <stdlib.h>` to get the C
library's <stdlib.h>, so libc++ depends on the `clib` module.

However, remember that the C library's <stdlib.h> header includes
<stddef.h> as an implementation detail. Since the header search paths
for libc++ are (and must be) before the search paths for the C library,
the C library ends up including libc++'s <stddef.h>, which means it
depends on the `std` module. That's a cycle.

To solve this issue, this patch creates one top-level module for each C
compatibility header. The rest of the libc++ headers are located in a
single top-level `std` module, with two main exceptions. First, the
module containing configuration headers (e.g. <__config>) has its own
top-level module too, because those headers are included by the C
compatibility headers.

Second, we create a top-level std_core module that contains several
dependency-free utilities used (directly or indirectly) from the __math
subdirectory. This is needed because __math pulls in a bunch of stuff,
and __math is used from the C compatibility header <math.h>.

As a direct benefit of this change, we don't need to generate an
artificial __std_clang_module header anymore to provide a monolithic
`std` module, since our modulemap does it naturally by construction.

A next step after this change would be to look into whether math.h
really needs to include the contents of __math, and if so, whether
libc++'s math.h truly needs to include the C library's math.h header.
Removing either dependency would break this annoying cycle.

Thanks to Eric Fiselier for pointing out this approach during a recent
meeting. This wasn't viable before some recent refactoring, but wrapping
everything (except the C headers) in a large module is by far the
simplest and the most effective way of doing this.

Fixes #86193
2024-09-30 14:17:05 -04:00
Chris B
f11abac652
Revert "[libc++][modules] Rewrite the modulemap to have fewer top-level modules (#107638)" (#110384)
This reverts 3 commits:
45a09d1811d5d6597385ef02ecf2d4b7320c37c5
24bc3244d4e221f4e6740f45e2bf15a1441a3076
bc6bd3bc1e99c7ec9e22dff23b4f4373fa02cae3

The GitHub pre-merge CI has been broken since this PR went in. This
change reverts it to see if I can get the pre-merge CI working again.
2024-09-28 21:47:09 -05:00
Louis Dionne
bc6bd3bc1e
[libc++][modules] Rewrite the modulemap to have fewer top-level modules (#107638)
This patch rewrites the modulemap to have fewer top-level modules.
Previously, our modulemap had one top level module for each header in
the library, including private headers. This had the well-known problem
of making compilation times terrible, in addition to being somewhat
against the design principles of Clang modules.

This patch provides almost an order of magnitude compilation time
improvement when building modularized code (certainly subject to
variations). For example, including <ccomplex> without a module cache
went from 22.4 seconds to 1.6 seconds, a 14x improvement.

To achieve this, one might be tempted to simply put all the headers in a
single top-level module. Unfortunately, this doesn't work because libc++
provides C compatibility headers (e.g. stdlib.h) which create cycles
when the C Standard Library headers are modularized too. This is
especially tricky since base systems are usually not modularized: as far
as I know, only Xcode 16 beta contains a modularized SDK that makes this
issue visible. To understand it, imagine we have the following setup:

   // in libc++'s include/c++/v1/module.modulemap
   module std {
      header "stddef.h"
      header "stdlib.h"
   }

   // in the C library's include/module.modulemap
   module clib {
      header "stddef.h"
      header "stdlib.h"
   }

Now, imagine that the C library's <stdlib.h> includes <stddef.h>,
perhaps as an implementation detail. When building the `std` module,
libc++'s <stdlib.h> header does `#include_next <stdlib.h>` to get the C
library's <stdlib.h>, so libc++ depends on the `clib` module.

However, remember that the C library's <stdlib.h> header includes
<stddef.h> as an implementation detail. Since the header search paths
for libc++ are (and must be) before the search paths for the C library,
the C library ends up including libc++'s <stddef.h>, which means it
depends on the `std` module. That's a cycle.

To solve this issue, this patch creates one top-level module for each C
compatibility header. The rest of the libc++ headers are located in a
single top-level `std` module, with two main exceptions. First, the
module containing configuration headers (e.g. <__config>) has its own
top-level module too, because those headers are included by the C
compatibility headers.

Second, we create a top-level std_core module that contains several
dependency-free utilities used (directly or indirectly) from the __math
subdirectory. This is needed because __math pulls in a bunch of stuff,
and __math is used from the C compatibility header <math.h>.

As a direct benefit of this change, we don't need to generate an
artificial __std_clang_module header anymore to provide a monolithic
`std` module, since our modulemap does it naturally by construction.

A next step after this change would be to look into whether math.h
really needs to include the contents of __math, and if so, whether
libc++'s math.h truly needs to include the C library's math.h header.
Removing either dependency would break this annoying cycle.

Thanks to Eric Fiselier for pointing out this approach during a recent
meeting. This wasn't viable before some recent refactoring, but wrapping
everything (except the C headers) in a large module is by far the
simplest and the most effective way of doing this.

Fixes #86193
2024-09-26 13:19:48 -04:00
Louis Dionne
09e3a36058
[libc++][modules] Fix missing and incorrect includes (#108850)
This patch adds a large number of missing includes in the libc++ headers
and the test suite. Those were found as part of the effort to move
towards a mostly monolithic top-level std module.
2024-09-16 15:06:20 -04:00
Nikolas Klauser
748023dc32
[libc++][NFC] Replace _LIBCPP_NORETURN and TEST_NORETURN with [[noreturn]] (#80455)
`[[__noreturn__]]` is now always available, so we can simply use the
attribute directly instead of through a macro.
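
As a small illustration of using the attribute directly rather than
through a macro, assuming hypothetical `fail` and `checked_div` helpers
that are not part of libc++:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: always throws, so control never returns to the caller.
[[noreturn]] void fail(const char* message) { throw std::runtime_error(message); }

// Because fail() is [[noreturn]], the division is only reached when
// divisor != 0.
int checked_div(int dividend, int divisor) {
  if (divisor == 0)
    fail("division by zero");
  return dividend / divisor;
}

// The normal path returns a value; the error path leaves via an exception.
void example() {
  assert(checked_div(6, 3) == 2);
  bool threw = false;
  try {
    checked_div(1, 0);
  } catch (const std::runtime_error&) {
    threw = true;
  }
  assert(threw);
}
```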
2024-09-11 08:59:46 +02:00
Louis Dionne
348e74139a
[libc++][NFC] Run clang-format on libcxx/include
This re-formats a few headers that had become out-of-sync with respect
to formatting since we ran clang-format on the whole codebase. There are
surprisingly few instances of it.
2024-08-30 12:09:36 -04:00
Nikolas Klauser
5c717d6b1d
[libc++] re-enable clang-tidy in the CI and fix any issues (#102658)
It looks like we've accidentally disabled clang-tidy in the CI. This
re-enables it and fixes the issues accumulated while it was disabled.
2024-08-10 10:08:41 +02:00
Mark de Wever
f08df56d3a
[libc++][format] Implements P3107R5 in <format>. (#86713)
This adds the new std::enable_nonlocking_formatter_optimization trait in
<format>. This trait will be used in std::print to implement the
performance benefits.

Implements parts of
- P3107R5 - Permit an efficient implementation of ``std::print``
2024-07-30 19:04:26 +02:00
A. Jiang
ca055bbec7
[libc++][format] LWG4061: Should std::basic_format_context be default-constructible/copyable/movable? (#97251)
See [LWG4061](https://cplusplus.github.io/LWG/issue4061) and
[P3341R0](https://wg21.link/p3341r0). Effectively reverts commit
36ce0c3b1e581ca310ae7d0cbc6af002cc5d0251.


`libcxx/test/std/utilities/format/format.functions/bug_81590.compile.pass.cpp`
has a `format` function that unexpectedly takes the
`basic_format_context` by value, which is made ill-formed by LWG4061.
This PR changes the function to take the context by reference.
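
Why taking the context by value stops working can be sketched without
`<format>` itself. The `context` class below is a hypothetical stand-in
that only assumes the context is no longer copyable; `point` and
`format_point` are likewise made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a format context that cannot be copied.
class context {
public:
  explicit context(std::string& out) : out_(out) {}
  context(const context&) = delete;
  context& operator=(const context&) = delete;

  void append(const std::string& text) { out_ += text; }

private:
  std::string& out_;
};

static_assert(!std::is_copy_constructible<context>::value,
              "passing the context by value would require a copy");

struct point {
  int x;
  int y;
};

// Takes the context by reference; a by-value parameter could not be
// initialized from the caller's context object.
void format_point(const point& p, context& ctx) {
  ctx.append("(" + std::to_string(p.x) + ", " + std::to_string(p.y) + ")");
}

void example() {
  std::string output;
  context ctx(output);
  format_point(point{1, 2}, ctx);
  assert(output == "(1, 2)");
}
```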
2024-07-09 12:23:50 +02:00
A. Jiang
96c9913332
[libc++][format] LWG4106: basic_format_args should not be default-constructible (#97250)
See [LWG4106](https://cplusplus.github.io/LWG/issue4106) and
[P3341R0](https://wg21.link/p3341r0).

The test coverage for the empty state of `basic_format_args` in
`get.pass.cpp` is to be completely removed, because the
non-default-constructibility is covered in `ctor.pass.cpp`.
2024-07-09 12:21:30 +02:00
Louis Dionne
04a75f54a1
[libc++] Properly define _LIBCPP_HAS_NO_UNICODE in __config_site (#95138)
Fixes #93638

Co-authored-by: Mark de Wever <koraq@xs4all.nl>
2024-06-18 14:22:33 -04:00
Louis Dionne
e2c2ffbe7a
[libc++][NFC] Run clang-format on libcxx/include again (#95874)
As time went by, a few files have become mis-formatted w.r.t.
clang-format. This was made worse by the fact that formatting was not
being enforced in extensionless headers. This commit simply brings all
of libcxx/include in line with clang-format again.

We might have to do this from time to time as we update our clang-format
version, but frankly this is really low effort now that we've formatted
everything once.
2024-06-18 09:13:45 -04:00
Eisuke Kawashima
88184e5060
[libc++] Fix invalid escape sequences in Python comments (#94032) 2024-06-10 09:38:31 -04:00
Mark de Wever
e3dea5e341
[libc++][format] Improves escaping performance. (#88533)
The previous patch implemented
- P2713R1 Escaping improvements in std::format
- LWG3965 Incorrect example in [format.string.escaped] p3 for formatting
of combining characters

These changes were correct, but came with a size and performance
penalty. This patch reduces both. Performance is still worse than before
the paper was implemented, since a lookup may now require two property
lookups instead of one. The changes couple the Unicode data and the
algorithm more tightly, so additional tests are added to flag changes in
future Unicode updates.

Before
```
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           110704 ns       110696 ns         6206
BM_unicode_escaped<char>         101371 ns       101374 ns         6862
BM_cyrillic_escaped<char>         63329 ns        63327 ns        11013
BM_japanese_escaped<char>         41223 ns        41225 ns        16938
BM_emoji_escaped<char>           111022 ns       111021 ns         6304
BM_ascii_escaped<wchar_t>        112441 ns       112443 ns         6231
BM_unicode_escaped<wchar_t>      102776 ns       102779 ns         6813
BM_cyrillic_escaped<wchar_t>      58977 ns        58975 ns        11868
BM_japanese_escaped<wchar_t>      36885 ns        36886 ns        18975
BM_emoji_escaped<wchar_t>        115885 ns       115881 ns         6051
```

The first change is to manually encode the entire last area and make a
manual exception for the 240 excluded entries. This reduced the table
from 1077 to 729 entries and gave the following benchmark results.
```
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           104777 ns       104776 ns         6550
BM_unicode_escaped<char>          96980 ns        96982 ns         7238
BM_cyrillic_escaped<char>         60254 ns        60251 ns        11670
BM_japanese_escaped<char>         44452 ns        44452 ns        15734
BM_emoji_escaped<char>           104557 ns       104551 ns         6685
BM_ascii_escaped<wchar_t>        107456 ns       107454 ns         6505
BM_unicode_escaped<wchar_t>       96219 ns        96216 ns         7301
BM_cyrillic_escaped<wchar_t>      56921 ns        56904 ns        12288
BM_japanese_escaped<wchar_t>      39530 ns        39529 ns        17492
BM_emoji_escaped<wchar_t>        108494 ns       108496 ns         6408
```

An entry in the table can only cover 2048 code points. Larger ranges are
split into multiple entries, each covering at most 2048 code points.
Encoding the entire Unicode code point range requires 21 bits. Since the
manual part starts at 0x323B0, every code point in the table fits in 18
bits. This frees 3 additional bits for the range size, so an entry can
cover 16384 code points. With this size, no range in the table needs to
be split into multiple chunks.
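
The packing described above can be sketched as follows. The exact bit
layout (first code point in the upper 18 bits, range size in the lower
14 bits), the names, and the two table entries are assumptions made for
illustration; only the boundary 0x323B0 and the excluded range
E0100..E01EF come from the commit messages.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Assumed layout: upper 18 bits = first code point of a range,
// lower 14 bits = number of code points following the first one.
constexpr std::uint32_t size_bits = 14;
constexpr std::uint32_t size_mask = (1u << size_bits) - 1;

constexpr std::uint32_t make_entry(std::uint32_t first, std::uint32_t extra) {
  return (first << size_bits) | extra;
}

// Sample data only, sorted by first code point; not the real table.
constexpr std::uint32_t table[] = {
    make_entry(0x0000, 0x1f), // 0x00..0x1f
    make_entry(0x007f, 0x20), // 0x7f..0x9f
};

static_assert(0x323b0u < (1u << 18), "table code points must fit in 18 bits");

bool needs_escape(std::uint32_t code_point) {
  assert(code_point <= 0x10ffff);
  // Manual exception for the 240 excluded code points.
  if (code_point >= 0xe0100 && code_point <= 0xe01ef)
    return false;
  // Manual part: everything from the upper boundary onwards.
  if (code_point >= 0x323b0)
    return true;
  // Find the last entry whose first code point is <= code_point.
  const std::uint32_t key = (code_point << size_bits) | size_mask;
  const std::uint32_t* it = std::upper_bound(std::begin(table), std::end(table), key);
  if (it == std::begin(table))
    return false;
  --it;
  const std::uint32_t first = *it >> size_bits;
  const std::uint32_t last  = first + (*it & size_mask);
  return code_point <= last;
}
```

Setting all size bits in the search key makes every entry that starts at
or before the code point compare less than or equal to the key, so a
single binary search finds the candidate range.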

This reduces the number of table elements from 729 to 711 and gives the
following benchmark results.
```
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           104289 ns       104289 ns         6619
BM_unicode_escaped<char>          96682 ns        96681 ns         7215
BM_cyrillic_escaped<char>         59673 ns        59673 ns        11732
BM_japanese_escaped<char>         41983 ns        41982 ns        16646
BM_emoji_escaped<char>           104119 ns       104120 ns         6683
BM_ascii_escaped<wchar_t>        104503 ns       104505 ns         6693
BM_unicode_escaped<wchar_t>       93426 ns        93423 ns         7489
BM_cyrillic_escaped<wchar_t>      54858 ns        54859 ns        12742
BM_japanese_escaped<wchar_t>      36385 ns        36384 ns        19259
BM_emoji_escaped<wchar_t>        105608 ns       105610 ns         6592
```
2024-04-28 12:15:25 +02:00
Mark de Wever
ad76a85954
[libc++][format] Improves escaping. (#88283)
The change increases the size of the lookup table considerably. The
table has an "upper boundary" check. Removing the code points with the
property Grapheme_Extend=Yes removes the range E0100..E01EF, which
breaks the trailing large contiguous section into two parts. This will
be improved in a follow-up patch.

Implements:
- P2713R1 Escaping improvements in std::format
- LWG3965 Incorrect example in [format.string.escaped] p3 for formatting
of combining characters

```
---------------------------------------------------------
Benchmark                           Before          After    
---------------------------------------------------------
BM_ascii_escaped<char>            95696 ns      110704 ns
BM_unicode_escaped<char>          89311 ns      101371 ns
BM_cyrillic_escaped<char>         58633 ns       63329 ns
BM_japanese_escaped<char>         44500 ns       41223 ns
BM_emoji_escaped<char>            99156 ns      111022 ns
BM_ascii_escaped<wchar_t>         92245 ns      112441 ns
BM_unicode_escaped<wchar_t>       80970 ns      102776 ns
BM_cyrillic_escaped<wchar_t>      51253 ns       58977 ns
BM_japanese_escaped<wchar_t>      37252 ns       36885 ns
BM_emoji_escaped<wchar_t>         96226 ns      115885 ns
```
2024-04-25 17:16:41 +02:00
Nikolas Klauser
83bc7b5771
[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests (#87094)
This also adds a few tests that were missing.
2024-04-22 22:13:58 +02:00
Nikolas Klauser
472b612ccb
[libc++][NFC] Remove unused includes from <__type_traits/remove_cv.h> (#88752) 2024-04-18 06:55:50 +02:00
Mark de Wever
59e66c515a
[libc++][format] Switches to Unicode 15.1. (#86543)
In addition to changes in the tables, the extended grapheme clustering
algorithm has been overhauled. I had previously considered a separate
state machine to implement the rules. With the new rule GB9c this became
more attractive, and the design has changed accordingly.

This change initially had quite an impact on performance. Making the
state machine persistent improved performance greatly. Note that it is
still slower than before due to the larger Unicode tables.

Before
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             1891 ns         1889 ns       369504
BM_unicode_text<char>         106642 ns       106397 ns         6576
BM_cyrillic_text<char>         73420 ns        73277 ns         9445
BM_japanese_text<char>         62485 ns        62387 ns        11153
BM_emoji_text<char>             1895 ns         1893 ns       369525
BM_ascii_text<wchar_t>          2015 ns         2013 ns       346887
BM_unicode_text<wchar_t>       92119 ns        92017 ns         7598
BM_cyrillic_text<wchar_t>      62637 ns        62568 ns        11117
BM_japanese_text<wchar_t>      53850 ns        53785 ns        12803
BM_emoji_text<wchar_t>          2016 ns         2014 ns       347325

After
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             1906 ns         1904 ns       369409
BM_unicode_text<char>         265462 ns       265175 ns         2628
BM_cyrillic_text<char>        181063 ns       180865 ns         3871
BM_japanese_text<char>        130927 ns       130789 ns         5324
BM_emoji_text<char>             1892 ns         1890 ns       370537
BM_ascii_text<wchar_t>          2038 ns         2035 ns       343689
BM_unicode_text<wchar_t>      277603 ns       277282 ns         2526
BM_cyrillic_text<wchar_t>     188558 ns       188339 ns         3727
BM_japanese_text<wchar_t>     133084 ns       132943 ns         5262
BM_emoji_text<wchar_t>          2012 ns         2010 ns       348015

Persistent
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             1904 ns         1899 ns       367472
BM_unicode_text<char>         133609 ns       133287 ns         5246
BM_cyrillic_text<char>         90185 ns        89941 ns         7796
BM_japanese_text<char>         75137 ns        74946 ns         9316
BM_emoji_text<char>             1906 ns         1901 ns       368081
BM_ascii_text<wchar_t>          2703 ns         2696 ns       259153
BM_unicode_text<wchar_t>      131497 ns       131168 ns         5341
BM_cyrillic_text<wchar_t>      87071 ns        86840 ns         8076
BM_japanese_text<wchar_t>      72279 ns        72099 ns         9682
BM_emoji_text<wchar_t>          2021 ns         2016 ns       346767
2024-04-09 19:20:06 +02:00
Brian Cain
e1830f586a
[libcxx] coerce formatter precision to int (#87738)
__precision_ is declared as an int32_t which on some hexagon platforms
is defined as a long.

This change fixes errors like the ones below:

In file included from
/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:19:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/format:202:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:29:

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:700:17:
error: no matching function for call to 'max'
700 | int __p = std::max(1, (__specs.__has_precision() ?
__specs.__precision_ : 6));
          |                 ^~~~~~~~

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:771:25:
note: in instantiation of function template specialization
'std::__formatter::__format_floating_point<float, char,
std::format_context>' requested here
771 | return __formatter::__format_floating_point(__value, __ctx,
__parser_.__get_parsed_std_specifications(__ctx));
          |                         ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:284:42:
note: in instantiation of function template specialization
'std::__formatter_floating_point<char>::format<float,
std::format_context>' requested here
284 | __ctx.advance_to(__formatter.format(__arg, __ctx));
          |                                          ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:429:15:
note: in instantiation of function template specialization
'std::__vformat_to<std::back_insert_iterator<std::string>, char,
std::back_insert_iterator<std::__format::__output_buffer<char>>>'
requested here
429 | return std::__vformat_to(std::move(__out_it), __fmt, __args);
          |               ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:462:8:
note: in instantiation of function template specialization
'std::vformat_to<std::back_insert_iterator<std::string>>' requested here
      462 |   std::vformat_to(std::back_inserter(__res), __fmt, __args);
          |        ^

/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:29:8:
note: in instantiation of function template specialization
'std::vformat<void>' requested here
       29 |   std::vformat("", std::make_format_args());
          |        ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:35:1:
note: candidate template ignored: deduced conflicting types for
parameter '_Tp' ('int' vs. 'int32_t' (aka 'long'))
35 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b) {
          | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:43:1:
note: candidate template ignored: could not match
'initializer_list<_Tp>' against 'int'
       43 | max(initializer_list<_Tp> __t, _Compare __comp) {
          | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:48:86:
note: candidate function template not viable: requires single argument
'__t', but 2 arguments were provided
48 | _LIBCPP_NODISCARD_EXT inline _LIBCPP_HIDE_FROM_ABI
_LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp max(initializer_list<_Tp> __t) {
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:29:1:
note: candidate function template not viable: requires 3 arguments, but
2 were provided
29 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b, _Compare __comp) {
| ^
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
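
The diagnostics above come down to template argument deduction: the two-argument `std::max` needs both arguments to have the same type, and here one is `int` while the other is `int32_t` (a `long` on the affected targets). The following is a minimal sketch of that situation, not the libc++ code itself; `precision_t`, `effective_precision` and `check_effective_precision` are hypothetical names, and `precision_t` is assumed to be `long` to mimic the platform described.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for int32_t on a platform where it is a long.
using precision_t = long;

// Mirrors the shape of the failing expression: default precision 6,
// clamped to a minimum of 1.
int effective_precision(bool has_precision, precision_t precision) {
  // std::max(1, has_precision ? precision : 6) would not compile here:
  // the conditional has type long, so _Tp is deduced as both int and long.
  // Coercing to int first makes both arguments agree.
  int requested = has_precision ? static_cast<int>(precision) : 6;
  return std::max(1, requested);
}

// Small self-check of the behavior described above.
void check_effective_precision() {
  assert(effective_precision(false, 0) == 6);
  assert(effective_precision(true, 0) == 1);
  assert(effective_precision(true, 12) == 12);
}
```

An explicit template argument, `std::max<int>(1, ...)`, would be another way to resolve the conflict; which form the patch uses is not stated in the message.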
2024-04-05 11:06:37 -05:00
Nikolas Klauser
beeb15b716
[libc++][NFC] Remove a few unused <__availablity> includes (#86126) 2024-04-02 13:52:07 +02:00
Nikolas Klauser
316634ff59
[libc++] Remove <queue> and <stack> includes from <format> (#85520)
This reduces the include time of <format> from 691ms to 556ms.
2024-03-29 12:06:09 +01:00
Mark de Wever
d179176f3e
[libc++][format] Adds ABI tags to inline constexpr variables. (#86293)
This uses the macro on record types and inline constexpr variables. The
tagged declarations are very likely to change in future versions of
libc++:
- __fields are internal types that tell the formatter's parse functions
which fields to expect. Newer formatters may add new fields.
For example the filesystem::path formatter accepted in the recent Tokyo
meeting added a new 'g' flag, which differs from the 'g' type.
- The Unicode tables. The number of entries in these tables likely differs
between Unicode versions. The tables contain only a part of all Unicode
properties. Typically they are stored in a 32-bit entry where some bits
contain the properties and other bits the size of the range. Changes in
the Unicode or C++ algorithms may require more properties to be
available in C++. This may affect the number of bits available in the
range. If needed, other declarations will get the macro too. This patch
is mainly a first opportunity to review this approach.

This was originally https://reviews.llvm.org/D143494 where a new macro
_LIBCPP_HIDE_FROM_ABI_TYPE was defined. Testing revealed the existing
macro _LIBCPP_HIDE_FROM_ABI could be used. The "parts" of the macro that
do not affect records are not harmful. Based on this information the
existing macro was used and additional documentation was written.
2024-03-25 18:33:30 +01:00
Mark de Wever
11dd881b9c
[libc++][format] Fixes nested concept evaluation. (#85548)
Before, the __formattable concept depended on itself in a contrived
example. By using the underlying concept directly the cycle is broken.

Fixes https://github.com/llvm/llvm-project/issues/81590
2024-03-20 09:45:12 +01:00
Nikolas Klauser
5bcb78141c
[libc++] Remove <locale> includes from <format> (#85478)
This reduces the include time from 767ms to 691ms.
2024-03-16 13:45:24 +01:00
Nikolas Klauser
4528c44d0a
[libc++] Remove <tuple> include from <__format/concepts.h> (#80214)
This also moves `tuple_size_v` into `tuple_size` as a drive-by.
2024-03-14 12:04:41 +01:00
Nikolas Klauser
0876668114
[libc++][NFC] Move __format/format_fwd.h to __fwd/format.h (#84336) 2024-03-08 20:43:10 +01:00
Po-yao Chang
b29301cd40
[libc++][format] Handle range-underlying-spec (#81914)
An immediate colon signifies that the range-format-spec contains only
range-underlying-spec.

This patch allows this code to compile and run:
```c++
std::println("{::<<9?}", std::span<const char>{"Hello", sizeof "Hello"});
```
2024-03-04 08:05:01 +08:00
Louis Dionne
37dca605c9
[libc++] Clean up includes of <__assert> (#80091)
Originally, we used __libcpp_verbose_abort to handle assertion failures.
That function was declared from all public headers. Since we don't use
that mechanism anymore, we don't need to declare __libcpp_verbose_abort
from all public headers, and we can clean up a lot of unnecessary
includes.

This patch also moves the definition of the various assertion categories
to the <__assert> header, since we now rely on regular IWYU for these
assertion macros.

rdar://105510916
2024-02-29 10:12:22 -05:00
Po-yao Chang
08fe7df600
[libc++][format] Don't treat a closing '}' as part of format-spec (#81305)
This allows:
```
std::println("{}>42", std::thread::id{});
std::println("{}>42", std::span<int>{});
std::println("{}>42", std::pair{42, "Hello"sv});
std::println("{:}>42", std::thread::id{});
std::println("{:}>42", std::span<int>{});
std::println("{:}>42", std::pair{42, "Hello"sv});
```
to compile and run.
2024-02-16 02:41:07 +08:00
Nikolas Klauser
ffb3589b8c
[libc++] Remove transitive <locale> include from <vector> (#80282)
This reduces the time to include `<vector>` from 468ms to 367ms.
2024-02-02 11:33:08 +01:00
Louis Dionne
683bc94e16
[libc++] Officially remove _VSTD and _LIBCPP_INLINE_VISIBILITY (#79885)
Those were deprecated and basically not used anymore after we renamed
them in batch. This patch removes the macros entirely.
2024-01-30 13:51:20 +01:00
Hristo Hristov
27e67cdb31
Reland: [libc++][format] P2637R3: Member visit (std::basic_format_arg) #76449 (#79032)
Deleted the offending test case in
`libcxx/test/std/utilities/format/format.arguments/format.arg/visit.return_type.pass.cpp`,
lines 134-135:

    test<Context, bool, long>(true, 192812079084L);
    test<Context, bool, long>(false, 192812079084L);

Relands: https://github.com/llvm/llvm-project/pull/76449
Reverted in: 02f95b7751

---------

Co-authored-by: Zingam <zingam@outlook.com>
2024-01-29 20:57:12 +02:00
Louis Dionne
7b4622514d
[libc++] Fix missing and incorrect push/pop macros (#79204)
We recently noticed that the unwrap_iter.h file was pushing macros, but
it was pushing them again instead of popping them at the end of the
file. This led to libc++ basically swallowing any custom definition of
these macros in user code:

    #define min HELLO
    #include <algorithm>
    // min is not HELLO anymore, it's not defined

While investigating this issue, I noticed that our push/pop pragmas were
actually entirely wrong too. Indeed, instead of pushing macros like
`move`, we'd push `move(int, int)` in the pragma, which is not a valid
macro name. As a result, we would not actually push macros like `move`
-- instead we'd simply undefine them. This led to the following code not
working:

    #define move HELLO
    #include <algorithm>
    // move is not HELLO anymore

Fixing the pragma push/pop incantations led to a cascade of issues
because we use identifiers like `move` in a large number of places, and
all of these headers would now need to do the push/pop dance.

This patch fixes all these issues. First, it adds a check that we don't
swallow important names like min, max, move or refresh as explained
above. This is done by augmenting the existing
system_reserved_names.gen.py test to also check that the macros are what
we expect after including each header.

Second, it fixes the push/pop pragmas to work properly and adds missing
pragmas to all the files I could detect a failure in via the newly added
test.
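
As a rough illustration of the push/pop dance described above, the sketch below saves a user-defined `min` macro, undefines it for the duration of a pretend header, and restores it afterwards. It assumes a compiler that supports `#pragma push_macro` / `#pragma pop_macro` (GCC, Clang and MSVC do); the `demo` namespace, the `DEMO_STR` helpers and the function names are hypothetical and not part of libc++.

```cpp
#include <cassert>
#include <cstring>

// Two-step stringification so the argument is macro-expanded first.
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

// User code defines a macro that collides with a library identifier.
#define min HELLO

// --- start of a hypothetical header ---
// Save the user's definition, then make `min` a plain identifier again.
#pragma push_macro("min")
#undef min

namespace demo {
inline int min(int a, int b) { return b < a ? b : a; }
inline int smallest_of_three(int a, int b, int c) { return min(min(a, b), c); }
} // namespace demo

// Restore the user's definition. Pushing again here instead of popping is
// the kind of mistake the message describes for unwrap_iter.h.
#pragma pop_macro("min")
// --- end of the hypothetical header ---

namespace demo {
// Back in user code, `min` expands to HELLO again.
inline const char* min_after_header() { return DEMO_STR(min); }

// Small self-check of the behavior described above.
inline void check_push_pop() {
  assert(std::strcmp(min_after_header(), "HELLO") == 0);
  assert(smallest_of_three(3, 1, 2) == 1);
}
} // namespace demo

// Keep the rest of this translation unit free of the demo macro.
#undef min
```

Note that the pragma takes the bare macro name as a string: `push_macro("min")`, not `push_macro("min(int, int)")`, which is the second mistake the message describes.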

rdar://121365472
2024-01-25 15:48:46 -05:00
Petr Hosek
02f95b7751 Revert "[libc++][format] P2637R3: Member visit (std::basic_format_arg) (#76449)"
This reverts commit 7d9b5aa65b09126031e1c2903605a7d34aea4bc1 since
std/utilities/format/format.arguments/format.arg/visit.return_type.pass.cpp
is failing on Windows when building with Clang-cl.
2024-01-22 17:23:05 +00:00
Hristo Hristov
7d9b5aa65b
[libc++][format] P2637R3: Member visit (std::basic_format_arg) (#76449)
Implements parts of: `P2637R3` https://wg21.link/P2637R3
(https://eel.is/c++draft/variant.visit)

Implements:
`basic_format_arg.visit()`
`basic_format_arg.visit<R>()`
Deprecates:
`std::visit_format_arg()`

The tests are as close as possible to the non-member function tests.

To land after: https://github.com/llvm/llvm-project/pull/76447,
https://github.com/llvm/llvm-project/pull/76268

---------

Co-authored-by: Zingam <zingam@outlook.com>
2024-01-21 12:30:25 +02:00
Konstantin Varlamov
4f215fdd62
[libc++][hardening] Categorize more assertions. (#75918)
Also introduce `_LIBCPP_ASSERT_PEDANTIC` for assertions whose violation
results in a no-op or other benign behavior, but which may nevertheless
indicate a bug in the invoking code.
2024-01-05 16:29:23 -08:00
bgra8
8c72ff716b
[NFC] Renames a template parameter to avoid clashes with userspace names. (#76829)
Co-authored-by: Bogdan Graur <bgraur@google.com>
2024-01-04 09:25:57 +01:00
Konstantin Varlamov
1638657dce
[libc++][hardening] Categorize more 'valid-element-access' checks. (#71620) 2023-12-20 17:24:48 -08:00
Louis Dionne
9783f28cbb
[libc++] Format the code base (#74334)
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.

This patch was generated with:

   find libcxx/include libcxx/src -type f \
      | grep -v 'module.modulemap.in' \
      | grep -v 'CMakeLists.txt' \
      | grep -v 'README.txt' \
      | grep -v 'libcxx.imp' \
      | grep -v '__config_site.in' \
      | xargs clang-format -i

A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.

[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
2023-12-18 14:01:33 -05:00
Louis Dionne
a35629cd8d
[libc++] Remove assumptions that std::array::iterator is a raw pointer (#74624)
This patch removes assumptions that std::array's iterators are raw
pointers in the source code and in our test suite. While this is true
right now, this doesn't have to be true and in the future we might want
to enable bounded iterators in std::array, which would require this
change.

This is a pre-requisite for landing #74482
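
To make the kind of assumption being removed concrete, here is a small sketch (not taken from the patch) that contrasts code relying on `std::array<int, N>::iterator` being `int*` with code that keeps working if the iterator becomes a class type. The function names `first_element`, `sum` and `check_array_helpers` are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fragile: only compiles while std::array<int, N>::iterator is int*.
//   int* first = values.begin();

// Portable: ask the container for its storage when a pointer is needed...
template <std::size_t N>
int* first_element(std::array<int, N>& values) {
  return values.data();
}

// ...and stick to the iterator interface when only traversal is needed.
template <std::size_t N>
int sum(const std::array<int, N>& values) {
  int total = 0;
  for (auto it = values.begin(); it != values.end(); ++it)
    total += *it;
  return total;
}

// Small self-check of the helpers above.
inline void check_array_helpers() {
  std::array<int, 3> values{1, 2, 3};
  assert(first_element(values) == &values[0]);
  assert(sum(values) == 6);
}
```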
2023-12-18 10:00:47 -05:00
Mark de Wever
e3f154d873
[libc++] Implements Runtime format strings. (#73353)
This change requires quite a number of changes in the tests; this is not
code I expect people to use in the wild. So I don't expect breakage for
users.

Implements:
- P2905R2 Runtime format strings, as a Defect Report
2023-12-09 12:32:17 +01:00
Louis Dionne
77a00c0d54
[libc++] Replace uses of _VSTD:: by std:: (#74331)
As part of the upcoming clang-formatting of libc++, this patch performs
the long desired removal of the _VSTD macro.

See https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
for the clang-format proposal.
2023-12-05 11:19:15 -05:00
Louis Dionne
b18a46e35d
[libc++][NFC] Add a few clang-format annotations (#74352)
This is in preparation for clang-formatting the whole code base. These
annotations are required either to avoid clang-format bugs or because
the manually formatted code is significantly more readable than the
clang-formatted alternative. All in all, it seems like very few
annotations are required, which means that clang-format is doing a very
good job in most cases.
2023-12-04 15:17:31 -05:00
Mark de Wever
16b8c9608f
[libc++][format] Fixes formatting code units as integers. (#73396)
This paper was voted in as a DR, so it's retroactively enabled back to
C++20, the C++ version that introduced std::format.

Implements:
- P2909R4 Fix formatting of code units as integers (Dude, where’s my
``char``?)
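
As I understand the paper, the fix is to convert a code unit to its unsigned counterpart before applying an integer presentation type, so the output no longer depends on whether plain `char` is signed. The sketch below illustrates that idea with `snprintf` rather than `std::format`, so it does not exercise libc++ itself; `code_unit_as_hex`, `code_unit_as_decimal` and `check_code_unit_formatting` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: prints a code unit in hex via its unsigned type,
// so the result does not depend on the signedness of plain char.
std::string code_unit_as_hex(char c) {
  char buffer[16];
  unsigned value = static_cast<unsigned char>(c);
  std::snprintf(buffer, sizeof buffer, "%x", value);
  return std::string(buffer);
}

// Same idea for the decimal presentation.
std::string code_unit_as_decimal(char c) {
  char buffer[16];
  unsigned value = static_cast<unsigned char>(c);
  std::snprintf(buffer, sizeof buffer, "%u", value);
  return std::string(buffer);
}

// Small self-check of the helpers above.
void check_code_unit_formatting() {
  assert(code_unit_as_hex('a') == "61");
  assert(code_unit_as_hex('\x80') == "80");
  assert(code_unit_as_decimal('\x80') == "128");
}
```

Promoting the `char` straight to `int` instead would yield a negative value for `'\x80'` on platforms where `char` is signed, which is the platform-dependent behavior the paper title alludes to.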
2023-11-29 17:55:09 +01:00
Mark de Wever
92d9f232dd
[libc++] Implements Runtime format strings II. (#72543)
Implements
- P2918R2 Runtime format strings II
2023-11-24 17:30:33 +01:00
Hans Wennborg
e2fc68c3db Typos: 'maxium', 'minium' 2023-10-23 10:42:28 +02:00
Igor Zhukov
70248920fc [libc++][test] Add '-Wdeprecated-copy', '-Wdeprecated-copy-dtor' warnings to the test suite
This is a follow up to https://reviews.llvm.org/D144694.
Fixes https://github.com/llvm/llvm-project/issues/60977.

Differential Revision: https://reviews.llvm.org/D144775
2023-09-12 08:53:38 -04:00
Louis Dionne
b397921fc7 [runtimes] Fix some duplicate word typos
Those fixes were taken from https://reviews.llvm.org/D137338.
2023-08-31 11:55:10 -04:00