llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-30 19:06:04 +00:00

Author	SHA1	Message	Date
Nikolas Klauser	e99c4906e4	[libc++] Granularize <cstddef> includes (#108696)	2024-10-31 02:20:10 +01:00
Louis Dionne	348e74139a	[libc++][NFC] Run clang-format on libcxx/include This re-formats a few headers that had become out-of-sync with respect to formatting since we ran clang-format on the whole codebase. There's surprisingly few instances of it.	2024-08-30 12:09:36 -04:00
Eisuke Kawashima	88184e5060	[libc++] Fix invalid escape sequences in Python comments (#94032)	2024-06-10 09:38:31 -04:00
AngryLoki	ae858b5123	[libc++] Fix SyntaxWarning messages from python 3.12 (#93637) This fixes "SyntaxWarning: invalid escape sequence" and "SyntaxWarning: `is` with int literal". transitive_includes.gen.py was also reformatted with darker per the style guide. Signed-off-by: Sv. Lockal <lockalsash@gmail.com>	2024-06-05 10:31:03 -04:00
Mark de Wever	e3dea5e341	[libc++][format] Improves escaping performance. (#88533)

The previous patch implemented
- P2713R1 Escaping improvements in std::format
- LWG3965 Incorrect example in [format.string.escaped] p3 for formatting of combining characters

These changes were correct, but had a size and performance penalty. This patch improves the size and performance of the previous patch. The performance is still worse than before since the lookups may require two property lookups instead of one before implementing the paper. The changes give a tighter coupling between the Unicode data and the algorithm. Additional tests are added to notify about changes in future Unicode updates.

Before
```
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           110704 ns       110696 ns         6206
BM_unicode_escaped<char>         101371 ns       101374 ns         6862
BM_cyrillic_escaped<char>         63329 ns        63327 ns        11013
BM_japanese_escaped<char>         41223 ns        41225 ns        16938
BM_emoji_escaped<char>           111022 ns       111021 ns         6304
BM_ascii_escaped<wchar_t>        112441 ns       112443 ns         6231
BM_unicode_escaped<wchar_t>      102776 ns       102779 ns         6813
BM_cyrillic_escaped<wchar_t>      58977 ns        58975 ns        11868
BM_japanese_escaped<wchar_t>      36885 ns        36886 ns        18975
BM_emoji_escaped<wchar_t>        115885 ns       115881 ns         6051
```

The first change is to manually encode the entire last area and make a manual exception for the 240 excluded entries. This reduced the table from 1077 to 729 entries and gave the following benchmark results.
```
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           104777 ns       104776 ns         6550
BM_unicode_escaped<char>          96980 ns        96982 ns         7238
BM_cyrillic_escaped<char>         60254 ns        60251 ns        11670
BM_japanese_escaped<char>         44452 ns        44452 ns        15734
BM_emoji_escaped<char>           104557 ns       104551 ns         6685
BM_ascii_escaped<wchar_t>        107456 ns       107454 ns         6505
BM_unicode_escaped<wchar_t>       96219 ns        96216 ns         7301
BM_cyrillic_escaped<wchar_t>      56921 ns        56904 ns        12288
BM_japanese_escaped<wchar_t>      39530 ns        39529 ns        17492
BM_emoji_escaped<wchar_t>        108494 ns       108496 ns         6408
```

An entry in the table can only contain 2048 code points. For larger ranges there are multiple entries split in chunks with a maximum size of 2048 entries. To encode the entire Unicode code point range 21 bits are required. The manual part starts at 0x323B0 this means all entries in the table fit in 18 bits. This allows to allocate 3 additional bits for the range. This allows entries to have 16384 elements. This range always avoids splitting the range in multiple chunks.

This reduces the number of table elements from 729 to 711 and gives the following benchmark results.
```
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           104289 ns       104289 ns         6619
BM_unicode_escaped<char>          96682 ns        96681 ns         7215
BM_cyrillic_escaped<char>         59673 ns        59673 ns        11732
BM_japanese_escaped<char>         41983 ns        41982 ns        16646
BM_emoji_escaped<char>           104119 ns       104120 ns         6683
BM_ascii_escaped<wchar_t>        104503 ns       104505 ns         6693
BM_unicode_escaped<wchar_t>       93426 ns        93423 ns         7489
BM_cyrillic_escaped<wchar_t>      54858 ns        54859 ns        12742
BM_japanese_escaped<wchar_t>      36385 ns        36384 ns        19259
BM_emoji_escaped<wchar_t>        105608 ns       105610 ns         6592
```
	2024-04-28 12:15:25 +02:00
Mark de Wever	ad76a85954	[libc++][format] Improves escaping. (#88283)

The change increments the size of the lookup table considerably. The table has an "upper boundary" check. The removal of the code units with the property Grapheme_Extend=Yes removes the range E0100..E01EF. This breaks the trailing large continuous section in two parts. This will be improved in a followup patch.

Implements:
- P2713R1 Escaping improvements in std::format
- LWG3965 Incorrect example in [format.string.escaped] p3 for formatting of combining characters

```
---------------------------------------------------------
Benchmark                           Before          After
---------------------------------------------------------
BM_ascii_escaped<char>            95696 ns      110704 ns
BM_unicode_escaped<char>          89311 ns      101371 ns
BM_cyrillic_escaped<char>         58633 ns       63329 ns
BM_japanese_escaped<char>         44500 ns       41223 ns
BM_emoji_escaped<char>            99156 ns      111022 ns
BM_ascii_escaped<wchar_t>         92245 ns      112441 ns
BM_unicode_escaped<wchar_t>       80970 ns      102776 ns
BM_cyrillic_escaped<wchar_t>      51253 ns       58977 ns
BM_japanese_escaped<wchar_t>      37252 ns       36885 ns
BM_emoji_escaped<wchar_t>         96226 ns      115885 ns
```
	2024-04-25 17:16:41 +02:00
Mark de Wever	d179176f3e	[libc++][format] Adds ABI tags to inline constexpr variables. (#86293)

This uses the macro on record types and inline constexpr variables. The tagged declarations are very likely to change in future versions of libc++:
- __fields are internal types used to tell the formatter's parse functions which fields to expect. Newer formatters may add new fields. For example the filesystem::path formatter accepted in the recent Tokyo meeting added a new 'g' flag, which differs from the 'g' type.
- The Unicode tables. The number of entries in these tables likely differs between Unicode versions. The tables contain only a part of all Unicode properties. Typically they are stored in a 32-bit entry where some bits contain the properties and other bits the size of the range. Changes in the Unicode or C++ algorithms may require more properties to be available in C++. This may affect the number of bits available in the range.

If needed, other declarations get the macro. This is mainly a first opportunity to review this approach.

This was originally https://reviews.llvm.org/D143494 where a new macro _LIBCPP_HIDE_FROM_ABI_TYPE was defined. Testing revealed the existing macro _LIBCPP_HIDE_FROM_ABI could be used. The "parts" of the macro that do not affect records are not harmful. Based on this information the existing macro was used and additional documentation was written.
	2024-03-25 18:33:30 +01:00
Louis Dionne	b18a46e35d	[libc++][NFC] Add a few clang-format annotations (#74352) This is in preparation for clang-formatting the whole code base. These annotations are required either to avoid clang-format bugs or because the manually formatted code is significantly more readable than the clang-formatted alternative. All in all, it seems like very few annotations are required, which means that clang-format is doing a very good job in most cases.	2023-12-04 15:17:31 -05:00
Stephan T. Lavavej	0d3c40b82b	[libc++] Remove unused Python imports (#73724) VSCode's Pylance extension informed me, and text searching confirmed, that these imports are unused. I believe we should be able to remove them harmlessly.	2023-11-29 09:25:06 -05:00
Nikolas Klauser	4f15267d3d	[libc++][NFC] Replace _LIBCPP_STD_VER > x with _LIBCPP_STD_VER >= x This change is almost fully mechanical. The only interesting change is in `generate_feature_test_macro_components.py` to generate `_LIBCPP_STD_VER >=` instead. To avoid churn in the git-blame this commit should be added to the `.git-blame-ignore-revs` once committed. Reviewed By: ldionne, var-const, #libc Spies: jloser, libcxx-commits, arichardson, arphaman, wenlei Differential Revision: https://reviews.llvm.org/D143962	2023-02-15 16:52:25 +01:00
Mark de Wever	a48007355a	[libc++][format] Implements string escaping. Implements parts of - P2286R8 Formatting Ranges Reviewed By: #libc, tahonermann Differential Revision: https://reviews.llvm.org/D134036	2022-10-20 17:29:34 +02:00

11 Commits
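
The bit arithmetic described in commit e3dea5e341 can be illustrated with a short sketch. This is not the code from the patch: the helper names (`pack_entry`, `unpack_entry`, `split_range`) are hypothetical, and the layout chosen here (lower bound in the high 18 bits, size minus one in the low 14 bits of a 32-bit entry) is an assumption made only for illustration. The commit message states only that 18 bits suffice for the lower bound and that the 3 freed bits raise the maximum range size from 2048 to 16384.

```python
ENTRY_BITS = 32
LOWER_BITS = 18                       # enough for code points below the hand-coded area at 0x323B0
SIZE_BITS = ENTRY_BITS - LOWER_BITS   # 14 bits, i.e. up to 16384 code points per entry


def pack_entry(lower: int, size: int) -> int:
    """Pack a range of `size` code points starting at `lower` into one 32-bit value.

    Assumed layout (illustrative only): lower bound in the high bits,
    size - 1 in the low bits.
    """
    if not 0 <= lower < (1 << LOWER_BITS):
        raise ValueError("lower bound does not fit in 18 bits")
    if not 1 <= size <= (1 << SIZE_BITS):
        raise ValueError("size does not fit in 14 bits")
    return (lower << SIZE_BITS) | (size - 1)


def unpack_entry(entry: int) -> tuple[int, int]:
    """Return (lower, size) for a value produced by pack_entry."""
    return entry >> SIZE_BITS, (entry & ((1 << SIZE_BITS) - 1)) + 1


def split_range(lower: int, size: int, max_size: int) -> list[tuple[int, int]]:
    """Split a range into chunks of at most max_size code points."""
    chunks = []
    while size > 0:
        chunk = min(size, max_size)
        chunks.append((lower, chunk))
        lower += chunk
        size -= chunk
    return chunks


# A 6400 code point range (the size of the Private Use Area block at U+E000)
# needs several entries with an 11-bit size field, but one with 14 bits.
old_chunks = split_range(0xE000, 6400, 1 << 11)
new_chunks = split_range(0xE000, 6400, 1 << SIZE_BITS)
```

With a 2048-element limit the example range is split into four chunks, while the 16384-element limit keeps it in a single entry, which is the effect the commit credits for shrinking the table from 729 to 711 entries.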