mirror of
https://github.com/llvm/llvm-project.git
synced 2025-05-07 21:16:06 +00:00

As is obvious from the paper's title, this is an LWG issue and thus retroactively applied to C++20.

This change may alter the output for certain code points:
1. Considers 8477 extra code points as having a width of 2 (as of Unicode 15), mostly Tangut Ideographs.
2. Changes the width of 85 unassigned code points from 2 to 1.
3. Changes the width of 8 code points (in the range U+3248 CIRCLED NUMBER TEN ON BLACK SQUARE ... U+324F CIRCLED NUMBER EIGHTY ON BLACK SQUARE) from 2 to 1, because it seems questionable to make an exception for those without input from Unicode.

Note that libc++ already uses Unicode 15, while the Standard requires Unicode 12. (The last time I checked, MSVC STL used Unicode 14.) So in practice the only notable change is item 3.

Implements
- P2675 LWG3780: The Paper format's width estimation is too approximate and not forward compatible

Benchmark before these changes
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3928 ns         3928 ns       178131
BM_unicode_text<char>          75231 ns        75230 ns         9158
BM_cyrillic_text<char>         59837 ns        59834 ns        11529
BM_japanese_text<char>         39842 ns        39832 ns        17501
BM_emoji_text<char>             3931 ns         3930 ns       177750
BM_ascii_text<wchar_t>          4024 ns         4024 ns       174190
BM_unicode_text<wchar_t>       63756 ns        63751 ns        11136
BM_cyrillic_text<wchar_t>      44639 ns        44638 ns        15597
BM_japanese_text<wchar_t>      34425 ns        34424 ns        20283
BM_emoji_text<wchar_t>          3937 ns         3937 ns       177684

Benchmark after these changes
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3914 ns         3913 ns       178814
BM_unicode_text<char>          70380 ns        70378 ns         9694
BM_cyrillic_text<char>         51889 ns        51877 ns        13488
BM_japanese_text<char>         41707 ns        41705 ns        16723
BM_emoji_text<char>             3908 ns         3907 ns       177912
BM_ascii_text<wchar_t>          3949 ns         3948 ns       177525
BM_unicode_text<wchar_t>       64591 ns        64587 ns        10649
BM_cyrillic_text<wchar_t>      44089 ns        44078 ns        15721
BM_japanese_text<wchar_t>      39369 ns        39367 ns        17779
BM_emoji_text<wchar_t>          3936 ns         3934 ns       177821

Benchmarks without "if(__code_point < (__entries[0] >> 14))"
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3922 ns         3922 ns       178587
BM_unicode_text<char>          94474 ns        94474 ns         7351
BM_cyrillic_text<char>         69202 ns        69200 ns        10157
BM_japanese_text<char>         42735 ns        42692 ns        16382
BM_emoji_text<char>             3920 ns         3919 ns       178704
BM_ascii_text<wchar_t>          3951 ns         3950 ns       177224
BM_unicode_text<wchar_t>       81003 ns        80988 ns         8668
BM_cyrillic_text<wchar_t>      57020 ns        57018 ns        12048
BM_japanese_text<wchar_t>      39695 ns        39687 ns        17582
BM_emoji_text<wchar_t>          3977 ns         3976 ns       176479

This optimization does carry its weight for the Unicode and Cyrillic tests. For the Japanese tests the gains are minor, and for emoji it seems to have no effect.

Reviewed By: ldionne, tahonermann, #libc

Differential Revision: https://reviews.llvm.org/D144499
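For readers unfamiliar with the check named in the third benchmark heading, the following is a minimal sketch of how such an early exit might sit in front of a binary search over packed ranges. Only the expression `__entries[0] >> 14` comes from the commit message; the packing layout (first code point in the upper bits, range length in the low 14 bits), the helper `pack`, the constant `table_upper_bound`, the function name `estimated_width`, and the four-entry table are assumptions made for illustration and are not the libc++ implementation or its generated data.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical packing helper: the upper bits hold the first code point of a
// range, the low 14 bits hold the distance to the last code point.
constexpr std::uint32_t pack(std::uint32_t first, std::uint32_t last) {
  return (first << 14) | (last - first);
}

// Tiny illustrative subset of wide ranges, sorted by first code point.
// This is NOT the real table.
constexpr std::uint32_t entries[] = {
    pack(0x1100, 0x115F),   // Hangul Jamo
    pack(0x3041, 0x3096),   // Hiragana
    pack(0xAC00, 0xD7A3),   // Hangul syllables
    pack(0x1F600, 0x1F64F), // Emoticons
};

// Last code point covered by the illustrative table above.
constexpr std::uint32_t table_upper_bound = 0x1F64F;

inline int estimated_width(char32_t code_point) {
  // Guard: anything past the table is narrow. This also keeps the shift
  // below from overflowing 32 bits.
  if (code_point > table_upper_bound)
    return 1;

  // The early exit from the commit message: code points below the first
  // range (ASCII, Latin, Cyrillic, ...) skip the binary search entirely.
  if (code_point < (entries[0] >> 14))
    return 1;

  // Find the last entry whose first code point is <= code_point. Setting
  // the low 14 bits makes the key compare greater than any entry that
  // starts at code_point.
  const std::uint32_t key =
      (static_cast<std::uint32_t>(code_point) << 14) | 0x3FFFu;
  const std::uint32_t* it =
      std::upper_bound(std::begin(entries), std::end(entries), key);
  if (it == std::begin(entries))
    return 1;
  --it;

  const std::uint32_t first = *it >> 14;
  const std::uint32_t last = first + (*it & 0x3FFFu);
  return code_point <= last ? 2 : 1;
}
```

Under these assumptions, `estimated_width(U'a')` and any Cyrillic letter return 1 through the early exit without touching the search, which would be consistent with the Cyrillic benchmark gaining the most, while a code point such as U+3042 goes through the binary search and returns 2.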
2.3 KiB
Number | Name | Standard | Assignee | Status | First released version |
---|---|---|---|---|---|
`P0645 <https://wg21.link/P0645>`_ | Text Formatting | C++20 | Mark de Wever | |Complete| | Clang 14 |
`P1652 <https://wg21.link/P1652>`_ | Printf corner cases in std::format | C++20 | Mark de Wever | |Complete| | Clang 14 |
`P1892 <https://wg21.link/P1892>`_ | Extended locale-specific presentation specifiers for std::format | C++20 | Mark de Wever | |Complete| | Clang 14 |
`P1868 <https://wg21.link/P1868>`_ | width: clarifying units of width and precision in std::format (Implements the unicode support.) | C++20 | Mark de Wever | |Complete| | Clang 14 |
`P2216 <https://wg21.link/P2216>`_ | std::format improvements | C++20 | Mark de Wever | |Complete| | Clang 15 |
`P2418 <https://wg21.link/P2418>`__ | Add support for ``std::generator``-like types to ``std::format`` | C++20 | Mark de Wever | |Complete| | Clang 15 |
`P2093R14 <https://wg21.link/P2093R14>`__ | Formatted output | C++23 | Mark de Wever | |In Progress| | |
`P2286R8 <https://wg21.link/P2286R8>`__ | Formatting Ranges | C++23 | Mark de Wever | |Complete| | Clang 16 |
`P2508R1 <https://wg21.link/P2508R1>`__ | Exposing ``std::basic-format-string`` | C++23 | Mark de Wever | |Complete| | Clang 15 |
`P2585R0 <https://wg21.link/P2585R0>`__ | Improving default container formatting | C++23 | Mark de Wever | |Complete| | Clang 17 |
`P2675R1 <https://wg21.link/P2675R1>`__ | ``format``'s width estimation is too approximate and not forward compatible | C++23 | Mark de Wever | |Complete| | Clang 17 |
`P1361 <https://wg21.link/P1361>`_ | Integration of chrono with text formatting | C++20 | Mark de Wever | |In Progress| | |
`P2372 <https://wg21.link/P2372>`__ | Fixing locale handling in chrono formatters | C++20 | Mark de Wever | |In Progress| | |