# llvm-project/libcxx/utils/CMakeLists.txt

add_custom_target(libcxx-generate-feature-test-macros
    COMMAND "${Python3_EXECUTABLE}" "${LIBCXX_SOURCE_DIR}/utils/generate_feature_test_macro_components.py"
    COMMENT "Generate the <version> header and tests for feature test macros.")

add_custom_target(libcxx-generate-std-cppm-in-file
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_libcxx_cppm_in.py"
        "std"
    COMMENT "Generate the std.cppm.in file")

add_custom_target(libcxx-generate-std-compat-cppm-in-file
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_libcxx_cppm_in.py"
        "std.compat"
    COMMENT "Generate the std.compat.cppm.in file")

add_custom_target(libcxx-generate-extended-grapheme-cluster-tables
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_extended_grapheme_cluster_table.py"
        "${LIBCXX_SOURCE_DIR}/include/__format/extended_grapheme_cluster_table.h"
    COMMENT "Generate the extended grapheme cluster header.")

add_custom_target(libcxx-generate-extended-grapheme-cluster-tests
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_extended_grapheme_cluster_test.py"
        "${LIBCXX_SOURCE_DIR}/test/libcxx/utilities/format/format.string/format.string.std/extended_grapheme_cluster.h"
    COMMENT "Generate the extended grapheme cluster test header.")

add_custom_target(libcxx-generate-escaped-output-table
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_escaped_output_table.py"
        "${LIBCXX_SOURCE_DIR}/include/__format/escaped_output_table.h"
    COMMENT "Generate the escaped output header")

add_custom_target(libcxx-generate-width-estimation-table
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_width_estimation_table.py"
        "${LIBCXX_SOURCE_DIR}/include/__format/width_estimation_table.h"
    COMMENT "Generate the width estimation header")

add_custom_target(libcxx-indic-conjunct-break-table
    COMMAND
        "${Python3_EXECUTABLE}"
        "${LIBCXX_SOURCE_DIR}/utils/generate_indic_conjunct_break_table.py"
        "${LIBCXX_SOURCE_DIR}/include/__format/indic_conjunct_break_table.h"
    COMMENT "Generate the Indic Conjunct Break header")

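# Umbrella target: runs every generator target defined in this file so that
# contributors can refresh all auto-generated files in one step.
add_custom_target(libcxx-generate-files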
    DEPENDS libcxx-generate-feature-test-macros
            libcxx-generate-std-cppm-in-file
            libcxx-generate-std-compat-cppm-in-file
            libcxx-generate-extended-grapheme-cluster-tables
            libcxx-generate-extended-grapheme-cluster-tests
            libcxx-generate-escaped-output-table
            libcxx-generate-width-estimation-table
            libcxx-indic-conjunct-break-table
    COMMENT "Create all the auto-generated files in libc++ and its tests.")
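
# Usage sketch (an assumption about the local setup, not part of the upstream
# file): with an already-configured build tree in a directory named `build`,
# all of the generators can be run through CMake's generic build driver:
#
#   cmake --build build --target libcxx-generate-files
#
# A single generator can be run the same way by naming its target instead,
# for example `--target libcxx-generate-width-estimation-table`. The
# generators are passed paths under LIBCXX_SOURCE_DIR, so expect
# modifications in the source tree rather than in the build directory.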