Locale objects use atomic reference counting, which may be very
expensive in parallel applications. The classic locale is used by
default by all streams and can be very contended. But it's never
destroyed, so the reference counting is also completely pointless on the
classic locale. Currently ~70% of time in the parallel stringstream
benchmarks is spent in locale ctor/dtor. And the execution radically
slows down with more threads.
Avoid reference counting on the classic locale. With this change
parallel benchmarks start to scale with threads.
This is a re-application of f8afc53d641c (aka PR #72112) which was
reverted in 4e0c48b907f1 because it broke the sanitizer builds due
to an initialization order fiasco. This issue has now been fixed by
ensuring that the locale is constinit'ed.
Co-authored-by: Dmitry Vyukov <dvyukov@google.com>
This commit does a pass of clang-format over files in libc++ that
don't require major changes to conform to our style guide, or for
which we're not overly concerned about conflicting with in-flight
patches or hindering the git blame.
This roughly covers:
- benchmarks
- range algorithms
- concepts
- type traits
I did a manual verification of all the changes, and in particular I
applied clang-format on/off annotations in a few places where the
result was less readable after than before. This was not necessary
in a lot of places, however I did find that clang-format had pretty
bad taste when it comes to formatting concepts.
Differential Revision: https://reviews.llvm.org/D153140
The function num_get<_CharT>::stage2_int_prep makes unnecessary copy of src
into atoms when char_type is char. This can be avoided by creating
a switch on type and just returning __src when char_type is char.
Added the test case to demonstrate performance improvement.
In order to avoid ABI incompatibilities, the changes are guarded
with a macro _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
Differential Revision: https://reviews.llvm.org/D30268
Reviewed by: EricWF
llvm-svn: 305427