7 Commits

Author SHA1 Message Date
Louis Dionne
1a5af34e6f
[libc++] Speed up classic locale (take 2) (#73533)
Locale objects use atomic reference counting, which may be very
expensive in parallel applications. The classic locale is used by
default by all streams and can be very contended. But it's never
destroyed, so the reference counting is also completely pointless on the
classic locale. Currently ~70% of time in the parallel stringstream
benchmarks is spent in locale ctor/dtor. And the execution radically
slows down with more threads.

Avoid reference counting on the classic locale. With this change
parallel benchmarks start to scale with threads.

This is a re-application of f8afc53d641c (aka PR #72112) which was
reverted in 4e0c48b907f1 because it broke the sanitizer builds due
to an initialization order fiasco. This issue has now been fixed by
ensuring that the locale is constinit'ed.

Co-authored-by: Dmitry Vyukov <dvyukov@google.com>
2023-11-29 09:31:05 -05:00
Kirill Stoimenov
4e0c48b907 Revert "[libc++] Speed up classic locale (#72112)"
Looks like it broke the ASAN build: https://lab.llvm.org/buildbot/#/builders/168/builds/17053/steps/9/logs/stdio

This reverts commit f8afc53d641ce9d4ad8565aae9e7b5911b572a02.
2023-11-27 20:21:58 +00:00
Dmitry Vyukov
f8afc53d64
[libc++] Speed up classic locale (#72112)
Locale objects use atomic reference counting, which may be very
expensive in parallel applications. The classic locale is used by
default by all streams and can be very contended. But it's never
destroyed, so the reference counting is also completely pointless on the
classic locale. Currently ~70% of time in the parallel stringstream
benchmarks is spent in locale ctor/dtor. And the execution radically
slows down with more threads.

Avoid reference counting on the classic locale. With this change
parallel benchmarks start to scale with threads.

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>

```
                              │   baseline   │    optimized                            │
                              │    sec/op    │    sec/op      vs base                  │
Istream_numbers/0/threads:1      4.672µ ± 0%   4.419µ ± 0%     -5.42% (p=0.000 n=30+39)
Istream_numbers/0/threads:72   539.817µ ± 0%   9.842µ ± 1%    -98.18% (p=0.000 n=30+40)
Istream_numbers/1/threads:1      4.890µ ± 0%   4.750µ ± 0%     -2.85% (p=0.000 n=30+40)
Istream_numbers/1/threads:72     66.44µ ± 1%   10.14µ ± 1%    -84.74% (p=0.000 n=30+40)
Istream_numbers/2/threads:1      4.888µ ± 0%   4.746µ ± 0%     -2.92% (p=0.000 n=30+40)
Istream_numbers/2/threads:72     494.8µ ± 0%   410.2µ ± 1%    -17.11% (p=0.000 n=30+40)
Istream_numbers/3/threads:1      4.697µ ± 0%   4.695µ ± 5%          ~ (p=0.391 n=30+37)
Istream_numbers/3/threads:72     421.5µ ± 7%   421.9µ ± 9%          ~ (p=0.665 n=30)
Ostream_number/0/threads:1       183.0n ± 0%   141.0n ± 2%    -22.95% (p=0.000 n=30)
Ostream_number/0/threads:72    24196.5n ± 1%   343.5n ± 3%    -98.58% (p=0.000 n=30)
Ostream_number/1/threads:1       250.0n ± 0%   196.0n ± 2%    -21.60% (p=0.000 n=30)
Ostream_number/1/threads:72    16260.5n ± 0%   407.0n ± 2%    -97.50% (p=0.000 n=30)
Ostream_number/2/threads:1       254.0n ± 0%   196.0n ± 1%    -22.83% (p=0.000 n=30)
Ostream_number/2/threads:72      28.49µ ± 1%   18.89µ ± 5%    -33.72% (p=0.000 n=30)
Ostream_number/3/threads:1       185.0n ± 0%   185.0n ± 0%      0.00% (p=0.017 n=30)
Ostream_number/3/threads:72      19.38µ ± 4%   19.33µ ± 5%          ~ (p=0.425 n=30)
```
2023-11-27 07:00:21 +01:00
Louis Dionne
5aa03b648b [libc++][NFC] Apply clang-format on large parts of the code base
This commit does a pass of clang-format over files in libc++ that
don't require major changes to conform to our style guide, or for
which we're not overly concerned about conflicting with in-flight
patches or hindering the git blame.

This roughly covers:
- benchmarks
- range algorithms
- concepts
- type traits

I did a manual verification of all the changes, and in particular I
applied clang-format on/off annotations in a few places where the
result was less readable after than before. This was not necessary
in a lot of places, however I did find that clang-format had pretty
bad taste when it comes to formatting concepts.

Differential Revision: https://reviews.llvm.org/D153140
2023-06-19 11:19:51 -04:00
Eric Fiselier
fca28db904 Add test macros for always_inline and noinline
llvm-svn: 344167
2018-10-10 18:22:23 +00:00
Eric Fiselier
1903976d37 Update Google Benchmark library
llvm-svn: 322812
2018-01-18 04:23:01 +00:00
Aditya Kumar
38bc3df8a3 [locale] Avoid copy of __atoms when char_type is char
The function num_get<_CharT>::stage2_int_prep makes unnecessary copy of src
into atoms when char_type is char. This can be avoided by creating
a switch on type and just returning __src when char_type is char.

Added the test case to demonstrate performance improvement.
In order to avoid ABI incompatibilities, the changes are guarded
with a macro _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET

Differential Revision: https://reviews.llvm.org/D30268
Reviewed by: EricWF

llvm-svn: 305427
2017-06-14 23:17:45 +00:00