This patch documents the underlying API for implementing atomics on a
platform.
This doesn't change the operations that std::atomic is based on, but it
reorganizes the C11 / GCC implementation split to make it clearer what's
the base support layer and what's not.
Now that we've dropped support for older C++ dialects in the
synchronization library, we can use lambdas to clarify some of the code
used to implement atomic_wait.
The __atomic_base base class is only useful to conditionalize the
operations we provide inside std::atomic. It shouldn't be used directly
from other places in the library which can use std::atomic directly
instead.
Since we've granularized our includes, using std::atomic directly should
not make much of a difference compile-time wise.
This patch starts using std::atomic directly from other classes like
std::barrier and std::latch. Changing this shouldn't be an ABI break
since both classes have the same size and layout.
The benefits of this patch are isolating other parts of the code base
from implementation details of std::atomic and simplifying the mental
model for std::atomic's layers of implementation by making it clear that
__atomic_base is only an implementation detail of std::atomic.
Currently, the library-internal feature test macros are only defined if
the feature is not available, and always have the prefix
`_LIBCPP_HAS_NO_`. This patch changes that, so that they are always
defined and have the prefix `_LIBCPP_HAS_` instead. This changes the
canonical use of these macros to `#if _LIBCPP_HAS_FEATURE`, which means
that using an undefined macro (e.g. due to a missing include) is
diagnosed now. While this is rather unlikely currently, a similar change
in `<__configuration/availability.h>` caught a few bugs. This also
improves readability, since it removes the double-negation of `#ifndef
_LIBCPP_HAS_NO_FEATURE`.
The current patch only touches the macros defined in `<__config>`. If
people are happy with this approach, I'll make a follow-up PR to also
change the macros defined in `<__config_site>`.
This patch adds a large number of missing includes in the libc++ headers
and the test suite. Those were found as part of the effort to move
towards a mostly monolithic top-level std module.
Many headers include `<cstddef>` just for size_t, and pulling in
additional content (e.g. the traits used for std::byte) is unnecessary.
To solve this problem, this patch splits up `<cstddef>` into
subcomponents so that headers can include only the parts that they
actually require.
This has the added benefit of making the modules build a lot stricter
with respect to IWYU, and also providing a canonical location where we
define `std::size_t` and friends (which were previously defined in
multiple headers like `<cstddef>` and `<ctime>`).
After this patch, there's still many places in the codebase where we
include `<cstddef>` when `<__cstddef/size_t.h>` would be sufficient.
This patch focuses on removing `<cstddef>` includes from __type_traits
to make these headers non-circular with `<cstddef>`. Additional
refactorings can be tackled separately.
This patch increases the alignment requirement for std::atomic_ref
such that we can guarantee lockfree operations more often. Specifically,
we require types that are 1, 2, 4, 8, or 16 bytes in size to be aligned
to at least their size to be used with std::atomic_ref.
This is the case for most types, however a notable exception is
`long long` on x86, which is 8 bytes in length but has an alignment
of 4.
As a result of this patch, one has to be more careful about the
alignment of objects used with std::atomic_ref. Failure to provide
a properly-aligned object to std::atomic_ref is a precondition
violation and is technically UB. On the flipside, this allows us
to provide an atomic_ref that is actually lockfree more often,
which is an important QOI property.
More information in the discussion at https://github.com/llvm/llvm-project/pull/99570#issuecomment-2237668661.
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
When we initially implemented the C++20 synchronization library, we
reluctantly accepted for the implementation to be backported to C++03
upon request from the person who provided the patch. This was when we
were only starting to have experience with the issues this can create,
so we flinched. Nowadays, we have a much stricter stance about not
backporting features to previous standards.
We have recently started fixing several bugs (and near bugs) in our
implementation of the synchronization library. A recurring theme during
these reviews has been how difficult to understand the current code is,
and upon inspection it becomes clear that being able to use a few recent
C++ features (in particular lambdas) would help a great deal. The code
would still be pretty intricate, but it would be a lot easier to reason
about the flow of callbacks through things like
__thread_poll_with_backoff.
As a result, this patch drops support for the synchronization library
before C++20. This makes us more strictly conforming and opens the door
to major simplifications, in particular around atomic_wait which was
supported all the way to C++03.
This change will probably have some impact on downstream users, however
since the C++20 synchronization library was added only in LLVM 10 (~3
years ago) and it's quite a niche feature, the set of people trying to
use this part of the library before C++20 should be reasonably small.
The builtin __atomic_always_lock_free takes into account the type of the
pointer provided as the second argument. Because we were passing void*,
rather than T*, the calculation failed. This meant that
atomic_ref<T>::is_always_lock_free was only true for char & bool.
This bug exists elsewhere in the atomic library (when using GCC, we fail
to pass a pointer at all, and we fail to correctly align the atomic like
_Atomic would).
This change also attempts to start sorting out testing difficulties with
this function that caused the bug to exist by using the
__GCC_ATOMIC_(CHAR|SHORT|INT|LONG|LLONG|POINTER)_IS_LOCK_FREE predefined
macros to establish an expected value for `is_always_lock_free` and
`is_lock_free` for the respective types, as well as types with matching
sizes and compatible alignment values.
Using these compiler pre-defines we can actually validate that certain
types, like char and int, are actually always lock free like they are on
every platform in the wild.
Note that this patch was actually authored by Eric Fiselier but I picked
up the patch and GitHub won't let me set Eric as the primary author.
Co-authored-by: Eric Fiselier <eric@efcs.ca>
The GCC build has gotten to the point where it's often hard to find the
actual error in the build log. We should look into enabling these
warnings again in the future, but it looks like a lot of them are
bogous.
As time went by, a few files have become mis-formatted w.r.t.
clang-format. This was made worse by the fact that formatting was not
being enforced in extensionless headers. This commit simply brings all
of libcxx/include in-line with clang-format again.
We might have to do this from time to time as we update our clang-format
version, but frankly this is really low effort now that we've formatted
everything once.
These were required a long time ago due to `static_assert` not actually
being available in C++03. Now `static_assert` is simply mapped to
`_Static_assert` in C++03, making the additional parens unnecessary.
In essence, this header has always been related to configuration of
the library but we didn't want to put it inside <__config> due to
complexity reasons. Now that we have sub-headers in <__config>, we
can move <__availability> to it and stop including it everywhere since
we already obtain the required macros via <__config>.
When we initially implemented the C++20 synchronization library, we
reluctantly accepted for the implementation to be backported to C++03
upon request from the person who provided the patch. This was when we
were only starting to have experience with the issues this can create,
so we flinched. Nowadays, we have a much stricter stance about not
backporting features to previous standards.
We have recently started fixing several bugs (and near bugs) in our
implementation of the synchronization library. A recurring theme during
these reviews has been how difficult to understand the current code is,
and upon inspection it becomes clear that being able to use a few recent
C++ features (in particular lambdas) would help a great deal. The code
would still be pretty intricate, but it would be a lot easier to reason
about the flow of callbacks through things like
__thread_poll_with_backoff.
As a result, this patch deprecates support for the synchronization
library before C++20. In the next release, we can remove that support
entirely.
These headers have become very small by using compiler builtins, often
containing only two declarations. This merges these headers, since
there doesn't seem to be much of a benefit keeping them separate.
Specifically, `is_{,_nothrow,_trivially}{assignable,constructible}` are
kept and the `copy`, `move` and `default` versions of these type traits
are moved in to the respective headers.
The goal of this patch is to make `atomic`'s wait functions to be
reusable by `atomic_ref`.
https://github.com/llvm/llvm-project/pull/76647
First, this patch is built on top of
https://github.com/llvm/llvm-project/pull/80596 , to reduce the future
merge conflicts.
This patch made the following functions as "API"s to be used by
`atomic`, `atomic_flag`, `semaphore`, `latch`, `atomic_ref`
```
__atomic_wait
__atomic_wait_unless
__atomic_notify_one
__atomic_notify_all
```
These functions are made generic to support `atomic` type and
`atomic_ref`. There are two customisation points.
```
// How to load the value from the given type (with a memory order)
__atomic_load
```
```
// what is the contention address that the platform `wait` function is going to monitor
__atomic_contention_address
```
For `atomic_ref` (not implemented in this patch), the `load` and
`address` function will be different, because
- it does not use the "atomic abstraction layer" so the `load` operation
will be some gcc builtin
- the contention address will be the user's actual type that the
`atomic_ref` is pointing to
Originally, we used __libcpp_verbose_abort to handle assertion failures.
That function was declared from all public headers. Since we don't use
that mechanism anymore, we don't need to declare __libcpp_verbose_abort
from all public headers, and we can clean up a lot of unnecessary
includes.
This patch also moves the definition of the various assertion categories
to the <__assert> header, since we now rely on regular IWYU for these
assertion macros.
rdar://105510916
As discussed in #76647, _LIBCPP_ATOMIC_ONLY_USE_BUILTINS is a
questionable configuration option. It makes our implementation of
std::atomic even more complicated than it already is for a limited
benefit.
Indeed, the original goal of that setting was to decouple libc++ from
libraries like compiler-rt and libatomic in Freestanding mode. We didn't
have a clear understanding of goals and non-goals of Freestanding back
then, but nowadays we do have a better understanding that removing all
dependencies of libc++ in Freestanding is a non-goal. We should still be
able to depend on builtins like those defined in compiler-rt for
implementing our atomic operations in Freestanding. Freestanding means
that there is no underlying operating system, not that there is no
toolchain available.
This patch removes the configuration option. This should have a very
limited fallout since that configuration was only enabled with
-ffreestanding, and libc++ basically doesn't work out of the box on
Freestanding platforms today.
The benefits are a slightly simpler implementation of std::atomic,
getting rid of one of the ABI-incompatible representations of
std::atomic, and clearing the way for proper Freestanding support to
eventually land in the library.
Fixes#81286
This is a follow-up PR to
<https://github.com/llvm/llvm-project/pull/79265>. It aims to be a
gentle refactoring of the `__cxx_atomic_wait` function that takes a
predicate.
The key idea here is that this function's signature is changed to look
like this (`std::function` used just for clarity):
```c++
__cxx_atomic_wait_fn(Atp*, std::function<bool(Tp &)> poll, memory_order __order);
```
...where `Tp` is the corresponding `value_type` to the atomic variable
type `Atp`. The function's semantics are similar to `atomic`s `.wait()`,
but instead of having a hardcoded predicate (is the loaded value unequal
to `old`?) the predicate is specified explicitly.
The `poll` function may change its argument, and it is very important
that if it returns `false`, it leaves its current understanding of the
atomic's value in the argument. Internally, `__cxx_atomic_wait_fn`
dispatches to two waiting mechanisms, depending on the type of the
atomic variable:
1. If the atomic variable can be waited on directly (for example,
Linux's futex mechanism only supports waiting on 32 bit long variables),
the value of the atomic variable (which `poll` made its decision on) is
then given to the underlying system wait function (e.g. futex).
2. If the atomic variable can not be waited on directly, there is a
global pool of atomics that are used for this task. The ["eventcount"
pattern](<https://gist.github.com/mratsim/04a29bdd98d6295acda4d0677c4d0041>)
is employed to make this possible.
The eventcount pattern needs a "monitor" variable which is read before
the condition is checked another time. libcxx has the
`__libcpp_atomic_monitor` function for this. However, this function only
has to be called in case "2", i.e. when the eventcount is actually used.
In case "1", the futex is used directly, so the monitor must be the
value of the atomic variable that the `poll` function made its decision
on to continue blocking. Previously, `__libcpp_atomic_monitor` was
_also_ used in case "1". This was the source of the ABA style bug that
PR#79265 fixed.
However, the solution in PR#79265 has some disadvantages:
- It exposes internals such as `cxx_contention_t` or the fact that
`__libcpp_thread_poll_with_backoff` needs two functions to higher level
constructs such as `semaphore`.
- It doesn't prevent consumers calling `__cxx_atomic_wait` in an error
prone way, i.e. by providing to it a predicate that doesn't take an
argument. This makes ABA style issues more likely to appear.
Now, `__cxx_atomic_wait_fn` takes just _one_ function, which is then
transformed into the `poll` and `backoff` callables needed by
`__libcpp_thread_poll_with_backoff`.
Aside from the `__cxx_atomic_wait` changes, the only other change is the
weakening of the initial atomic load of `semaphore`'s `try_acquire` into
`memory_order_relaxed` and the CAS inside the loop is changed from
`strong` to `weak`. Both weakenings should be fine, since the CAS is
called in a loop, and the "acquire" semantics of `try_acquire` come from
the CAS, not from the initial load.
The <__threading_support> header is a huge beast and it's really
difficult to navigate. I find myself struggling to find what I want
every time I have to open it, and I've been considering splitting it up
for years for that reason.
This patch aims not to contain any functional change. The various
implementations of the threading base are simply moved to separate
headers and then the individual headers are simplified in mechanical
ways. For example, we used to have redundant declarations of all the
functions at the top of `__threading_support`, and those are removed
since they are not needed anymore. The various #ifdefs are also
simplified and removed when they become unnecessary.
Finally, this patch adds documentation for the API we expect from any
threading implementation.
Clang-tidy 18 no longer has false positives with the spaceship operator.
Note that I'm quite sure there are more occurrences in our headers that
are not caught.
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.
This patch was generated with:
find libcxx/include libcxx/src -type f \
| grep -v 'module.modulemap.in' \
| grep -v 'CMakeLists.txt' \
| grep -v 'README.txt' \
| grep -v 'libcxx.imp' \
| grep -v '__config_site.in' \
| xargs clang-format -i
A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.
[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
Their definition was a bit roundabout and it was actually wrong since
atomic_unsigned_lock_free would be a signed type whenever
__cxx_contention_t is lock free, which is most of the time.
Fixes#72968
std::atomic is implemented with the following (confusing!) hierarchy of
types:
std::atomic<T> : std::__atomic_base<T> { ... };
std::__atomic_base<T> {
std::__cxx_atomic_impl<T> __impl;
};
std::__cxx_atomic_impl<T> {
_Atomic(T) __val;
};
Inside std::__atomic_base, we implement the is_lock_free() and
is_always_lock_free() functions. However, we used to implement them
inconsistently:
- is_always_lock_free() is based on whether __cxx_atomic_impl<T> is
always lock free (using the builtin), which means that we include any
potential padding added by _Atomic(T) into the determination.
- is_lock_free() was based on whether T is lock free (using the
builtin), which meant that we did not take into account any potential
padding added by _Atomic(T).
It is important to note that the padding added by _Atomic(T) can turn a
type that wouldn't be lock free into a lock free type, for example by
making its size become a power of two.
The inconsistency of how the two functions were implemented could lead
to cases where is_always_lock_free() would return true, but
is_lock_free() would then return false. This is the case for example of
the following type, which is always lock free on arm64 but was
incorrectly reported as !is_lock_free() before this patch:
struct Foo { float x[3]; };
This patch switches the determination of is_lock_free() to be based on
__cxx_atomic_impl<T> instead to match how we determine
is_always_lock_free().
rdar://115324353
This brings most of the enable_ifs in libc++ to the same style. It also has the nice side-effect of reducing the size of names of these symbols, since the depedent return type is shorter.
Reviewed By: #libc, ldionne
Spies: ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D157736
We already have a clang-tidy check for making sure that `_LIBCPP_HIDE_FROM_ABI` is on free functions. This patch extends this to class members. The places where we don't check for `_LIBCPP_HIDE_FROM_ABI` are classes for which we have an instantiation in the library.
Reviewed By: ldionne, Mordante, #libc
Spies: jplehr, mikhail.ramalho, sstefan1, libcxx-commits, krytarowski, miyuki, smeenai
Differential Revision: https://reviews.llvm.org/D142332
This uses std::addressof everywherein atomic. This is not strictly
needed for the integral and floating point specializations. They should
not be used by user defined types. But it's easier to fix everything.
Note these changes are made using a WIP clang-tidy plugin.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D144786