llvm-project/clang/test/CodeGenCoroutines/coro-always-inline.cpp
fpasserby f786881340
[coroutine] Implement llvm.coro.await.suspend intrinsic (#79712)
Implement `llvm.coro.await.suspend` intrinsics, to deal with performance
regression after prohibiting `.await_suspend` inlining, as suggested in
#64945.
Actually, there are three new intrinsics, which directly correspond to
each of three forms of `await_suspend`:
```
void llvm.coro.await.suspend.void(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
```
There are three different versions instead of one, because in `bool`
case it's result is used for resuming via a branch, and in
`coroutine_handle` case exceptions from `await_suspend` are handled in
the coroutine, and exceptions from the subsequent `.resume()` are
propagated to the caller.

Await-suspend block is simplified down to intrinsic calls only, for
example for symmetric transfer:
```
%id = call token @llvm.coro.save(ptr null)
%handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
call void @llvm.coro.resume(%handle)
%result = call i8 @llvm.coro.suspend(token %id, i1 false)
switch i8 %result, ...
```
All await-suspend logic is moved out into a wrapper function, generated
for each suspension point.
The signature of the function is `<type> wrapperFunction(ptr %awaiter,
ptr %frame)` where `<type>` is one of `void` `i1` or `ptr`, depending on
the return type of `await_suspend`.
Intrinsic calls are lowered during `CoroSplit` pass, right after the
split.

Because I'm new to LLVM, I'm not sure if the helper function generation,
calls to them and lowering are implemented in the right way, especially
with regard to various metadata and attributes, i. e. for TBAA. All
things that seemed questionable are marked with `FIXME` comments.

There is another detail: in case of symmetric transfer raw pointer to
the frame of coroutine, that should be resumed, is returned from the
helper function and a direct call to `@llvm.coro.resume` is generated.
C++ standard demands, that `.resume()` method is evaluated. Not sure how
important is this, because code has been generated in the same way
before, sans helper function.
2024-03-11 10:00:00 +08:00

51 lines
1.3 KiB
C++

// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -std=c++20 \
// RUN: -O0 %s -o - | FileCheck %s
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -std=c++20 \
// RUN: -fno-inline -O0 %s -o - | FileCheck %s
namespace std {
struct handle {};
struct awaitable {
bool await_ready() noexcept { return true; }
// CHECK-NOT: await_suspend
inline void __attribute__((__always_inline__)) await_suspend(handle) noexcept {}
bool await_resume() noexcept { return true; }
};
template <typename T>
struct coroutine_handle {
static handle from_address(void *address) noexcept { return {}; }
};
template <typename T = void>
struct coroutine_traits {
struct promise_type {
awaitable initial_suspend() { return {}; }
awaitable final_suspend() noexcept { return {}; }
void return_void() {}
T get_return_object() { return T(); }
void unhandled_exception() {}
};
};
} // namespace std
// CHECK-LABEL: @_Z3foov
// CHECK-LABEL: entry:
// CHECK: %ref.tmp.reload.addr = getelementptr
// CHECK: %ref.tmp3.reload.addr = getelementptr
void foo() { co_return; }
// Check that bar is not inlined even it's marked as always_inline.
// CHECK-LABEL: define {{.*}} void @_Z3bazv()
// CHECK: call void @_Z3barv(
__attribute__((__always_inline__)) void bar() {
co_return;
}
void baz() {
bar();
co_return;
}