In Rust, asynchronous functions do not require allocating on the heap. Function async
returns a compiler-generated structure on which you call a compiler generated poll
method. The design of async looks logical and clear.
The design of coroutines in C++ is strange. It forces you to do allocation on the heap, because you have no other way to return something using co_return
.
(It is possible to make a custom allocator that will make allocations in a buffer on the stack, but this will unnecessarily complicate the code.)
Why was it decided in the design of C++ that an object returned by a coroutine must have a promise_type
?
Why is await_ready
, await_suspend
, await_resume
not enough?
(It looks strange, and this is what enforces you to do allocation; you can't just directly construct a SomeTask<T>
object (with three await_*
methods) and return it.)
How can we make a zero-alloc coroutines without a custom allocator?
Essentially it forces you to do allocation on the heap, because you have no other way to return something using
co_return
That's not the reason for the allocation.
Coroutines require storage to do their job. Not just the marshaling of return values, but for tracking the entire thing. The coroutine's stack, the thing that has to be paused and resumed, is part of this.
The whole point of a coroutine is that it can suspend execution and have that execution resumed by somebody. Who that somebody is is not known statically. And that is the reason why dynamic allocation is needed; coroutine lifetimes are not limited by the scope of their initial invoker.
If function X invokes coroutine A, and coroutine A suspends itself pending some async request, function X may just return the coroutine to someone else who will wait on it. This means that function X's call stack is gone. If coroutine A lived on function X's stack, its storage is gone now.
That's bad.
C++ stores coroutine A in dynamically-allocated memory to allow it to survive its calling invocation. This is the presumed default case of coroutines.
Presumably, the circumstances you are describing are those where function X itself is a coroutine that awaits on A, or otherwise will not leave its own scope until A is finished. This would allow the coroutine state of A to live on X's stack without problems. C++ compilers are permitted to optimize these conditions if they are detected; they can elide the dynamic allocation.
But otherwise, that dynamic allocation is necessary.
I don't know Rust. So I don't know what Rust is doing with its async
functions. Maybe its async functions are unable to persist beyond the boundaries of their calling scope. Maybe Rust is better able to detect when the async function needs to be able to live beyond its caller's scope. Maybe you have to explicitly tell Rust when you want a coroutine to escape its caller's scope. I don't know.
But C++ isn't like that. C++ coroutines are presumed to leave the scope of their invoking function.