
C++ coroutine destruction order


This article presents this pseudocode for how the compiler transforms a coroutine function:

ReturnType someCoroutine(Parameters parameters)
{
    auto* frame = new coroutineFrame(std::forward<Parameters>(parameters));
    auto returnObject = frame->promise.get_return_object();
    co_await frame->promise.initial_suspend();
    try
    {
        <body-statements>
    }
    catch (...)
    {
        frame->promise.unhandled_exception();
    }
    co_await frame->promise.final_suspend();
    delete frame;
    return returnObject;
}

However, this pseudocode implies that if a coroutine type returns a continuation from final_suspend, the memory of the coroutine that just finished is not freed until after the continuation runs, which seems less than ideal. When I run this minimal example, the deallocation seems to happen before that:

#include <concepts>
#include <coroutine>
#include <cstdlib>   // abort, malloc
#include <exception>
#include <iostream>
#include <utility>   // std::move
#include <vector>

template<class T = void>
class Future;

template<>
class Future<void>
{
public:
    class Awaiter;

    class promise_type {
    private:
        std::exception_ptr _exception;
        std::coroutine_handle<> _continuation;

        friend class Awaiter;

    public:
        Future<void> get_return_object() { return {HandleT::from_promise(*this)}; }

        static Future<void> get_return_object_on_allocation_failure() {
            abort();
        }

        // Futures are lazy, they don't do anything until they are
        // awaited or a task is spawned that takes one.
        std::suspend_always initial_suspend() { return {}; }

        // Always cleanup right away. Maybe not the right choice?
        std::suspend_never final_suspend() noexcept {
            std::cerr << "final_suspend this=" << this << std::endl;
            return {};
        }

        // Need this definition to avoid UB
        void return_void() {
            std::cerr << "returning void this=" << this << std::endl;
        }

        // futures are not generators. Also note there is no
        // `yield_void`, standard is inconsistent.
        std::suspend_always yield_value(...) = delete;

        void unhandled_exception() {
            _exception = std::current_exception();
        }

        void* operator new(std::size_t size) noexcept
        {
            auto p = malloc(size);
            std::cerr << "new p=" << p << std::endl;
            return p;
        }

        void operator delete(void* p, std::size_t size) noexcept
        {
            std::cerr << "delete p=" << p << std::endl;
            free(p);  // release the memory obtained from malloc in operator new
        }
    };

    struct Awaiter {
    private:
        Future<void>& future;

    public:
        explicit Awaiter(Future<void>& future)
            : future(future)
        {}

        // this will check if the future is already done, e.g. if it
        // was `co_await`ed previously
        bool await_ready() const noexcept { return future._done; }

        // this coroutine is the one doing the `co_await`
        std::coroutine_handle<> await_suspend(std::coroutine_handle<> handle) noexcept
        {
            // remember we want to resume back into this caller later
            future._handle.promise()._continuation = handle;

            // For now let the coroutine associated with this future
            // run, via "symmetric transfer", which takes the caller
            // resume call off the stack and replaces it with this
            // coroutine.
            return future._handle;
        }

        // this runs in the caller's task when they wake back up from having resume called?
        void await_resume() const noexcept {
            future._done = true;
        }
    };

    auto operator co_await() {
        return Awaiter(*this);
    }

    void resume() {
        _handle.resume();
    }

protected:
    using HandleT = std::coroutine_handle<promise_type>;

    Future(HandleT&& p)
        : _handle(std::move(p))
    {}

    Future(const Future&) = delete;
    Future& operator=(const Future&) = delete;

    Future(Future&& other) = delete;
    Future& operator=(Future&& other) = delete;

    friend class Awaiter;

    bool _done = false;
    HandleT _handle;
    std::exception_ptr _exception = nullptr;
};

__attribute__((noinline)) Future<void> b()
{
    co_return;
}

__attribute__((noinline)) Future<void> a()
{
    co_await b();
}

int main()
{
    auto fut = a();
    std::cerr << "first resume" << std::endl;
    fut.resume();
    std::cerr << "second resume" << std::endl;
    fut.resume();
    std::cerr << "exiting main" << std::endl;
}

This gives the output below. The frame addresses printed by operator new and the promise addresses printed as `this` don't match exactly, but addresses that are only 16 bytes apart refer to the same coroutine: 0x6424502582b0 and 0x6424502582c0 belong to a, and 0x642450258310 and 0x642450258320 belong to b. So it appears that the inner coroutine, b, is freed before a:

g++-14 -std=c++23 -g3 -fcoroutines example.cpp && ./a.out
new p=0x6424502582b0
first resume
new p=0x642450258310
returning void this=0x642450258320
final_suspend this=0x642450258320
delete p=0x642450258310
second resume
returning void this=0x6424502582c0
final_suspend this=0x6424502582c0
delete p=0x6424502582b0
exiting main

Is this a compiler optimization, or is the pseudocode wrong, or am I causing UB that just happens to not crash? :)


Solution

  • "if your coroutine type returns a continuation from final_suspend that the memory from the coroutine that just finished is not freed until after the continuation runs"

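    This is a correct description of the behaviour specified by the C++ standard. I'm confused by your program, though: your coroutines don't use symmetric transfer at the final suspend point. Your final_suspend returns std::suspend_never, so each coroutine does not suspend there at all; its frame is destroyed as soon as it finishes, and the stored _continuation is never resumed. That is why b is freed before a in your output.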

    A coroutine that does use symmetric transfer at the final suspend point suspends there, and then the other coroutine resumes. A coroutine suspended at its final suspend point can no longer be resumed, so its frame is deallocated only when it is explicitly destroyed. Here's a program that illustrates this behaviour:

    #include <coroutine>
    #include <cstdlib>   // malloc, free
    #include <iostream>
    struct Task {
        struct promise_type {
            std::coroutine_handle<promise_type> continuation_;
            Task get_return_object() {
                return Task{*this};
            }
            std::suspend_always initial_suspend() { return {}; }
            struct FinalAwaitable {
                bool await_ready() noexcept { return false; }
                std::coroutine_handle<>
                await_suspend(std::coroutine_handle<promise_type> h) noexcept {
                    if (h.promise().continuation_) {
                        return h.promise().continuation_;
                    } else {
                        return std::noop_coroutine();
                    }
                }
                void await_resume() noexcept {}
            };
            FinalAwaitable final_suspend() noexcept { return {}; }
            void return_void() {}
            void unhandled_exception() {}
    
            void* operator new(std::size_t size) {
                void* result = malloc(size);
                std::cout << "allocating coro frame at " << result << '\n';
                return result;
            }
    
            void operator delete(void* p) {
                std::cout << "deallocating coro frame at " << p << '\n';
                free(p);
            }
    
            auto handle() {
                return std::coroutine_handle<promise_type>::from_promise(*this);
            }
        };
        promise_type& promise_;
    };
    Task f() {
        std::cout << "resuming f\n";
        co_return;
    }
    Task g() {
        co_return;
    }
    int main() {
        Task task1 = f();
        Task task2 = g();
        task2.promise_.continuation_ = task1.promise_.handle();
        task2.promise_.handle().resume();
        task1.promise_.handle().destroy();
        task2.promise_.handle().destroy();
    }
    

    Coroutine f is created first, followed by g; both begin suspended at their initial suspend point. Then g is configured to transfer control to f at its final suspend point. When g is resumed, it runs to its final suspend point, suspends there, and transfers control to f. Because f also suspends at its final suspend point, neither coroutine is deallocated until it is explicitly destroyed in main. The output may look something like this:

    allocating coro frame at 0x55cc5e934eb0
    allocating coro frame at 0x55cc5e9352f0
    resuming f
    deallocating coro frame at 0x55cc5e934eb0
    deallocating coro frame at 0x55cc5e9352f0
    

    From this we can see that coroutine g does not get destroyed prior to the symmetric transfer.