Search code examples
pythonpython-asynciocoroutine

Are Python coroutines stackless or stackful?


I've seen conflicting views on whether Python coroutines (I primarily mean async/await) are stackless or stackful.

Some sources say they're stackful:

While others seem to imply they're stackless, e.g. https://gamelisp.rs/reference/coroutines.html

GameLisp's coroutines follow the model set by Rust, Python, C# and C++. Our coroutines are "stackless"

In general my understanding always was that any meaningful async/await implementation implies stackless coroutines, while stackful ones are basically fibers (userspace threads, often switched more or less cooperatively), like goroutines, Boost.Coroutine, apparently those in Lua etc.

Is my understanding correct? Or do Python coroutines somehow fundamentally differ from those in say C++, and are stackful? Or do the authors of the source above mean different things?


Solution

  • TLDR: Going by the C++ definition, Python's coroutines too are stackless. The defining feature is that the coroutine's persistent state is stored separately from the stack.

    Coroutines are stackless: they suspend execution by returning to the caller and the data that is required to resume execution is stored separately from the stack. […] (cppreference: Coroutines (C++20))

    On a less technical level, Python requires await, async for, etc. to allow child coroutines to suspend their parent(s). It is not possible for a regularly called function to suspend its parent.

    async def foo():
        a = await some_awaitable()  # this may suspend `foo` as well
        b = some_function()         # this cannot suspend `foo`
        return a + b
    

    In contrast to a stackless coroutine a stackful coroutine can be suspended from within a nested stackframe. (Boost: Coroutines)


    Stack and Coroutines

    The interaction of stack and coroutine is not as clearcut as for regular routines. A routine is either executing or done. This naturally maps to function execution adding a single execution frame to the stack;1 once that frame completes the routine completes and the frame is popped from the stack and discarded.

    In contrast, a coroutine can be executing, suspended, or done. This raises the question how to handle the state during suspension – practically, what to do with local variables. There are two prominent means:

    1. We can treat each partial execution like a regular routine. To get coroutine semantics, when such a routine finishes we safe its state. This allows us to switch between individual coroutines.

    2. We can treat the entire execution like a regular routine. To get coroutine semantics, when such a routine suspends we safe the entire stack. This allows us to switch between individual stacks.

    Solution 1. is usually called stackless – the coroutine lives separately from the stack. This gives a lot of power since each step is first-class, but it exposes a lot of implementation details. We must explicitly emulate the stacking of nested coroutine execution – usually, this is what await is there for.

    Solution 2. is usually called stackfull – the coroutine lives as part of the stack itself. This removes a lot of power since execution happens internally, but hides a lot of implementation details. The distinction of routines versus coroutines is hidden or even inconsequential – any routine may suspend at any point.

    These definitions are not universal. For example, you may find 1. being called stackfull since the coroutine itself keeps its own suspended stack. When in doubt, compare the semantics instead of the naming.
    However, this appears to be the most prevalent naming scheme (see references).


    Where does Python fit in?

    From a technical standpoint, Python's coroutines are stackless. Coroutine functions create first-class coroutine objects that allow partial execution and store state internally during suspension (as cr_frame). Coroutines are nested using explicit await and routines cannot await anything.
    Notably, Python itself does not support stackful coroutines: Regular routines cannot suspend execution.2


    1The semantics of a frame stack do not necessarily mean that the implementation has a 1:1 relation of function execution to frames on a memory stack. Rather, consider that "the stack" as an abstract description of a runtime can be defined by function execution. Implementations may differ without having observable difference for high-level semantics of a language. Consider CPython which has a C stack running a Python stack.

    2The absence of stackfull coroutines is not strictly guaranteed by the language semantics. There are third-party extensions to add stackfull coroutines, namely greenlet. Fittingly, this was introduced by Stackless Python – the "Stackless" refers to the use of the C-Stack by the interpreter, not to how coroutines use the stack.
    However, this is a separate suspension mechanism from Python's own await and yield suspension.

    References: