I'm new to asio and trying to wrap my head around strands. I feel like I understand them conceptually from the docs, but I'm running into a few issues with usage.
My thought was, if I have some resource that naturally requires synchronous access, I can bundle that with a strand and make sure asio is executing within that strand whenever any operation needs to access the resource.
I tried to create a coroutine-based loop like so:
asio::awaitable<void> progress(auto strand) {
    int i = 0;
    for (;;) {
        co_await dispatch(bind_executor(strand, asio::deferred));
        ore::util::print("Strand Executor 1:", ++i);
        // ...Do something with a synchronous resource...

        co_await dispatch(asio::deferred);
        ore::util::print("Back on IO Context Executor");
        // ...Code that doesn't depend on the synchronous resource...

        co_await dispatch(bind_executor(strand, asio::deferred));
        ore::util::print("Strand Executor 2");
        // ...Code that depends on the synchronous resource again...
    }
}

int main() {
    asio::io_context io;
    auto my_strand = make_strand(io);
    co_spawn(io, progress(my_strand), asio::detached);
    io.run();
}
This has (at least) one issue, and I'm curious about it. Because I defined the strand as:
auto my_strand = make_strand(io);
it will end up segfaulting after 2.6k iterations or so:
...
T0 Strand Executor 1: 2657
T0 Back on IO Context Executor
Segmentation fault (core dumped)
It seems that the stack within asio grows until it blows up because both executors come from the same io_context, but I'm not clear on why that is the case.
This iteration works fine if the strand is backed by a different thread:
asio::thread_pool tp(1);
auto my_strand = make_strand(tp);
...Testing a little bit more, I notice that it segfaults with an even simpler test:
asio::awaitable<void> progress() {
    for (;;) {
        co_await dispatch(asio::deferred);
    }
}

int main() {
    asio::io_context io;
    co_spawn(io, progress(), asio::detached);
    io.run();
}
Maybe the segfault has to do with the stackful coroutine implementation? I thought this should be stackless but maybe that's not the case.
So three questions:
Note: Code modified from this blog post
Pretty good catch. Yes. This behavior results in a stack overflow because of the use of dispatch. If you use post there is no such issue. Using dispatch allows Asio to "bypass" the scheduler in some circumstances (not least when the running thread already satisfies the scheduler requirements, such as strand availability) and invoke the handler directly on the same stack frame.
E.g. the docs for strand dispatch:
This function is used to ask the strand to execute the given function object on its underlying executor. The function object will be executed inside this function if the strand is not otherwise busy and if the underlying executor's dispatch() function is also able to execute the function before returning.
Compare with the docs for post directly:
This function submits an object for execution using the object's associated executor. The function object is queued for execution, and is never called from the current thread prior to returning from post().
The use of post(), rather than defer, indicates the caller's preference that the function object be eagerly queued for execution.
The gotcha here is that, due to the coroutine model, the infinite loop can actually become infinite recursion this way. Break the direct invocation chain at least once every iteration and you should be fine.
Also, note that switching executors inside coroutines is probably an anti-pattern. I know it's an oft-recurring question, but I feel it mostly comes from people directly translating old-fashioned concurrency based on mutual exclusion into the Asio strand paradigm. I suspect you will be happier embracing signals (e.g. using timers, or perhaps something like https://klemens.dev/sam/) or channels.