Switch to new call stack in C?

I am writing a toy programming language and would like to implement cooperative multi-tasking. The compiler and runtime are written in C.

Instead of using threads, I would like to be able to switch the C call stack. I.e. I would like to have multiple call stacks "alive" and switch between them by calling a function. This should be conceptually similar to a longjmp but preserve the old/new call stack. The goal is essentially to implement something like "async" runtimes in other languages.

I would like to do this without a library in the "simplest" way possible. My idea is to essentially just allocate a new C call stack and switch to it. Is there any way to do this in C in a platform agnostic way (i.e. without assembly)?

The closest thing I found was a small block of assembly in this answer (linked from this question)

Edit: by "platform agnostic" I meant hardware agnostic using linux, although general Unix/BSD support would be nice.

Solution

It is unclear what answer will satisfy you.

The C language by itself does not provide any such facility, so any solution will rely on something else providing the functionality.

Note that since C.2011, C introduced optional thread support in <threads.h>, but there are no thread attribute manipulations possible to allow one to distinguish between ordinary and lightweight thread execution.

Maybe a state-machine, but needs help from the OS.

If you are asking how one achieves this using only the primitives provided by the C language, then the answer choice is pretty much limited to implementing your own state machine mechanism. But this would require that you use mechanisms provided by your operating system to enable non-blocking interactions when accessing operations that would normally block until completion. This is to give you the opportunity to park the current execution context and try an alternate one instead.

For example, in UNIX, one could create a state machine data structure to indicate the progress of sending a file over a socket.

struct send_a_file_ctx {
    int socket;
    int file;
    size_t file_size;     /* total number of bytes to send */
    size_t bytes_sent;    /* bytes sent so far over the socket */
    size_t buffer_size;   /* number of file bytes in the buffer */
    size_t buffer_offset; /* number of file bytes from buffer sent */
    char buffer[];        /* buffered chunk of the file */
};

There is no execution stack per se, but there is a structure that is tracking what is going on with the delivery of the bytes in the buffer onto the socket.

In UNIX, the send() call could have the MSG_DONTWAIT flag passed in to indicate that the operation should not block for completion, but instead return a value to indicate that the operation would have blocked. This would be indicated with an error result and errno set to EAGAIN or EWOULDBLOCK. This context structure should then be saved in some queue to be picked up again in the future to retry the operation. The same queue could have some other context structure in it upon which the send() could be attempted again to see if progress can be made.

Instead of continuously looping on a retry, UNIX provides mechanisms to allow you to wait for an indication that one or more of the sockets you are trying to send files over will no longer block (see poll, select, epoll, kevent, and perhaps other variations depending on your OS).

What about `ucontext.h`?

The makecontext() and swapcontext() functions is the mechanism classic UNIX systems had provided for cooperative concurrency. It lets a process switch its current execution stack with an alternate execution stack. Because it lets you switch execution stacks, it can avoid the need for defining a state machine structure. The state is intrinsic to the execution stack.

However, it does not avoid need to enable non-blocking interactions with operations that would normally block until completion. Upon receiving an indication the operation would block, you would switch to an alternate stack to retry their previously blocked operation.

And again, to avoid continuous retries, you would still need to implement some kind of scheduling queue of stacks to resume, and use a demultiplexing waiting primitive, like poll, to know when which stack can be resumed and not get blocked.

While Linux and BSD systems still support ucontext.h, it is no longer part of the POSIX standard. It was part of the original POSIX.1-2001, but was marked for obsolescence in POSIX.1-2004. In POSIX.1-2008, it was removed.

Are there any good alternatives?

It depends on what you would consider "good".

State machines?

If you want something that takes care of demultiplexing for you, and you just want to write code to run your state machine on resumption, you can consider using something similar to libevent or one of its forks/clones. It provides a common interface for registering your context with the demultiplexer, and works on a variety of operating systems.

However, you still need to manage non-blocking interactions yourself.

Like threads?

If you want a solution that allows you easy programming like with threads, and avoid managing your own non-blocking context scheduling, you can try GNU's pth, Libwire, Windows Fibers, or even pthreads.

GNU pth can provide a thread like programming paradigm. If you follow the provided APIs, it will appear like all your threads are performing block to completion operations, when pth is actually intercepting the calls and performing the non-blocking versions under the covers and context switching on your behalf.
If LGPL is too restrictive, you may consider using Libwire, which is released under the MIT license. This is a user-space threading library, and is meant to bring a GoLang like coroutine programming interface to C.
Windows Fibers is a facility that provides lightweight threading, while maintaining a block to completion programming style. Fibers has a noteworthy feature where a fiber can be converted into a thread, and back to a fiber again.
POSIX pthreads defines a contentionscope parameter to pthread_attr_setscope. One of the values for this parameter is PTHREAD_SCOPE_PROCESS. It seems like this would be the way to achieve lightweight scheduling using pthreads. However, the behavior of this scope is implementation defined.

References

Non-blocking I/O
POSIX history on ucontext.h
Libevent
GNU Portable Threads
Libwire
Windows Fibers
POSIX pthread scope attribute and contention scope