When it comes to asynchronous I/O, I want to understand the difference between the POSIX AIO interface used on Linux and the concurrent.futures interface used in Python. I use the former for asynchronous I/O in C code and the latter in Python code. I understand that concurrent.futures in Python is a thread-based technique that attaches a callback to a thread so that it can be polled later for its status. However, I don't know how POSIX AIO works. Is it thread-based as well?
Thank you
concurrent.futures is not specifically thread-based (both thread-based and process-based executors are available), nor is it specifically about async I/O; it provides general parallelism. You can parallelize I/O with it, but it is the worker tasks that run asynchronously; I/O is just one kind of work that can be parallelized.
As it happens, for I/O you would want to use ThreadPoolExecutor: CPython's GIL isn't a problem for I/O-bound tasks, and the IPC necessary to return results from a ProcessPoolExecutor's worker processes would largely eliminate the benefits of parallelizing the I/O. I just wanted to be clear that concurrent.futures is not purely about threads.
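As a rough illustration of the ThreadPoolExecutor approach described above, here is a minimal sketch that runs several blocking file reads on worker threads. The function names (read_file, read_all, make_sample_files) and the sample file contents are hypothetical and chosen only for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    # Ordinary blocking read; CPython releases the GIL while the
    # thread waits on the operating system.
    with open(path, "rb") as f:
        return f.read()


def read_all(paths, max_workers=4):
    # Run one blocking read per path on the pool's worker threads.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_file, paths))


def make_sample_files(directory, count=3):
    # Hypothetical helper: create a few small files to read back.
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"sample_{i}.txt")
        with open(path, "wb") as f:
            f.write(f"payload {i}".encode())
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(read_all(make_sample_files(tmp)))
```

Nothing here is I/O-specific from the executor's point of view: read_file is just a callable, and the reads overlap only because each one blocks in its own thread.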
POSIX AIO is, at least on Linux, just a user space library wrapping threads (roughly equivalent to using concurrent.futures.ThreadPoolExecutor to perform your I/O tasks), per the NOTES in the man page you linked:
The current Linux POSIX AIO implementation is provided in user space by glibc. This has a number of limitations, most notably that maintaining multiple threads to perform I/O operations is expensive and scales poorly. Work has been in progress for some time on a kernel state-machine-based implementation of asynchronous I/O (see io_submit(2), io_setup(2), io_cancel(2), io_destroy(2), io_getevents(2)), but this implementation hasn't yet matured to the point where the POSIX AIO implementation can be completely reimplemented using the kernel system calls.
Point is, in both cases, it's fundamentally about dispatching I/O requests in background threads with handles of some sort to allow polling and retrieval of results.
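To make the "handle you can poll" idea concrete, here is a small sketch on the Python side. The analogy to POSIX AIO is loose and is my own framing: submit() plays the role of aio_read(), Future.done() resembles checking aio_error() for EINPROGRESS, and Future.result() resembles aio_return(). The helper names (read_chunk, submit_read, wait_by_polling) are hypothetical.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, offset, length):
    # Blocking positional read, executed on a worker thread.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def submit_read(pool, path, offset, length):
    # Returns immediately with a Future, which acts as the handle
    # for the pending request.
    return pool.submit(read_chunk, path, offset, length)


def wait_by_polling(future, interval=0.001):
    # Poll the handle until the request finishes, then fetch the data.
    # result() would also block on its own; polling is shown only to
    # mirror the check-status-then-retrieve pattern.
    while not future.done():
        time.sleep(interval)
    return future.result()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as f:
            f.write(b"hello world")
        with ThreadPoolExecutor(max_workers=2) as pool:
            handle = submit_read(pool, path, 6, 5)
            # The caller is free to do other work here.
            print(wait_by_polling(handle))
```

In real code you would usually call result() directly or use add_done_callback() rather than spin on done(); the loop is there only to show that the Future is a pollable handle.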
Kernel supported async I/O could avoid or limit threading by any of the following:
but none of these techniques are actually used in Linux's implementation of POSIX AIO, and if any of them were used in Python via concurrent.futures, it would be a hand-rolled solution (since, as mentioned, concurrent.futures performs arbitrary parallelism; it doesn't specifically support I/O).