python, python-asyncio, python-multithreading

Is asyncio in Python a user-level threading model with cooperative scheduling?


I have been working with asyncio in Python for a long time, but I would like to clarify how it actually works. I will break down my understanding to give context, so that you can correct me if any of the basics are wrong.

I understand that Python uses a kernel-level threading model, but with the GIL. So a Python program runs in an OS process and each Python thread is backed by an OS thread, but because of the GIL only one of these threads runs at a time. I also understand that the only way to get multiple threads in a Python program is through the "threading" module. That is, a normal Python program that does not use this module is simply a process with a single running thread.

Then the asyncio library comes in. My question is whether asyncio is an implementation of the user-level threading model with cooperative scheduling. The event loop manages all user threads (coroutines), and each coroutine is cooperative, since it decides through await when to return control to the scheduler. Additionally, all of these user threads are mapped to a single OS thread. Am I right? Is this how it works?


Solution

  • Yes, asyncio is definitely based on cooperative scheduling. Yes, a task switch can occur only at an await, and only if the awaited value is not available yet. And yes, it runs in a single OS thread, with the implication that it can do only one "thing" at a time but can have several other "things" in progress (i.e. asyncio is concurrent, but not parallel).
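To make the points above concrete, here is a minimal sketch (the names `worker`, `main`, and `log` are made up for illustration). Two tasks record the ID of the thread they run on and hand control back with `await asyncio.sleep(0)`; the log shows that they take turns only at the await and that both run on the same OS thread.

```python
import asyncio
import threading


async def worker(name, log):
    for step in range(2):
        # Record who ran, at which step, and on which thread.
        log.append((name, step, threading.get_ident()))
        # Explicitly give control back to the event loop so that
        # other ready tasks get a turn (cooperative scheduling).
        await asyncio.sleep(0)


async def main():
    log = []
    # gather() wraps each coroutine in a task and runs them concurrently.
    await asyncio.gather(worker("A", log), worker("B", log))
    return log


log = asyncio.run(main())
order = [(name, step) for name, step, _ in log]
thread_ids = {tid for _, _, tid in log}
print(order)            # the two workers alternate at each await
print(len(thread_ids))  # a single thread ID for every entry
```

If the `await asyncio.sleep(0)` line is removed, each worker runs its whole loop before the other one starts, because nothing ever yields to the event loop.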

    And I think if we replace the term "user thread" with "asyncio task", you are almost right. We programmers often build our own mental models of something, in this case asyncio, with our own terminology. Let me add a few notes related to the terminology:

    1. A coroutine and a task are not the same. From the docs: "Tasks are used to schedule coroutines concurrently."

    2. The event loop does not manage tasks (the scheduler does). The event loop manages pieces of code handling possibly blocking I/O. Only when the I/O channel is ready (i.e. will not block) does it call the corresponding handler. You could write an async program using just the event loop.

      And you could implement an event loop without any async Python functionality. I would say 100-200 lines of code would be enough. The ability to do something asynchronously comes from the underlying OS, not from the programming language.

      Even though the event loop is the core of asyncio, it is a simple core, and the problem is that application code using just the event loop would be a real pain to write. The simplest example: the event loop "pushes" incoming chunks of data as they arrive, but a program wants to read ("pull") the data line by line. So the third main benefit of asyncio (the first two being the event loop and the task scheduler) is that it provides a well-tested layer with buffers and lots of other stuff between high-level application code and the low-level event loop.
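The push versus pull mismatch from the last note can be sketched without any network at all. This is only an illustration, under the assumption that feeding an `asyncio.StreamReader` by hand is a fair stand-in for what the transport layer does when data arrives: the `feed_data()` and `feed_eof()` calls play the role of the event loop pushing arbitrary chunks, while the application pulls whole lines with `readline()`. In real code you would normally get a reader from `asyncio.open_connection()` rather than create one yourself.

```python
import asyncio


async def main():
    # Created by hand for illustration only.
    reader = asyncio.StreamReader()

    # "Push" side: chunks arrive with arbitrary boundaries that do
    # not line up with the line breaks the application cares about.
    for chunk in (b"hel", b"lo\nwor", b"ld\n"):
        reader.feed_data(chunk)
    reader.feed_eof()

    # "Pull" side: the buffering layer reassembles complete lines.
    lines = []
    while line := await reader.readline():
        lines.append(line)
    return lines


lines = asyncio.run(main())
print(lines)  # [b'hello\n', b'world\n']
```

The three chunks come back as two complete lines; that reassembly is exactly the kind of buffering you would otherwise have to write yourself on top of the bare event loop.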