Tags: python, async-await, python-asyncio

Start async task now, await later


C# programmer trying to learn some Python. I am trying to run a CPU-intensive calculation while letting an IO-bound async method quietly chug away in the background. In C#, I would typically set the awaitable going, then kick off the CPU-intensive code, then await the IO task, then combine the results.

Here's how I'd do it in C#:

static async Task DoStuff() {
    var ioBoundTask = DoIoBoundWorkAsync();
    int cpuBoundResult = DoCpuIntensizeCalc();
    int ioBoundResult = await ioBoundTask.ConfigureAwait(false);

    Console.WriteLine($"The result is {cpuBoundResult + ioBoundResult}");
}

static async Task<int> DoIoBoundWorkAsync() {
    Console.WriteLine("Make API call...");
    await Task.Delay(2500).ConfigureAwait(false); // non-blocking async call
    Console.WriteLine("Data back.");
    return 1;
}

static int DoCpuIntensizeCalc() {
    Console.WriteLine("Do smart calc...");
    Thread.Sleep(2000);  // blocking call. e.g. a spinning loop
    Console.WriteLine("Calc finished.");
    return 2;
}

And here's the equivalent code in Python:

import time
import asyncio

async def do_stuff():
    ioBoundTask = do_iobound_work_async()
    cpuBoundResult = do_cpu_intensive_calc()
    ioBoundResult = await ioBoundTask
    print(f"The result is {cpuBoundResult + ioBoundResult}")

async def do_iobound_work_async(): 
    print("Make API call...")
    await asyncio.sleep(2.5)  # non-blocking async call
    print("Data back.")
    return 1

def do_cpu_intensive_calc():
    print("Do smart calc...")
    time.sleep(2)  # blocking call. e.g. a spinning loop
    print("Calc finished.")
    return 2

# Top-level await works in Jupyter; in a script, use asyncio.run(do_stuff()) instead.
await do_stuff()

Importantly, please note that the CPU intensive task is represented by a blocking sleep that cannot be awaited and the IO bound task is represented by a non-blocking sleep that is awaitable.

This takes 2.5 seconds to run in C# and 4.5 seconds in Python. The difference is that C# starts running the asynchronous method straight away, whereas Python only starts it when it hits the await. The output below confirms this. How can I achieve the desired result? Code that works in a Jupyter notebook would be appreciated if at all possible.

--- C# ---
Make API call...
Do smart calc...
Calc finished.
Data back.
The result is 3
--- Python ---
Do smart calc...
Calc finished.
Make API call...
Data back.
The result is 3
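
The Python ordering above follows from how coroutines work: calling an async function only builds a coroutine object, and none of its body runs until it is awaited or wrapped in a task. A minimal sketch of that behaviour, where the names io_work and events are made up for illustration:

```python
import asyncio

events = []  # records the order in which things happen


async def io_work():
    events.append("io started")
    await asyncio.sleep(0.01)  # stand-in for a non-blocking IO call
    events.append("io finished")
    return 1


async def main():
    coro = io_work()             # only creates a coroutine object; the body has not run
    events.append("after call")
    return await coro            # the body starts running here


result = asyncio.run(main())
print(events)  # ['after call', 'io started', 'io finished']
```

In a notebook, where an event loop is already running, replace asyncio.run(main()) with await main().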

Update 1

Inspired by knh190's answer, it seems that I can get most of the way there using asyncio.create_task(...). This achieves the desired result (2.5 secs): first, the asynchronous code is set running; next, the blocking CPU code is run synchronously; third, the asynchronous code is awaited; finally, the results are combined. To get the asynchronous call to actually start running, I had to add an await asyncio.sleep(0), which yields control to the event loop once but feels like a horrible hack. Can we set the task running without doing this? There must be a better way...

async def do_stuff():
    task = asyncio.create_task(do_iobound_work_async())
    await asyncio.sleep(0)  #   <~~~~~~~~~ This hacky line sets the task running

    cpuBoundResult = do_cpu_intensive_calc()
    ioBoundResult = await task

    print(f"The result is {cpuBoundResult + ioBoundResult}")
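
One possible alternative to the explicit yield, and a different technique from the one above, is to hand the blocking calculation to a worker thread with asyncio.to_thread (available from Python 3.9). Awaiting it gives control back to the event loop, so the IO task gets to start while the calculation runs. This is only a sketch with shortened sleeps, and the events list is added purely to show the ordering. It assumes the blocking work releases the GIL, as time.sleep does; a pure-Python spinning loop would still compete with the event loop for the interpreter.

```python
import asyncio
import time

events = []  # illustrative only: records what happened and in which order


async def do_iobound_work_async():
    events.append("io started")
    await asyncio.sleep(0.3)  # non-blocking async call
    events.append("io finished")
    return 1


def do_cpu_intensive_calc():
    events.append("calc started")
    time.sleep(0.3)  # blocking call, runs in a worker thread here
    events.append("calc finished")
    return 2


async def do_stuff():
    task = asyncio.create_task(do_iobound_work_async())
    # Awaiting the thread hands control to the event loop, so the task starts.
    cpu_bound_result = await asyncio.to_thread(do_cpu_intensive_calc)
    io_bound_result = await task
    return cpu_bound_result + io_bound_result


total = asyncio.run(do_stuff())
print(f"The result is {total}")
```

In a Jupyter notebook, replace asyncio.run(do_stuff()) with await do_stuff().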

Solution

  • Update for Python 3.12

    This can now be done without the await asyncio.sleep(0) trick after every single task creation. You need to set the event loop's task factory to asyncio.eager_task_factory.

    Here is the full working code:

    import asyncio
    import time
    
    async def do_stuff():
        ioBoundTask = asyncio.create_task(do_iobound_work_async())  # new
        cpuBoundResult = do_cpu_intensive_calc()
        ioBoundResult = await ioBoundTask
        print(f"The result is {cpuBoundResult + ioBoundResult}")
    
    async def do_iobound_work_async():
        print("Make API call...")
        await asyncio.sleep(2.5)
        print("Data back.")
        return 1
    
    def do_cpu_intensive_calc():
        print("Do smart calc...")
        time.sleep(2)
        print("Calc finished.")
        return 2
    
    async def main():
        loop = asyncio.get_running_loop()  # preferred over get_event_loop() inside a coroutine
        loop.set_task_factory(asyncio.eager_task_factory)  # new
        await do_stuff()
    
    asyncio.run(main())
    

    output:

    - ~/vscode_python3_12  % time uv run file.py
    Make API call...
    Do smart calc...
    Calc finished.
    Data back.
    The result is 3
    uv run file.py  0.07s user 0.02s system 3% cpu 2.592 total
    

    Explanation:

    As you already concluded, it needs to be a Task rather than a bare coroutine in order to get scheduled, so we need asyncio.create_task(). Before Python 3.12 you had to use the await asyncio.sleep(0) trick to give other tasks a chance to run, but now you can instruct the event loop to use the eager task factory, which starts each new task's coroutine synchronously inside create_task() and runs it until it first blocks.
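
To make the effect of the eager task factory visible, here is a small sketch that records the order of events. The names io_work and events are illustrative, and the version check is only there so the snippet still runs (with the older, lazy ordering) on interpreters before 3.12.

```python
import asyncio
import sys

events = []  # records the order in which things happen


async def io_work():
    events.append("io started")  # with eager start, this runs inside create_task()
    await asyncio.sleep(0.01)
    events.append("io finished")
    return 1


async def main():
    loop = asyncio.get_running_loop()
    # asyncio.eager_task_factory only exists on Python 3.12 and later.
    if sys.version_info >= (3, 12):
        loop.set_task_factory(asyncio.eager_task_factory)
    task = asyncio.create_task(io_work())
    events.append("after create_task")
    return await task


result = asyncio.run(main())
print(events)
```

On Python 3.12 or later, "io started" is recorded before "after create_task"; on older versions the two are swapped. In a Jupyter notebook the loop is already running, so use await main() instead of asyncio.run(main()). Note that set_task_factory changes how every later task on that loop is created.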