python-3.x, python-asyncio

Is it more efficient to use create_task(), or gather()?


I'm still learning the basics of asynchronous Python, and some things confuse me.

import asyncio
loop=asyncio.get_event_loop()
for variation in args:
    loop.create_task(coroutine(variation))
loop.run_forever()

Seems very similar to this:

import asyncio
loop=asyncio.get_event_loop()
loop.run_forever(
    asyncio.gather(
        coroutine(variation_1),
        coroutine(variation_2),
        ...))

They might do the same thing, but that doesn't seem useful, so what's the difference?


Solution

  • Your second example should use asyncio.run, not run_forever.
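    run_forever takes no arguments and never returns on its own; asyncio.run is the modern entry point. A corrected version of the second snippet might look roughly like this (a sketch reusing the coroutine and variation_* names from the question):

    import asyncio

    async def main():
        # gather schedules both coroutines concurrently and waits for both to finish
        await asyncio.gather(
            coroutine(variation_1),
            coroutine(variation_2),
        )

    # asyncio.run creates the event loop, runs main() to completion, and closes the loop
    asyncio.run(main())
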

    They might do the same thing, but that doesn't seem useful, so what's the difference?

    asyncio.gather is a higher-level construct.

    • create_task submits the coroutine to the event loop, effectively allowing it to run "in the background" (provided the event loop itself is active). As the name implies, it returns a task, a handle over the execution of the coroutine, most importantly providing the ability to cancel it. You can create any number of such tasks in an event loop, and they will all run until their respective completions (see the sketch after this list).

    • asyncio.gather is for when you are actually interested in the results of the coroutines you have spawned. It spawns them as if with create_task, allowing them to run concurrently, but it also waits for all of them to complete, and then returns their respective results (or raises an exception if any of them raised one).
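
    To illustrate the first point, here is a minimal create_task sketch (the worker coroutine and the task names are made up for illustration):

    import asyncio

    async def worker(name):
        # stand-in for some I/O-bound work
        await asyncio.sleep(1)
        print(f"{name} finished")

    async def main():
        # create_task schedules each coroutine on the running loop immediately;
        # the returned Task objects can later be awaited or cancelled
        tasks = [asyncio.create_task(worker(f"task-{i}")) for i in range(3)]
        # the tasks are already running "in the background" at this point;
        # awaiting them only waits for them to complete
        for task in tasks:
            await task

    asyncio.run(main())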

    As an example of the second point, if you have a download coroutine that downloads a URL and returns its contents, and you are downloading a list of URLs, gather allows you to match URLs to their data:

    url_list = [...]
    data_list = await asyncio.gather(*[download(url) for url in url_list])
    
    # url_list and data_list now have matching elements, so this works:
    for url, data in zip(url_list, data_list):
        ...
    

    Doing this with just create_task would be somewhat more involved, as sketched below.
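
    For comparison, here is roughly what the same URL-to-data pairing could look like with create_task alone (a self-contained sketch; the download coroutine is just a stand-in for a real HTTP request):

    import asyncio

    async def download(url):
        # placeholder for an actual HTTP request
        await asyncio.sleep(0.1)
        return f"<contents of {url}>"

    async def main():
        url_list = ["https://example.com/a", "https://example.com/b"]
        # spawn one task per URL and remember which URL each task belongs to
        tasks = {url: asyncio.create_task(download(url)) for url in url_list}
        # await each task and pair its result back with the URL by hand
        for url, task in tasks.items():
            data = await task
            print(url, data)

    asyncio.run(main())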