python, asynchronous, async-await, python-asyncio

asyncio.gather vs asyncio.wait (vs asyncio.TaskGroup)


asyncio.gather and asyncio.wait seem to have similar uses: I have a bunch of async things that I want to execute/wait for (not necessarily waiting for one to finish before the next one starts).

Since Python 3.11 there is yet another similar feature, asyncio.TaskGroup.

They use a different syntax, and differ in some details, but it seems very un-pythonic to me to have several functions that have such a huge overlap in functionality.

What am I missing?


Solution

  • All three cover the common case ("run many tasks and collect their results"), but each has specific functionality for other cases (see also TaskGroup for Python 3.11+ below):

    asyncio.gather()

    Returns a single Future that aggregates the results, which allows high-level grouping of tasks:

    import asyncio
    import random
    from pprint import pprint
    
    
    async def coro(tag):
        print(">", tag)
        await asyncio.sleep(random.uniform(1, 3))
        print("<", tag)
        return tag
    
    
    async def main():
        # Build the groups inside a coroutine so gather() has a running event
        # loop; calling it on coroutines without one is deprecated since 3.10.
        group1 = asyncio.gather(*[coro("group 1.{}".format(i)) for i in range(1, 6)])
        group2 = asyncio.gather(*[coro("group 2.{}".format(i)) for i in range(1, 4)])
        group3 = asyncio.gather(*[coro("group 3.{}".format(i)) for i in range(1, 10)])
    
        # Each group is itself a Future, so groups can be gathered again.
        all_groups = asyncio.gather(group1, group2, group3)
        return await all_groups
    
    
    # asyncio.run() creates the event loop and closes it when main() returns.
    results = asyncio.run(main())
    
    pprint(results)
    

    All tasks in a group can be cancelled by calling group2.cancel(), or every group at once with all_groups.cancel(). See also gather(..., return_exceptions=True), which returns exceptions as results instead of raising the first one.
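
    To illustrate the return_exceptions=True option mentioned above, here is a minimal sketch. The work() coroutine and its fail flag are made up for this example; only asyncio.gather() and asyncio.run() come from the standard library.

```python
import asyncio


async def work(tag, fail=False):
    # Hypothetical worker: sleeps briefly, then returns its tag or raises.
    await asyncio.sleep(0.01)
    if fail:
        raise ValueError(f"{tag} failed")
    return tag


async def main():
    # With return_exceptions=True, an exception is placed in the result list
    # (in argument order) instead of being raised out of gather().
    return await asyncio.gather(
        work("a"),
        work("b", fail=True),
        work("c"),
        return_exceptions=True,
    )


results = asyncio.run(main())
for r in results:
    print("error:" if isinstance(r, Exception) else "ok:", r)
```

    Without return_exceptions=True, the ValueError would propagate to the caller of gather() while the other awaitables keep running in the background.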

    asyncio.wait()

    Can stop waiting as soon as the first task is done, or after a specified timeout, which gives lower-level control over the operations:

    import asyncio
    import random
    
    
    async def coro(tag):
        print(">", tag)
        await asyncio.sleep(random.uniform(0.5, 5))
        print("<", tag)
        return tag
    
    
    async def main():
        # wait() takes Tasks/Futures; passing bare coroutines is an error
        # since Python 3.11, so wrap each one with create_task().
        tasks = [asyncio.create_task(coro(i)) for i in range(1, 11)]
    
        print("Get first result:")
        finished, unfinished = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED)
    
        for task in finished:
            print(task.result())
        print("unfinished:", len(unfinished))
    
        print("Get more results in 2 seconds:")
        finished2, unfinished2 = await asyncio.wait(unfinished, timeout=2)
    
        for task in finished2:
            print(task.result())
        print("unfinished2:", len(unfinished2))
    
        print("Get all other results:")
        # wait() rejects an empty set, so only call it if tasks remain.
        if unfinished2:
            finished3, _ = await asyncio.wait(unfinished2)
            for task in finished3:
                print(task.result())
    
    
    asyncio.run(main())
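
    One detail worth spelling out: wait() never cancels anything itself, not even on timeout, so any tasks still pending are the caller's responsibility. A small sketch of the "take the first result and cancel the rest" pattern follows. The fetch() coroutine and its delays are hypothetical stand-ins for real work.

```python
import asyncio


async def fetch(name, delay):
    # Hypothetical worker standing in for e.g. a request to one of several mirrors.
    await asyncio.sleep(delay)
    return name


async def first_result():
    tasks = [
        asyncio.create_task(fetch("fast", 0.01)),
        asyncio.create_task(fetch("slow", 5)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)

    # The slower tasks are still running at this point; cancel them explicitly.
    for task in pending:
        task.cancel()
    # Give the cancellations a chance to complete. return_exceptions=True
    # collects the resulting CancelledError objects instead of raising them.
    await asyncio.gather(*pending, return_exceptions=True)

    return [t.result() for t in done], [t.cancelled() for t in pending]


winners, cancelled = asyncio.run(first_result())
print(winners, cancelled)
```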
    

    TaskGroup (Python 3.11+)

    Update: Python 3.11 introduces TaskGroup, which "automatically" awaits all tasks created in it when the async with block exits, without gather() or wait():

    # Python 3.11+ ONLY!
    import asyncio
    
    
    async def main():
        async with asyncio.TaskGroup() as tg:
            task1 = tg.create_task(some_coro(...))
            task2 = tg.create_task(another_coro(...))
        # Exiting the block waits for every task created through tg.
        print("Both tasks have completed now.")