Tags: python, python-asyncio, aiohttp

Fetch multiple URLs with asyncio/aiohttp and retry for failures


I'm attempting to write some asynchronous GET requests with the aiohttp package, and have most of the pieces figured out, but am wondering what the standard approach is when handling the failures (returned as exceptions).

A general idea of my code so far (after some trial and error, I am following the approach here):

import asyncio
import aiofiles
import aiohttp
from pathlib import Path

with open('urls.txt', 'r') as f:
    urls = [s.rstrip() for s in f.readlines()]

async def fetch(session, url):
    async with session.get(url) as response:
        if response.status != 200:
            response.raise_for_status()
        data = await response.text()
    # (Omitted: some more URL processing goes on here)
    out_path = Path('out/')
    if not out_path.is_dir():
        out_path.mkdir()
    fname = url.split("/")[-1]
    async with aiofiles.open(out_path / f'{fname}.html', 'w+') as f:
        await f.write(data)

async def fetch_all(urls, loop):
    async with aiohttp.ClientSession(loop=loop) as session:
        results = await asyncio.gather(*[fetch(session, url) for url in urls],
                return_exceptions=True)
        return results

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    results = loop.run_until_complete(fetch_all(urls, loop))

Now this runs fine:

  • As expected, the results variable is populated with None entries where the corresponding URL (i.e. at the same index in the urls list, i.e. at the same line number in the input file urls.txt) was requested successfully, and the corresponding file is written to disk.
  • This means I can use the results variable to determine which URLs were not successful: the entries in results that are not None hold the exception instead (see the snippet below).
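
Concretely, I'm picking out the failures with something along these lines:

failed_urls = [url for url, result in zip(urls, results) if result is not None]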

I have looked at a few different guides to using the various asynchronous Python packages (aiohttp, aiofiles, and asyncio) but I haven't seen the standard way to handle this final step.

  • Should the retrying of a failed GET request be done after the await statement has 'finished'/'completed'?
  • ...or should the retry be initiated by some sort of callback upon failure?
    • The errors look like this: ClientConnectorError(111, "Connect call failed ('000.XXX.XXX.XXX', 443)"), i.e. the request to IP address 000.XXX.XXX.XXX on port 443 failed, probably because the server enforces some limit which I should respect by waiting for a timeout before retrying.
  • Is there some sort of limit I should consider putting on the number of concurrent requests, i.e. batching them rather than attempting them all at once?
  • I am only getting about 40-60 successful requests when attempting several hundred (over 500) URLs in my list.

Naively, I was expecting run_until_complete to handle this in such a way that it would finish upon succeeding at requesting all URLs, but this isn't the case.

I haven't worked with asynchronous Python and sessions/loops before, so I would appreciate any help in figuring out how to get all of the results. Please let me know if I can give any more information to improve this question, thank you!


Solution

  • Should the retrying of a failed GET request be done after the await statement has 'finished'/'completed'? ...or should the retry be initiated by some sort of callback upon failure?

    You can do the former. You don't need any special callback, since you are executing inside a coroutine, so a simple while loop will suffice, and won't interfere with execution of other coroutines. For example:

    async def fetch(session, url):
        data = None
        while data is None:
            try:
                async with session.get(url) as response:
                    response.raise_for_status()
                    data = await response.text()
            except aiohttp.ClientError:
                # sleep a little and try again
                await asyncio.sleep(1)
        # (Omitted: some more URL processing goes on here)
        out_path = Path('out/')
        if not out_path.is_dir():
            out_path.mkdir()
        fname = url.split("/")[-1]
        async with aiofiles.open(out_path / f'{fname}.html', 'w+') as f:
            await f.write(data)
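
    If the server is throttling or rate-limiting you (which the ClientConnectorError suggests), you may also want to back off progressively and give up after a few attempts rather than retrying forever. One possible variation of the loop above; the retry count and the doubling delays are arbitrary choices of mine, not anything aiohttp prescribes:

    async def fetch(session, url, max_tries=5):
        for attempt in range(max_tries):
            try:
                async with session.get(url) as response:
                    response.raise_for_status()
                    data = await response.text()
                break
            except aiohttp.ClientError:
                if attempt == max_tries - 1:
                    raise  # give up; gather(return_exceptions=True) collects the error
                # back off a little longer after each failed attempt: 1s, 2s, 4s, ...
                await asyncio.sleep(2 ** attempt)
        # (write `data` to disk exactly as before)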
    

    Naively, I was expecting run_until_complete to handle this in such a way that it would finish upon succeeding at requesting all URLs

    The term "complete" is meant in the technical sense of a coroutine completing (running its course), which is achieved either by the coroutine returning or raising an exception.
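
    As for your other question, about putting a limit on the number of simultaneous requests: that is usually worth doing when a server starts refusing connections, and one common approach is to wrap each request in an asyncio.Semaphore. A rough sketch follows; the limit of 20 is an arbitrary number I picked, so tune it to what the server tolerates:

    async def fetch(session, sem, url):
        async with sem:  # waits here while 20 other requests are already in flight
            # (the retry loop from above goes here, around session.get)
            async with session.get(url) as response:
                response.raise_for_status()
                data = await response.text()
        # (writing to disk as before)

    async def fetch_all(urls, loop):
        sem = asyncio.Semaphore(20)  # at most 20 concurrent requests; 20 is arbitrary
        async with aiohttp.ClientSession(loop=loop) as session:
            return await asyncio.gather(*[fetch(session, sem, url) for url in urls],
                                        return_exceptions=True)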