Tags: python-3.x, python-requests, python-asyncio, aiohttp

Connections aren't closing with Python3 asyncio concurrent HTTP get requests


I've just started using the asyncio library in Python 3.4 and wrote a small program that tries to fetch web pages concurrently, 50 at a time. After a few hundred requests it fails with a 'Too many open files' exception.

I thought my fetch method closed each connection with the 'response.read_and_close()' call.

Any ideas what's going on here? Am I going about this problem the right way?

import asyncio
import aiohttp

@asyncio.coroutine
def fetch(url):
    response = yield from aiohttp.request('GET', url)
    response = yield from response.read_and_close()
    return response.decode('utf-8')

@asyncio.coroutine
def print_page(url):
    page = yield from fetch(url)
    # print(page)

@asyncio.coroutine
def process_batch_of_urls(round, urls):
    print("Round starting: %d" % round)
    coros = []
    for url in urls:
        coros.append(asyncio.Task(print_page(url)))
    yield from asyncio.gather(*coros)
    print("Round finished: %d" % round)

@asyncio.coroutine
def process_all():
    api_url = 'https://google.com'
    for i in range(10):
        urls = []
        for url in range(50):
            urls.append(api_url)
        yield from process_batch_of_urls(i, urls)


loop = asyncio.get_event_loop()
loop.run_until_complete(process_all())

The error I'm getting is:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/site-packages/aiohttp/client.py", line 106, in request
  File "/usr/local/lib/python3.4/site-packages/aiohttp/connector.py", line 135, in connect
  File "/usr/local/lib/python3.4/site-packages/aiohttp/connector.py", line 242, in _create_connection
  File "/usr/local/Cellar/python3/3.4.1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/asyncio/base_events.py", line 424, in create_connection
  File "/usr/local/Cellar/python3/3.4.1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/asyncio/base_events.py", line 392, in create_connection
  File "/usr/local/Cellar/python3/3.4.1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 123, in __init__
OSError: [Errno 24] Too many open files

During handling of the above exception, another exception occurred:

Solution

  • OK, I finally got it to work.

    It turns out I had to use a TCPConnector, which pools connections.

    So I created this connector:

    connector = aiohttp.TCPConnector(share_cookies=True, loop=loop)

    and passed it to each GET request. My new fetch routine looks like this:

    @asyncio.coroutine
    def fetch(url):
        data = ""
        try:
            yield from asyncio.sleep(1)
            response = yield from aiohttp.request('GET', url, connector=connector)
        except Exception as exc:
            print('...', url, 'has error', repr(str(exc)))
        else:
            data = (yield from response.read()).decode('utf-8', 'replace')
            response.close()

        return data
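
A related way to keep the number of open sockets under control, whatever the connector does, is to cap how many requests are in flight at once. Below is a minimal sketch of that idea using only the standard library. It is not the code from the answer above: it assumes Python 3.7+ (async/await and asyncio.run rather than the yield from style used here), and fake_fetch, bounded_fetch, fetch_all and MAX_IN_FLIGHT are hypothetical names made up for illustration. fake_fetch only simulates a request with a short sleep, so no network access is involved.

```python
import asyncio

# Assumed cap on simultaneous requests; pick a value well below the
# process's open-file limit.
MAX_IN_FLIGHT = 50


async def fake_fetch(url, stats):
    """Hypothetical stand-in for a real HTTP GET; it only sleeps briefly."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    try:
        await asyncio.sleep(0.001)  # simulate network latency
        return "body of %s" % url
    finally:
        # Always runs, mirroring the need to release the connection
        # even when the request fails.
        stats["in_flight"] -= 1


async def bounded_fetch(url, semaphore, stats):
    # At most `limit` coroutines can be inside this block at once.
    async with semaphore:
        return await fake_fetch(url, stats)


async def fetch_all(urls, limit=MAX_IN_FLIGHT):
    semaphore = asyncio.Semaphore(limit)
    stats = {"in_flight": 0, "peak": 0}
    pages = await asyncio.gather(
        *(bounded_fetch(url, semaphore, stats) for url in urls)
    )
    # Return the peak concurrency too, so the cap can be checked.
    return pages, stats["peak"]


if __name__ == "__main__":
    pages, peak = asyncio.run(fetch_all(["https://google.com"] * 500))
    print("fetched", len(pages), "pages; peak concurrency", peak)
```

In a real program the body of fake_fetch would be replaced by the actual aiohttp call, with the response closed or released in the finally branch. The semaphore bounds concurrency across the whole run, so all 500 URLs can be scheduled at once instead of in rounds of 50.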