I'm hosting a server on localhost and I want to fire hundreds of GET requests asynchronously. For this I am using grequests. Everything appears to work fine, but I repeatedly get the warning:
WARNING:requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection: date.jsontest.com
A search shows how the full pool issue can be avoided when creating a Session() in requests, e.g. here. However, a couple of things:

- Sending more requests than pool_maxsize will still give a warning.
- requests.packages.urllib3.disable_warnings() doesn't seem to do anything.

So my questions are:

1) What does this warning actually mean? My interpretation is that it is simply dropping the requests from firing, but that doesn't seem to be the case.

2) Is this warning actually relevant for the grequests library, especially when I take steps to limit the pool size? Am I inviting unexpected behaviour and fluking my expected result in my tests?

3) Is there a way to disable it?

Some code to test:
import grequests
import requests

requests.packages.urllib3.disable_warnings()  # Doesn't seem to work?

session = requests.Session()

# Commenting out the adapter below causes 105 warnings instead of 5
adapter = requests.adapters.HTTPAdapter(pool_connections=100,
                                        pool_maxsize=100)
session.mount('http://', adapter)

# Test query
query_list = ['http://date.jsontest.com/' for x in xrange(105)]
rs = [grequests.get(item, session=session) for item in query_list]
responses = grequests.map(rs)
print len([item.json() for item in responses])
1) What does this warning actually mean? My interpretation is that it is simply dropping the requests from firing, but it doesn't seem to be the case.
This is actually still unclear to me. Even firing a single request was enough to trigger the warning, yet it would still give me the expected response.
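Reading the message literally, it is the connection that gets discarded once the pool is full, not the request itself. A small sketch to poke at that (same date.jsontest.com test URL as above, with the pool deliberately squeezed down to one connection; I haven't run this exact snippet, so treat it as an assumption):

import grequests
import requests

# Deliberately tiny pool: at most one connection is kept alive at a time.
sess = requests.Session()
sess.mount('http://', requests.adapters.HTTPAdapter(pool_connections=1,
                                                    pool_maxsize=1))

# Five concurrent requests against a one-connection pool: the "pool is full"
# warning fires, but every request should still come back with a response.
rs = [grequests.get('http://date.jsontest.com/', session=sess)
      for _ in xrange(5)]
responses = grequests.map(rs)
print len([r for r in responses if r is not None])  # expecting 5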
2) Is this warning actually relevant for the grequests library, especially when I take steps to limit the pool size? Am I inviting unexpected behaviour and fluking my expected result in my tests?
For the last part: yes. The server I was communicating with could handle 10 queries concurrently. With the following code I could send 400 or so requests in a single list comprehension and everything worked out fine (i.e. my server never got swamped, so the requests must have been throttled in some way). Past some tipping point in the number of requests, though, the code would stop firing any requests at all and simply return a list of None. It's not as though it even tried to get through the list; it didn't even fire the first query, it just blocked up.
sess = requests.Session()
adapter = requests.adapters.HTTPAdapter(pool_connections=10,
                                        pool_maxsize=10)
sess.mount('http://', adapter)

# 'queries' is a list of (url, ...) tuples.
# Launching ~500 or more requests will suddenly cause this to fail.
rs = [grequests.get(item[0], session=sess) for item in queries]
responses = grequests.map(rs)
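One knob I have not gone back and verified against this failure: grequests.map takes a size argument that caps how many requests are in flight at once, which may be a gentler way to stay within what the server (and gevent) can cope with. A sketch under the same assumptions about queries:

import grequests
import requests

sess = requests.Session()
sess.mount('http://', requests.adapters.HTTPAdapter(pool_connections=10,
                                                    pool_maxsize=10))

# 'queries' as above; size=10 means at most 10 requests run concurrently.
rs = [grequests.get(item[0], session=sess) for item in queries]
responses = grequests.map(rs, size=10)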
3) Is there a way to disable it?
Yes, if you want to be a doofus like me and comment it out in the library's source code. I couldn't find any other way to silence it at the time, and it came back to bite me.
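In hindsight there may be a less invasive option, though I haven't gone back to verify it: the message is a log record rather than a warnings-module warning (note the WARNING:requests.packages.urllib3.connectionpool: prefix), which would also explain why disable_warnings() had no effect. Raising that logger's level should hide it:

import logging

# The message comes from urllib3's connection pool logger (bundled inside
# requests here), so raising its threshold above WARNING silences it.
logging.getLogger('requests.packages.urllib3.connectionpool').setLevel(logging.ERROR)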
SOLUTION
The solution was a painless transition to using requests-futures instead. The following code behaves exactly as expected, gives no warnings and, thus far, scales to any number of queries that I throw at it.
from requests_futures.sessions import FuturesSession

# 'queries' is the list of URLs to fetch; max_workers caps the concurrency.
session = FuturesSession(max_workers=10)
fire_requests = [session.get(url) for url in queries]
responses = [item.result() for item in fire_requests]
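One caveat with the final list comprehension: .result() re-raises whatever exception the underlying request hit, so a single bad URL would abort the whole collection step. A defensive variant under the same assumptions about queries:

from requests_futures.sessions import FuturesSession

session = FuturesSession(max_workers=10)
fire_requests = [session.get(url) for url in queries]

responses = []
for future in fire_requests:
    try:
        responses.append(future.result())
    except Exception:  # e.g. requests.exceptions.ConnectionError
        responses.append(None)  # keep the slot so positions still line up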