I'm building a website using Pyramid, and I want to fetch some data from other websites. Because there may be 50+ calls to urlopen, I wanted to use gevent to speed things up.
Here's what I've got so far using gevent:
import urllib2
from gevent import monkey; monkey.patch_all()
from gevent import pool

# Shared greenlet pool used for all page loads
gpool = pool.Pool()

def load_page(url):
    response = urllib2.urlopen(url)
    html = response.read()
    response.close()
    return html

def load_pages(urls):
    # Fetch every URL concurrently; results come back in input order
    return gpool.map(load_page, urls)
Running pserve development.ini --reload gives:
NotImplementedError: gevent is only usable from a single thread
I've read that I need to monkey patch before anything else, but I'm not sure where the right place is for that. Also, is this a pserve-specific issue? Will I need to re-solve this problem when I move to mod_wsgi? Or is there a way to handle this use-case (just urlopen) without gevent? I've seen suggestions for requests but I couldn't find an example of fetching multiple pages in the docs.
I also tried eventlet from this SO question (almost directly copied from this eventlet example):
import eventlet
from eventlet.green import urllib2

def fetch(url):
    return urllib2.urlopen(url).read()

def fetch_multiple(urls):
    pool = eventlet.GreenPool()
    return pool.imap(fetch, urls)
However, when I call fetch_multiple, I'm getting TypeError: request() got an unexpected keyword argument 'return_response'
The TypeError from the previous update was likely caused by earlier attempts to monkeypatch with gevent and not properly restarting pserve. Once I restarted everything, it worked properly. Lesson learned.
There are multiple ways to do what you want:
- Create a dedicated gevent thread, and explicitly dispatch all of your URL-opening jobs to that thread, which will then do the gevented urlopen requests.
- Use a different async framework instead of gevent, one that doesn't work by magically greenletifying your code.
- Use a library that has its own async support, such as pycurl.
- Build your server around gevent too, or find some other framework that works for both your web-serving and your web-client needs.

You could simulate the last one without changing frameworks by loading gevent first and having it monkeypatch your threads, forcing your existing threaded server framework to become a gevent server. But this may not work, or may mostly work but occasionally fail, or may work but be much slower… Really, using a framework designed to be gevent-friendly (or at least greenlet-friendly) is a much better idea, if that's the way you want to go.
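To make the "dedicated thread" option more concrete, here is a minimal sketch of just the dispatch part: request threads hand a batch of URLs to one worker thread and block on the result. It is not a gevent implementation. The names FetchWorker and fetch_func are hypothetical, it uses only the standard library, and it assumes Python 3 (urllib.request in place of urllib2). As written the worker fetches serially; any speed-up would have to come from running greenlets inside the worker, which is left as a comment because whether that works depends on your gevent version.

```python
import queue
import threading
from concurrent.futures import Future
from urllib.request import urlopen


def fetch(url):
    """Blocking fetch of one page (stand-in for the gevented urlopen)."""
    with urlopen(url, timeout=10) as response:
        return response.read()


class FetchWorker:
    """Hypothetical helper: a single thread that owns all URL fetching."""

    def __init__(self, fetch_func=fetch):
        self._fetch = fetch_func
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Only this thread ever calls the fetch function. A gevent-based
        # variant would spawn greenlets here instead of looping serially
        # (assumption: the installed gevent allows use from this thread).
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel sent by close()
                break
            urls, future = job
            try:
                future.set_result([self._fetch(url) for url in urls])
            except Exception as exc:
                # Hand the failure back to the calling thread.
                future.set_exception(exc)

    def load_pages(self, urls):
        """Safe to call from any request thread; blocks until done."""
        future = Future()
        self._jobs.put((list(urls), future))
        return future.result()

    def close(self):
        """Stop the worker thread after pending jobs finish."""
        self._jobs.put(None)
        self._thread.join()
```

A view would create one FetchWorker at application startup and call worker.load_pages(urls) wherever it previously called load_pages(urls). The queue is what keeps all fetching code on one thread, no matter how many server threads submit jobs.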
You mentioned that others had recommended requests. The reason you can't find the documentation is that the built-in async code in requests was removed. See an older version for how it was used. It's now available as a separate library, grequests. However, it works by implicitly wrapping requests with gevent, so it will have exactly the same issues as doing so yourself.
(There are other reasons to use requests instead of urllib2, and if you want to gevent it, it's easier to use grequests than to do it yourself.)