Tags: python, http, concurrency, httprequest

Ideal method for sending multiple HTTP requests in Python?


Possible Duplicate:
Multiple (asynchronous) connections with urllib2 or other http library?

I am working on a Linux web server that runs Python code to grab real-time data over HTTP from a third-party API. The data is put into a MySQL database. I need to make a lot of queries to a lot of URLs, and I need to do it fast (faster = better). Currently I'm using urllib3 as my HTTP library. What is the best way to go about this? Should I spawn multiple threads (if so, how many?) and have each one query a different URL? I would love to hear your thoughts about this - thanks!


Solution

  • If "a lot" really is a lot, then you probably want asynchronous IO, not threads.

    requests + gevent = grequests

    GRequests allows you to use Requests with Gevent to make asynchronous HTTP Requests easily.

    import grequests
    
    urls = [
        'http://www.heroku.com',
        'http://tablib.org',
        'http://httpbin.org',
        'http://python-requests.org',
        'http://kennethreitz.com'
    ]
    
    # Build the (unsent) requests lazily, then send them all concurrently.
    rs = (grequests.get(u) for u in urls)
    responses = grequests.map(rs)
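If you would rather stay with urllib3 and threads, a thread pool is a reasonable alternative for up to a few hundred URLs. Here is a minimal sketch using the standard library's `concurrent.futures`; the worker count of 10 and the 5-second timeout are just starting points to tune for your workload, not recommendations from the original answer:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

import urllib3

urls = [
    'http://www.heroku.com',
    'http://httpbin.org',
    'http://python-requests.org',
]

# A single PoolManager is safe to share across threads;
# urllib3 keeps a connection pool per host, so repeated
# requests to the same host reuse connections.
http = urllib3.PoolManager()

def fetch(url):
    """Fetch one URL, returning (url, status_or_exception)."""
    try:
        r = http.request('GET', url, timeout=5.0, retries=2)
        return url, r.status
    except Exception as exc:  # a failed URL should not kill the pool
        return url, exc

results = []
with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(fetch, u) for u in urls]
    for future in as_completed(futures):  # yields in completion order
        results.append(future.result())

for url, outcome in results:
    print(url, outcome)
```

Threads work fine here because the workload is IO-bound (the GIL is released while waiting on sockets); the async approach above scales further when "a lot" means thousands of concurrent connections.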