python multithreading grequests

How can I control the number of Python threads when I send many requests in an endless loop?


Here is the situation. I need to send an AJAX request to a Django view function every second, and this view function uses grequests to send some asynchronous requests to a third-party API to get some data. The data is rendered to HTML after the view function returns. Here is the code:

  import grequests

  desc_ip_list = ['58.222.24.253', '58.222.17.38']
  # Build one unsent request per IP, then send them all concurrently.
  reqs = [grequests.get('%s%s' % ('http://int.dpool.sina.com.cn/iplookup/iplookup.php?format=json&ip=', desc_ip))
          for desc_ip in desc_ip_list]
  response = grequests.map(reqs)

When I start Django with runserver and send this AJAX request, the number of Python threads keeps increasing until the error "can't start new thread" occurs.

error: can't start new thread
<Greenlet at 0x110473b90: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x1103fd1d0>>(stream=False)> failed with error

How can I control the number of threads? I'm a Python beginner and don't know where to start. Thanks a lot.


Solution

  • Maybe your desc_ip_list is too long. With, say, a hundred IPs, you'd be spawning 100 requests at once, made by 100 threads!

    See the map() function in the grequests code.

    What you should do:

    You should probably set the size parameter in the map() call to a reasonable number, at most about (2*n+1), where n is the number of cores in your CPU. That makes sure you don't process all the IPs in desc_ip_list at the same time and spawn as many threads as there are IPs.
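As a rough sketch of that advice, assuming Python 3: the helper names suggested_pool_size and lookup_ips are made up for illustration and are not part of grequests. Only grequests.get and the size argument of grequests.map come from the library.

```python
import os


def suggested_pool_size(cores=None):
    """Hypothetical helper: the 2 * n + 1 rule of thumb for n CPU cores."""
    if cores is None:
        # os.cpu_count() can return None; fall back to 1 in that case.
        cores = os.cpu_count() or 1
    return 2 * cores + 1


def lookup_ips(ip_list, size=None):
    """Hypothetical helper: one GET per IP, at most `size` in flight."""
    # Imported here so suggested_pool_size() works without grequests installed.
    import grequests

    base_url = 'http://int.dpool.sina.com.cn/iplookup/iplookup.php?format=json&ip='
    reqs = [grequests.get(base_url + ip) for ip in ip_list]
    # size caps how many requests grequests runs concurrently.
    return grequests.map(reqs, size=size or suggested_pool_size())
```

In the view, response = lookup_ips(desc_ip_list) would take the place of the unbounded grequests.map(reqs) call from the question.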

    EDIT: More info, from a gevent doc page:

    The Pool class, which is a subclass of Group, provides a way to limit concurrency: its spawn method blocks if the number of greenlets in the pool has already reached the limit, until there is a free slot.

    Why am I mentioning this? Let's trace it back from grequests:

    In map(), at lines 113-114, we have:

    pool = Pool(size) if size else None
    jobs = [send(r, pool, stream=stream) for r in requests]
    

    And at line 85, in send(), we have:

    return gevent.spawn(r.send, stream=stream)
    

    This is the return statement that send() executes, because its pool parameter is None: you didn't specify size in map(), so no Pool was created. Now go back a few lines and reread what I quoted from the gevent docs.
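The effect of a size limit can be illustrated without gevent. The sketch below is only an analogy: it uses the standard library's ThreadPoolExecutor with OS threads rather than gevent's Pool with greenlets, and extra jobs wait in a queue instead of blocking the caller the way Pool.spawn does. The function run_bounded and the sleeping job are made up for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(n_tasks, size):
    """Run n_tasks dummy jobs with at most `size` running at once.

    Returns (results, peak), where peak is the highest number of jobs
    observed running at the same time.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def job(i):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for a network request
        with lock:
            running -= 1
        return i

    # max_workers plays the role of Pool(size): jobs beyond the limit
    # are queued until a worker is free.
    with ThreadPoolExecutor(max_workers=size) as executor:
        results = list(executor.map(job, range(n_tasks)))
    return results, peak


results, peak = run_bounded(n_tasks=20, size=3)
print(len(results), peak <= 3)
```

However many jobs are submitted, the observed peak never exceeds the limit, which is the guarantee that passing size to map() is meant to give.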