
Spawning more threads than you have in a gevent pool


As I understand it, the idea of a pool in gevent is to limit the number of concurrent requests (to a database, an API, or similar) at any one time.

Say I have code like this where I am spawning more greenlets than I have room for in the Pool:

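import gevent.pool

# do_something is the worker function, defined elsewhere.
pool = gevent.pool.Pool(50)
jobs = []
for number in range(300):  # range() works on both Python 2 and 3
    jobs.append(pool.spawn(do_something, number))

# Collect each greenlet's return value, in submission order.
total_result = [x.get() for x in jobs]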

What is the actual behavior when trying to spawn the 51st request? When is the 51st request handled?


Solution

  • The Pool class uses a semaphore to count active greenlets; the constructor initializes it with the pool size:

    class Pool(Group):
    
        def __init__(self, size=None, greenlet_class=None):
            if size is not None and size < 1:
                raise ValueError('Invalid size for pool (positive integer or None required): %r' % (size, ))
            Group.__init__(self)
            self.size = size
            if greenlet_class is not None:
                self.greenlet_class = greenlet_class
            if size is None:
                self._semaphore = DummySemaphore()
            else:
                self._semaphore = Semaphore(size)
    

    Every time spawn() is called, it tries to acquire the semaphore:

        def spawn(self, *args, **kwargs):
            self._semaphore.acquire()
            try:
                greenlet = self.greenlet_class.spawn(*args, **kwargs)
                self.add(greenlet)
            except:
                self._semaphore.release()
                raise
            return greenlet
    

    If the pool is full, the calling greenlet blocks in the _semaphore.acquire() call. The semaphore is released whenever one of the pooled greenlets finishes executing:

        def discard(self, greenlet):
            Group.discard(self, greenlet)
            self._semaphore.release()
    

    So in your case, the loop itself pauses inside the 51st call to pool.spawn(), and I'd expect the 51st request to be handled (or started, to be precise) as soon as any of the first 50 requests is done.
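
To make the blocking behaviour concrete, here is a minimal sketch that models the same acquire-on-spawn, release-on-finish logic using standard-library threads rather than greenlets. It is not gevent's implementation: TinyPool, demo and work are hypothetical names invented for illustration, and the sleep only simulates a slow request. The point is that spawn() does not return while the pool is full, so the number of workers running at once should never exceed the pool size.

```python
import threading
import time


class TinyPool:
    """Hypothetical stand-in for gevent.pool.Pool, built on threads."""

    def __init__(self, size):
        # One semaphore slot per allowed concurrent worker.
        self._semaphore = threading.Semaphore(size)
        self._threads = []

    def spawn(self, func, *args):
        # Blocks the caller while `size` workers are already running.
        self._semaphore.acquire()

        def run():
            try:
                func(*args)
            finally:
                # Plays the role of Pool.discard(): a finished worker
                # frees one slot so a blocked spawn() can continue.
                self._semaphore.release()

        thread = threading.Thread(target=run)
        self._threads.append(thread)
        thread.start()
        return thread

    def join(self):
        for thread in self._threads:
            thread.join()


def demo(pool_size=2, jobs=5):
    lock = threading.Lock()
    running = 0   # workers currently inside work()
    peak = 0      # highest value `running` ever reached
    done = []     # job numbers that completed

    def work(number):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # simulated slow request
        with lock:
            running -= 1
            done.append(number)

    pool = TinyPool(pool_size)
    for number in range(jobs):
        pool.spawn(work, number)  # blocks once the pool is full
    pool.join()
    return peak, done


if __name__ == "__main__":
    peak, done = demo()
    print("peak concurrency:", peak, "completed jobs:", sorted(done))
```

With a real gevent Pool the waiting happens cooperatively inside the greenlet that calls spawn(), so other greenlets keep running while it waits; the thread version above only mirrors the counting logic.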