python multithreading gevent greenlets

Python gevent pool.join() waiting forever


I have a function like this:

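from gevent.pool import Pool

def check_urls(res):
    pool = Pool(10)
    print(pool.free_count())  # number of free slots in the pool
    for row in res:
        pool.spawn(fetch, row[0], row[1])  # fetch is defined elsewhere
    pool.join()  # blocks until every spawned greenlet has finished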

pool.free_count() prints 10.

I traced the program with pdb. It works fine through the pool.spawn() loop.

But it waits forever at the pool.join() line.

Can someone tell me what's wrong?


Solution

  • But it waits forever at the pool.join() line.
    Can someone tell me what's wrong?

    Nothing!

    Although I first wrote what is below the line, join() in gevent still behaves much as it does in multiprocessing/threading: it blocks until all the greenlets are done.

    If you only want to test whether all the greenlets in the pool have finished, check ready() on each greenlet in the pool:

    is_over = all(gl.ready() for gl in pool.greenlets)
    

    Basically, .join() is not waiting forever; it is waiting until your greenlets are done. If one of them never ends, join() blocks forever. So make sure every greenlet terminates, and join() will return once all the jobs are done.
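
One way to guarantee that every greenlet terminates is to bound the work inside each one with gevent.Timeout. The sketch below is illustrative only: this fetch() is a hypothetical stand-in that sleeps instead of doing real I/O, and the 0.2 second limit and the sample rows are assumptions, not values from the question.

```python
import gevent
from gevent.pool import Pool

def fetch(url, delay):
    # Hypothetical stand-in for the real fetch(): sleeps instead of doing I/O.
    # Timeout(seconds, False) silently leaves the with-block when it expires.
    with gevent.Timeout(0.2, False):
        gevent.sleep(delay)
        return "done: " + url
    return "timed out: " + url

def check_urls(res):
    pool = Pool(10)
    jobs = [pool.spawn(fetch, row[0], row[1]) for row in res]
    pool.join(timeout=2)  # extra safety net: give up waiting after 2 seconds
    all_done = all(job.ready() for job in jobs)
    return all_done, [job.value for job in jobs]

# The second row would block for 5 seconds without the per-greenlet timeout.
all_done, results = check_urls([("a", 0.01), ("b", 5)])
```

With the timeout inside fetch(), the slow job is cut short and pool.join() returns instead of hanging.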


    edit: The following applies only to the standard multiprocessing and threading APIs. gevent's greenlet pools do not match that "standard" API.

    The purpose of the join() method on a Thread/Process is to make the main process/thread block until the child process/thread is over.

    You can use the timeout parameter to make it return after some time, or you can use the is_alive() method to check whether it is still running without blocking.
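
As a rough illustration of both options with the standard threading module, the sketch below joins with a timeout and then polls is_alive(). The worker function and the stop_event flag are hypothetical names introduced only for this example.

```python
import threading
import time

def worker(stop_event):
    # Hypothetical long-running job: loops until asked to stop.
    while not stop_event.is_set():
        time.sleep(0.01)

stop_event = threading.Event()
t = threading.Thread(target=worker, args=(stop_event,))
t.start()

t.join(timeout=0.1)            # gives up waiting after 0.1 s; the thread keeps running
still_running = t.is_alive()   # non-blocking check

stop_event.set()               # let the worker finish
t.join()                       # now returns promptly
finished = not t.is_alive()
```

Note that join(timeout=...) returns whether or not the thread has finished, so is_alive() is the way to tell the two cases apart.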

    In the context of a process/thread pool, join() must also be preceded by a call to either close() or terminate(), so you may want to:

    # with a multiprocessing.Pool, which has apply_async() rather than spawn()
    for row in res:
        pool.apply_async(fetch, (row[0], row[1]))
    pool.close()
    pool.join()
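
The close()-before-join() rule can be seen in a small self-contained sketch. It uses multiprocessing.pool.ThreadPool, which shares the multiprocessing.Pool interface, as an assumption made to keep the example easy to run. The fetch() body and the example.com rows are hypothetical placeholders.

```python
from multiprocessing.pool import ThreadPool

def fetch(url, retries):
    # Hypothetical stand-in for the real fetch().
    return "%s:%d" % (url, retries)

res = [("http://example.com/a", 1), ("http://example.com/b", 2)]

pool = ThreadPool(4)

# Calling join() on a pool that is still accepting work is an error.
early_join_failed = False
try:
    pool.join()
except (ValueError, AssertionError):  # exception type depends on the Python version
    early_join_failed = True

jobs = [pool.apply_async(fetch, (row[0], row[1])) for row in res]
pool.close()   # no more tasks will be submitted
pool.join()    # wait for the submitted tasks to finish
results = [job.get() for job in jobs]
```

None of this applies to gevent.pool.Pool, where join() can be called directly after the spawn() loop.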