
Python: Multiprocessing Queue.put not working for semi-large data


I am testing out the Queue class from the multiprocessing module, and I fail to see why this simple piece of code does not terminate for a not-especially-large data set:

Code:

from multiprocessing import Process,Queue

if __name__ == "__main__":

    tobeQueue = Queue()

    for i in range(1,10000):
        tobeQueue.put(i)

This code, which should simply terminate, works when the range goes up to about 10^3 items, but hangs for larger ranges.


Solution

  • Ah, I know now what the problem is.

    from Queue import Queue
    

    and

    from multiprocessing import Queue 
    

    are not the same queue. The multiprocessing (mp) queue has some special code in it that allows it to pass values back and forth between processes. This is a consequence of the Python GIL and its threading handicap: since threads cannot give you true parallelism in CPython, multiprocessing uses separate processes and must serialize data between them.
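For contrast, here is a small sketch showing that the thread-oriented queue.Queue has no such problem: it is an ordinary in-process data structure, so filling it never blocks interpreter shutdown.

```python
from queue import Queue  # the thread queue, not multiprocessing.Queue

q = Queue()
for i in range(1, 10000):
    q.put(i)

# All 9999 items sit in this process's own memory; there is no feeder
# thread and no pipe, so the script exits cleanly with the queue full.
print(q.qsize())  # -> 9999
```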

    What is happening is that the queue will not allow the process it lives in to die until its buffer has been flushed. Pay special attention to the "Joining processes that use queues" warning in the multiprocessing programming guidelines. Your loop finishes normally, but the queue is not in shared memory the way it would be with threads: mp.Queue pickles each item, and a background feeder thread writes the pickled bytes into an OS pipe for the receiving process to read. When your process exits, it joins that feeder thread; since nothing is reading the other end of the pipe, the pipe's buffer fills up (typically a few tens of kilobytes), the feeder blocks forever, and the process hangs. That is why a few thousand small integers are enough to trigger this while a few hundred are not. For the same reason, killing one of the processes abnormally may result in deadlock.
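If you do not actually need the queued data to survive and just want the process to exit, multiprocessing.Queue also provides cancel_join_thread(), which tells the process not to wait for the feeder thread at exit; any items still sitting in the internal buffer may be lost. A minimal sketch of that escape hatch:

```python
from multiprocessing import Queue

if __name__ == "__main__":
    tobeQueue = Queue()
    for i in range(1, 10000):
        tobeQueue.put(i)

    # Don't join the feeder thread at interpreter exit; items still
    # buffered and not yet flushed to the pipe are silently discarded.
    tobeQueue.cancel_join_thread()
```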

    So you need to unload the queue completely with queue.get(), and your process will terminate as expected.

    This code will terminate as you would expect:

    from multiprocessing import Process,Queue
    
    if __name__ == "__main__":
    
        tobeQueue = Queue()
    
        for i in range(1,10000):
            tobeQueue.put(i)
    
        for i in range(1,10000):
            tobeQueue.get()  # remove all 9999 items, allowing the process to die
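In real code you usually would not put and then get in the same process; the typical pattern is a consumer process that drains the queue while the producer fills it, which keeps the feeder thread's buffer flowing and lets everything shut down cleanly. A sketch of that pattern (the consumer function and the None sentinel are illustrative, not from the original post):

```python
from multiprocessing import Process, Queue

def consumer(q):
    # Drain items until the producer sends the None sentinel.
    total = 0
    while True:
        item = q.get()
        if item is None:
            break
        total += item
    print(total)  # sum of 1..9999 = 49995000

if __name__ == "__main__":
    q = Queue()
    worker = Process(target=consumer, args=(q,))
    worker.start()

    for i in range(1, 10000):
        q.put(i)
    q.put(None)   # sentinel: tell the consumer to stop

    # Safe to join: the consumer empties the queue as we fill it,
    # so the feeder thread can always flush its buffer.
    worker.join()
```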