
python multiprocessing, manager initiates process spawn loop


I have a simple Python multiprocessing script that sets up a pool of workers which append their output to a Manager list. The script has three call levels: main calls f1, which spawns several worker processes that each call g1. When I debug the script (incidentally on Windows 7, 64-bit, with VS 2010/PyTools), it runs into a nested process-creation loop, spawning an endless number of processes. Can anyone determine why? I'm sure I am missing something very simple. Here's the problematic code:

import multiprocessing
import logging

manager = multiprocessing.Manager()
results = manager.list()

def g1(x):
    y = x*x
    print "processing: y = %s" % y
    results.append(y)

def f1():
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(multiprocessing.SUBDEBUG)

    pool = multiprocessing.Pool(processes=4)
    for i in range(15):
        pool.apply_async(g1, [i])
    pool.close()
    pool.join()

def main():
    f1()

if __name__ == "__main__":
    main()

PS: I tried adding multiprocessing.freeze_support() to main(), to no avail.


Solution

  • Basically, what sr2222 mentions in his comment is correct. The multiprocessing manager docs say that the `__main__` module must be importable by the children, and that each manager object "corresponds to a spawned child process". So each child ends up re-importing your module (you can see this by adding a print statement at module scope to the fixed version below). Because your module creates a Manager at import time, every re-import spawns another manager child, which re-imports the module again...which leads to infinite recursion.
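    To make the re-import visible, here is a minimal standalone sketch (Python 3 syntax; the module and function names are illustrative, not from the question). On Windows, every child process re-runs the module-level `print`, while the guarded block runs only in the parent:

    ```python
    import multiprocessing
    import os

    # This line runs at import time. On Windows (spawn start method), every
    # child process re-imports the main module, so it prints again per child.
    print("importing module in pid %s" % os.getpid())

    def square(x):
        return x * x

    if __name__ == "__main__":
        # Only the parent reaches this guarded block, so the Pool (and any
        # Manager you might create here) is set up exactly once -- this guard
        # is what prevents the spawn loop.
        pool = multiprocessing.Pool(processes=2)
        print(pool.map(square, range(4)))  # prints [0, 1, 4, 9]
        pool.close()
        pool.join()
    ```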

    One solution would be to move your manager code into f1():

    import multiprocessing
    import logging
    
    def g1(results, x):
        y = x*x
        print "processing: y = %s" % y
        results.append(y)
    
    def f1():
        logger = multiprocessing.log_to_stderr()
        logger.setLevel(multiprocessing.SUBDEBUG)
        manager = multiprocessing.Manager()
        results = manager.list()
        pool = multiprocessing.Pool(processes=4)
        for i in range(15):
            pool.apply_async(g1, [results, i])
        pool.close()
        pool.join()
    
    
    def main():
        f1()
    
    if __name__ == "__main__":
        main()
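
    As a side note (an alternative sketch, not part of the fix above): if the workers only need to hand results back, a plain `pool.map` collects their return values in order without any Manager at all. In Python 3 syntax:

    ```python
    import multiprocessing

    def g1(x):
        # Return the result instead of appending to a shared Manager list.
        return x * x

    def f1():
        with multiprocessing.Pool(processes=4) as pool:
            # map blocks until all workers finish and returns results in input order.
            return pool.map(g1, range(15))

    if __name__ == "__main__":
        print(f1())
    ```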