If I call apply_async 10,000 times, assuming the OOM-killer doesn't interfere, will multiprocessing start them all simultaneously, or will it start them in batches? For example, every 100 starts, waiting for 90 to finish starting before starting any more?
`apply_async()` is a method of `multiprocessing.Pool` objects, and delivers all work to the fixed number of worker processes you specified when you created the `Pool`. Only that many tasks can run simultaneously. The rest are saved in queues (or pipes) by the multiprocessing machinery, and automatically doled out to workers as they complete tasks already assigned. Much the same is true of all the `Pool` methods to which you feed multiple work items.
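To make "only that many at once" concrete, here's a minimal sketch (Python 3; the function name `whoami` is just illustrative) showing that a thousand tasks all get funneled through the same four worker processes:

```python
import multiprocessing as mp
import os

def whoami(i):
    # Each task reports the PID of the worker process that ran it.
    return os.getpid()

if __name__ == "__main__":
    with mp.Pool(4) as pool:
        pids = pool.map(whoami, range(1000))
    # 1000 tasks, but only 4 distinct worker processes ever touched them.
    print(len(set(pids)), "distinct workers handled", len(pids), "tasks")
```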
A little more clarification: `apply_async()` doesn't create, or start, any processes. The processes were created when you called `Pool()`. They just sit there and wait until you invoke `Pool` methods (like `apply_async()`) that ask for some real work to be done.
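You can check that yourself with `multiprocessing.active_children()`. A minimal sketch, assuming nothing else in the program has spawned child processes:

```python
import multiprocessing as mp

if __name__ == "__main__":
    pool = mp.Pool(4)
    # No work has been submitted yet, but the worker processes already exist.
    print(len(mp.active_children()), "worker processes alive")  # prints 4
    pool.close()
    pool.join()
```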
Play with this:
```python
MAX = 100000

from time import sleep

def f(i):
    sleep(0.01)  # simulate a little real work per task
    return i

def summer(summand):
    # Callbacks run in the main process, on the Pool's result-handler thread.
    global SUM, FINISHED
    SUM += summand
    FINISHED += 1

if __name__ == "__main__":
    import multiprocessing as mp
    SUM = 0
    FINISHED = 0
    pool = mp.Pool(4)
    print("queuing", MAX, "work descriptions")
    for i in range(MAX):
        pool.apply_async(f, args=(i,), callback=summer)
        if i % 1000 == 0:
            print("{}/{}".format(FINISHED, i), end=" ")
    print()
    print("closing pool")
    pool.close()
    print("waiting for processes to end")
    pool.join()
    print("verifying result")
    print("got", SUM, "expected", sum(range(MAX)))
```
Output is like:
```
queuing 100000 work descriptions
0/0 12/1000 21/2000 33/3000 42/4000
... stuff chopped for brevity ...
1433/95000 1445/96000 1456/97000 1466/98000 1478/99000
closing pool
waiting for processes to end
... and it waits here "for a long time" ...
verifying result
got 4999950000 expected 4999950000
```
You can answer most of your questions just by observing its behavior. The work items are queued up quickly. By the time we see "closing pool", all the work items have been queued, but 1478 have already completed, and about 98000 are still waiting for some process to work on them.
If you take the `sleep(0.01)` out of `f()`, it's much less revealing, because results come back almost as fast as work items are queued.
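If you'd rather watch the backlog without a callback, each `apply_async()` call returns an `AsyncResult`, and polling `ready()` on the saved handles counts finished versus outstanding work directly. A sketch along those lines:

```python
import multiprocessing as mp
from time import sleep

def f(i):
    sleep(0.01)
    return i

if __name__ == "__main__":
    pool = mp.Pool(4)
    # Keep every AsyncResult handle instead of using a callback.
    handles = [pool.apply_async(f, args=(i,)) for i in range(10000)]
    done = sum(h.ready() for h in handles)
    print(done, "finished;", len(handles) - done, "still queued or running")
    pool.close()
    pool.join()
```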
Memory use remains trivial no matter how you run it, though. The work items here (the name of the function (`"f"`) and its pickled integer argument) are tiny.
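If you want a rough feel for how tiny, measure the pickles yourself. This ignores the Pool's per-task bookkeeping, so treat it as a lower bound rather than an exact accounting:

```python
import pickle

def f(i):
    return i

# A top-level function pickles by reference (module plus qualified name),
# so the payload is a few dozen bytes; the integer argument adds little.
print(len(pickle.dumps(f)), "bytes for the function reference")
print(len(pickle.dumps((f, (99999,)))), "bytes for function plus argument")
```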