I have a problem using multiprocessing in Python 3. I use multiprocessing to read multiple files and create an instance from each file's data. What I want is to do something with each instance right after it is created, without waiting for the others. But it does not work as I expected.
The following code actually waits until all instances have been created, and only then is the do_something code executed. That is not what I expected.
import multiprocessing
import multiprocessing.pool
import os
from itertools import repeat
pool = multiprocessing.Pool(processes=3)
tpool = multiprocessing.pool.ThreadPool(processes=1)
events = list(map(lambda x: os.path.split(x)[1].split('.')[0], events))
async_results = pool.starmap_async(Writer, zip(events, repeat(self._settings['TANKDIR']), repeat(self._settings['CATDIR']), repeat(self._settings['DATADIR'])))
# Writer is my class; calling it runs its __init__.
# The earliest created instance takes about 4.5 sec.
tpool.map(do_something, async_results.get())
# I thought the tpool.map part would start do_something right after one instance was created.
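async_results.get() blocks until every Writer instance has been created and then returns them all as one list. The call
tpool.map(do_something, async_results.get())
therefore cannot hand anything to the do_something function until the slowest Writer is finished, which is the behaviour you are seeing.
The documentation for map says:
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
So tpool.map only distributes work it already has in hand; it does not pull results out of the first pool as they become available. Wrapping the result object itself, as in list(async_results), does not help either, because the object returned by starmap_async is not iterable.
Without knowing more about your use case, the solution may be as simple as iterating over the results as they arrive, with pool.imap (input order) or pool.imap_unordered (completion order), and calling do_something on each item inside that loop. Both take a one-argument function, so the zipped tuples have to be unpacked in a small wrapper instead of by starmap.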
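To make the "handle each instance as soon as it exists" idea concrete, here is a minimal sketch. It rests on several assumptions: make_writer is a hypothetical stand-in for your Writer class, do_something is a placeholder, handle_as_completed is a name made up for this example, and the directory strings are dummy values. A ThreadPool is used only so the snippet runs anywhere; multiprocessing.Pool exposes the same imap_unordered method, provided the function and its results can be pickled.

```python
import time
from multiprocessing.pool import ThreadPool


def make_writer(args):
    """Hypothetical stand-in for Writer(event, tankdir, catdir, datadir)."""
    event, tankdir, catdir, datadir = args  # unpack the zipped tuple by hand
    time.sleep(0.01 * len(event))           # pretend that reading files takes time
    return {"event": event, "dirs": (tankdir, catdir, datadir)}


def do_something(writer):
    """Hypothetical follow-up step; here it only extracts the event name."""
    return writer["event"]


def handle_as_completed(pool, jobs):
    """Call do_something on each result as soon as the pool yields it."""
    handled = []
    # imap_unordered yields results one at a time in completion order,
    # instead of waiting for the whole batch like starmap_async(...).get().
    for writer in pool.imap_unordered(make_writer, jobs):
        handled.append(do_something(writer))
    return handled


if __name__ == "__main__":
    events = ["ev_long_name", "ev2", "event3"]
    jobs = [(e, "tank", "cat", "data") for e in events]
    with ThreadPool(processes=3) as pool:
        print(handle_as_completed(pool, jobs))
```

In this sketch do_something runs in the calling thread. If it is slow as well, each item could instead be handed to a second pool with apply_async inside the loop, so that creating instances and processing them overlap.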