python multiprocessing threadpool

How can I do something right after each instance is created in Python 3?


I have a problem using multiprocessing in Python 3. I use multiprocessing to read multiple files and create instances from the data that was read. What I want is to do something with each instance right after it is created, but it does not work as I expected.

The following code actually waits until all instances have been created, and only then is the do_something code executed. That is not what I expected.

import multiprocessing
import multiprocessing.pool
import os
from itertools import repeat

pool = multiprocessing.Pool(processes=3)
tpool = multiprocessing.pool.ThreadPool(processes=1)
# Reduce each path to its base file name without the extension.
events = list(map(lambda x: os.path.split(x)[1].split('.')[0], events))
async_results = pool.starmap_async(
    Writer,
    zip(events,
        repeat(self._settings['TANKDIR']),
        repeat(self._settings['CATDIR']),
        repeat(self._settings['DATADIR'])))
# Writer is my class; each task runs its __init__.
# The earliest created instance takes about 4.5 sec.
tpool.map(do_something, async_results.get())
# I thought the tpool.map call would start do_something right after one instance was created.

Solution

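  • tpool.map(do_something, async_results.get()) does not start until every Writer instance exists. async_results.get() blocks until all of the starmap_async tasks have finished, and only then returns the complete list of results.

    multiprocessing.pool.Pool.map()

    This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.

    So what you're doing with the tpool.map call is handing it a fully built list: do_something is called on each Writer, but only after the slowest Writer has been created. Chunking only affects how that list is split between workers, not when the list becomes available.

    Without knowing more about your use case, the solution may be as simple as replacing starmap_async with pool.imap_unordered, which yields each result as soon as it is ready, and calling do_something while iterating over it. imap_unordered passes a single argument to the function, so Writer needs a small wrapper that unpacks the argument tuple. Note that list(async_results) would not work, because the object returned by starmap_async is not iterable.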
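
A minimal sketch of handling each instance as soon as it is ready, using Pool.imap_unordered instead of collecting everything with get(). The Writer class, do_something body, build_writer wrapper, and process_as_ready helper below are hypothetical stand-ins, since the real ones are not shown in the question; the pool_factory parameter is also an assumption, added only so the same loop can be tried with a ThreadPool.

```python
import multiprocessing
from multiprocessing.pool import ThreadPool  # optional pool_factory for a quick trial


class Writer:
    """Hypothetical stand-in for the Writer class from the question."""

    def __init__(self, event, tankdir, catdir, datadir):
        self.event = event
        self.dirs = (tankdir, catdir, datadir)


def build_writer(args):
    # imap_unordered passes exactly one argument per task,
    # so the argument tuple is unpacked here.
    return Writer(*args)


def do_something(writer):
    # Placeholder for the per-instance work from the question.
    return writer.event.upper()


def process_as_ready(arg_tuples, processes=3, pool_factory=multiprocessing.Pool):
    """Call do_something on each Writer as soon as a worker has built it."""
    outputs = []
    with pool_factory(processes) as pool:
        # Results are yielded in completion order, not submission order.
        for writer in pool.imap_unordered(build_writer, arg_tuples):
            outputs.append(do_something(writer))
    return outputs
```

With a process pool, call process_as_ready from under an if __name__ == "__main__": guard, and keep in mind that each Writer has to be picklable to travel back from the worker. If do_something is itself slow, it could be submitted to a second pool with apply_async inside the loop rather than run inline.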