I need a number of worker processes in Python, each initialized with an object. The main process will send commands and arguments to the workers, which should then execute methods on their objects and return the results in parallel.
I have tried multiprocessing.Pool, but when calling pool.map it seems random which process executes which argument, even when the pool is created with N processes and chunksize is set to 1.
import multiprocessing

def init(a):
    global myA
    myA = a

def get_value(_):
    global myA
    return myA.value

class A():
    def __init__(self, value):
        self.value = value

if __name__ == '__main__':
    N = 4
    a_lst = [A(i) for i in range(N)]
    pool = multiprocessing.Pool(N)
    pool.map(init, a_lst, chunksize=1)
    print(pool.map(get_value, range(N), chunksize=1))
Output:
[3, 1, 3, 1]
Can I do this with multiprocessing.Pool, and if not, how can I do it?
A solution that does not depend on either the pool size or the chunksize value, i.e. where you do not care which pool process is assigned to the tasks submitted with the multiprocessing.pool.Pool.map method, is to initialize each pool process with the entire a_lst list by using the initializer and initargs arguments of the Pool constructor. In the following code I have purposely made the size of a_lst and the pool size different, and I am letting the map method compute a default chunksize value:
import multiprocessing

def init_pool(*args):
    # Runs once in every pool process: store the full list of objects as a global.
    global a_lst
    a_lst = args[0]

def get_value(i):
    # Every worker holds the whole list, so it does not matter which worker gets which index.
    return a_lst[i].value

class A():
    def __init__(self, value):
        self.value = value

if __name__ == '__main__':
    N = 10          # list is length 10
    POOL_SIZE = 4   # pool size is 4
    a_lst = [A(i) for i in range(N)]
    pool = multiprocessing.Pool(POOL_SIZE, initializer=init_pool, initargs=(a_lst,))
    print(pool.map(get_value, range(N)))
Prints:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
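If the end goal is to send commands plus arguments to the workers and have them call methods on "their" objects, the same idea extends naturally: pass each worker a tuple of (index, method name, arguments) and let it look the object up in the shared list and call the method. The sketch below is one possible way to do this, not the only one; the dispatch helper and the scaled method on A are hypothetical names I introduce purely for illustration.

import multiprocessing

def init_pool(the_lst):
    # Give every pool process its own copy of the full list of objects.
    global a_lst
    a_lst = the_lst

def dispatch(command):
    # command is (index, method_name, args): find the object and call the method.
    i, method_name, args = command
    return getattr(a_lst[i], method_name)(*args)

class A():
    def __init__(self, value):
        self.value = value

    def scaled(self, factor):
        # Hypothetical example method used to demonstrate the dispatch.
        return self.value * factor

if __name__ == '__main__':
    N = 10
    POOL_SIZE = 4
    a_lst = [A(i) for i in range(N)]
    pool = multiprocessing.Pool(POOL_SIZE, initializer=init_pool, initargs=(a_lst,))
    commands = [(i, 'scaled', (10,)) for i in range(N)]
    print(pool.map(dispatch, commands))  # [0, 10, 20, ..., 90]

Because every worker is initialized with the whole list, it again does not matter which pool process ends up executing which command; the results come back in the order of the submitted commands.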