I am trying to run multiprocessing inside a for loop, but pool.map seems to use the last value assigned in the loop for every iteration.
from multiprocessing import Pool
import random

args = [1, 2]
repeat = 3
output = [0] * len(args)

def myfunc(_):
    b = 2
    return a + 1, b + 2

count = 0
for arg in args:
    a = arg
    if __name__ == '__main__':
        with Pool(2) as p:
            out1, out2 = zip(*p.map(myfunc, range(0, repeat), chunksize=1))
        temp = [out1, out2]
        output[count] = temp
        count += 1
Output:
[[(3, 3, 3), (4, 4, 4)], [(3, 3, 3), (4, 4, 4)]]
which suggests that myfunc is using a = 2 for all the iterations in the loop.
Intended output:
[[(2, 2, 2), (4, 4, 4)], [(3, 3, 3), (4, 4, 4)]]
Note: In reality, myfunc is a time-consuming simulation with random output (hence I need to repeat the function multiple times even with the same argument), and I have to initialize a list to store the results.
How can I achieve the intended output?
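All variables should be inside the if __name__ == '__main__': block.
The only things outside of it should be the function definition and the imports. The value the function needs should be passed to it as an argument rather than read from a global.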
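When a subprocess is spawned, it re-imports the main module, so every module-level statement outside the guard runs again in the worker. In your code the for loop therefore runs to completion in each worker and leaves a = 2 before myfunc is ever called. This is most noticeable on Windows, where spawn is the only available start method, but it can also happen on Linux when a start method other than fork is used.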
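The re-import can be imitated without starting any processes. The sketch below is only a simulation, not what multiprocessing actually executes: it runs a trimmed copy of the question's module-level code with __name__ set to '__mp_main__' (the name CPython gives the main module when a spawned worker imports it). QUESTION_TOP_LEVEL and simulate_worker_import are hypothetical names chosen for this illustration.

```python
# Trimmed copy of the question's module-level code. The guarded Pool section
# is replaced by `pass`, because a worker never enters that branch.
QUESTION_TOP_LEVEL = '''
args = [1, 2]
for arg in args:
    a = arg
    if __name__ == '__main__':
        pass  # the Pool is created here, in the parent process only
'''


def simulate_worker_import(source):
    # Assumption: a spawned worker runs the main module's top-level code
    # under the name '__mp_main__', so the guard evaluates to False.
    namespace = {'__name__': '__mp_main__'}
    exec(source, namespace)
    return namespace


worker_globals = simulate_worker_import(QUESTION_TOP_LEVEL)
print(worker_globals['a'])  # prints 2: the loop ran to its last value
```

Because the loop finishes before the worker calls myfunc, every call sees the final value of a, which matches the output in the question.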
from multiprocessing import Pool

def myfunc(a):
    # a is now a parameter, so the worker does not depend on a global.
    b = 2
    return a + 1, b + 2

if __name__ == '__main__':
    args = [1, 2]
    repeat = 3
    output = [0] * len(args)
    count = 0
    for arg in args:
        a = arg
        with Pool(2) as p:
            # Send the same argument repeat times, once per repetition.
            out1, out2 = zip(*p.map(myfunc, [a] * repeat, chunksize=1))
        temp = [out1, out2]
        output[count] = temp
        count += 1
    print(output)
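Since the real myfunc is described as time-consuming, it may also be worth creating the pool once and reusing it for every argument instead of starting a new pool per iteration. This is a minimal sketch of that variation, not part of the original answer; run_repeats is a hypothetical helper name, and myfunc is the same stand-in used above.

```python
from multiprocessing import Pool


def myfunc(a):
    # Stand-in for the time-consuming simulation; `a` arrives as an argument,
    # so the worker never depends on a module-level global.
    b = 2
    return a + 1, b + 2


def run_repeats(pool, arg, repeat):
    # Hypothetical helper: call myfunc `repeat` times with the same argument
    # and regroup the (first, second) result pairs into two tuples.
    out1, out2 = zip(*pool.map(myfunc, [arg] * repeat, chunksize=1))
    return [out1, out2]


if __name__ == '__main__':
    args = [1, 2]
    repeat = 3
    # One pool is created up front and shared by every iteration.
    with Pool(2) as p:
        output = [run_repeats(p, arg, repeat) for arg in args]
    print(output)
```

With the stand-in myfunc this prints the intended output, [[(2, 2, 2), (4, 4, 4)], [(3, 3, 3), (4, 4, 4)]]. The list comprehension also removes the need to preallocate output and maintain count by hand.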