
Python Multiprocessing pool with multiple arguments and void function


I'm trying to use Python's multiprocessing library to call a function that takes multiple arguments and returns nothing. Here is my minimal working example.

import numpy as np
from multiprocessing import Pool

dim1 = 2
dim2 = 2

test1 = np.zeros((dim1,dim2))
test2 = np.zeros((dim1,dim2))

iteration = []
for i in range(0,dim1):
    for j in range(0,dim2):
        iteration.append((i,j))
        
def testing(num1,num2):
    test1[num1,num2] = 1
    test2[num1,num2] = 2
    
if __name__ == '__main__':
    pool = Pool(processes=4)  
    pool.starmap(testing, iteration)
    
print(test1)
print(test2)

The problem is that test1 and test2 still print as the zero arrays they were initialized to. What I want instead is an array of 1s for test1 and an array of 2s for test2. What I would like the code

if __name__ == '__main__':
    pool = Pool(processes=4)  
    pool.starmap(testing, iteration)

to do is this:

testing(0,0)
testing(1,0)
testing(0,1)
testing(1,1)

I've seen some related posts like this. The difference between this post and mine is that my function is a void function, and rather than returning the variables, I'd like the function to just change the values of the variables.


Solution

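  • Each worker process gets its own copy of the module-level arrays, so writes made inside testing never reach the parent's test1 and test2. To update an array across multiple processes through a global, without returning results: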

    • Use the multiprocessing.Array class to store the array data.
    • Use the initializer parameter when creating the pool to pass the arrays to the processes.

    Note that an Array is one-dimensional, so it must be wrapped in a NumPy view and reshaped to 2D before it is updated or displayed.
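As a minimal single-process sketch of that reshaping step: np.frombuffer builds a view over the Array's memory rather than a copy, so a write through the 2D view shows up in the flat buffer and vice versa. The helper name as_matrix and the 2x3 shape are only illustrative and do not come from the question.

```python
import numpy as np
from multiprocessing import Array

rows, cols = 2, 3
shared = Array('d', rows * cols)   # flat, zero-initialized buffer of C doubles

def as_matrix(arr, shape):
    """Return a NumPy view (not a copy) over the shared buffer (hypothetical helper)."""
    # 'd' is a C double, which matches NumPy's float64
    return np.frombuffer(arr.get_obj(), dtype=np.float64).reshape(shape)

view = as_matrix(shared, (rows, cols))
view[1, 2] = 7.0                   # write through the 2D view

# The flat buffer sees the write at index row * cols + col
print(shared[1 * cols + 2])
```

Because the view shares memory with the Array, nothing needs to be copied back after the update.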

    Try this code:

    import numpy as np
    from multiprocessing import Pool, Array
    
    dim1 = 2
    dim2 = 2
    
    def init(tt1,tt2):  # receive shared arrays
        global test1,test2
        test1,test2 = tt1,tt2
    
    def testing(num1,num2):
        t1 = np.frombuffer(test1.get_obj()).reshape((dim1, dim2))  # need to reshape to 2D array
        t2 = np.frombuffer(test2.get_obj()).reshape((dim1, dim2))
        t1[num1,num2] = 1
        t2[num1,num2] = 2
       
    if __name__ == '__main__':
        tt1 = Array('d', dim1*dim2)  # 1 dimensional arrays
        tt2 = Array('d', dim1*dim2)
    
        iteration = []
        for i in range(0,dim1):
            for j in range(0,dim2):
                iteration.append((i,j))
                
        # pass shared arrays to processes; the with-block shuts the pool down afterwards
        with Pool(processes=4, initializer=init, initargs=(tt1,tt2)) as pool:
            pool.starmap(testing, iteration)
        
        # still have access to the shared arrays
        t1final = np.frombuffer(tt1.get_obj()).reshape((dim1, dim2))
        t2final = np.frombuffer(tt2.get_obj()).reshape((dim1, dim2))
        print(t1final, t2final, sep='\n')
    

    Output

    [[1. 1.]
     [1. 1.]]
    [[2. 2.]
     [2. 2.]]
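
One caveat, offered as a sketch rather than as part of the answer above: get_obj() returns the raw ctypes array and does no locking itself. That is fine in the code above because every task writes to a different cell. If several tasks could update the same cell, one option is to hold the Array's own lock (get_lock()) around the read-modify-write. The names counts, bump and N_CELLS below are made up for illustration.

```python
import numpy as np
from multiprocessing import Pool, Array

N_CELLS = 4

def init(arr):
    # Runs once in each worker: keep a reference to the shared array
    global counts
    counts = arr

def bump(index):
    # get_obj() bypasses the Array's synchronization, so take the lock
    # explicitly around the read-modify-write
    with counts.get_lock():
        view = np.frombuffer(counts.get_obj(), dtype=np.float64)
        view[index] += 1

if __name__ == '__main__':
    shared_counts = Array('d', N_CELLS)
    tasks = [i % N_CELLS for i in range(40)]   # each cell is targeted 10 times
    with Pool(processes=4, initializer=init, initargs=(shared_counts,)) as pool:
        pool.map(bump, tasks)
    print(shared_counts[:])                    # slicing returns a plain list
```

Without the lock, two workers could read the same old value and one increment would be lost. With it, the increments are serialized at the cost of some contention.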