Here's my code:
import multiprocessing as mp
import numpy as np

def foo(p):
    global i
    return p*i

global lower, upper
lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        dataset = np.linspace(1, 100, 100)
        agents = mp.cpu_count() - 1
        chunksize = 5
        pool = mp.Pool(processes=agents)
        result = pool.map(foo, dataset, chunksize)
        print result
        print i
        pool.close()
        pool.join()
The console prints out the array [3, 6, 9,...,300] three times with the integers 1,2,3 in-between each array printout. So i is correctly iterating between lower & upper (not inclusive), but I expected it to print out the array [1, 2, 3,...,100] first followed by [2, 4, 6,...,200] and finally [3, 6, 9,...,300]. I don't understand why it's only passing the final value of i to foo and then mapping that thrice.
When you run the new process, this is what it sees:
import multiprocessing as mp
import numpy as np

def foo(p):
    global i
    return p*i

global lower, upper
lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        # This part is not run, as
        # in a different process,
        # __name__ is set to '__mp_main__'
        pass

# i is now `upper - 1`, call `foo(p)` with the provided `p`
After executing that, it is told to run foo. (Under the spawn start method, which is the default on Windows, the child has to run the whole script again to find out what foo is, because a function is pickled by reference to its module and name rather than by value.)

So by the time the child calls foo, the loop has already finished, i is upper - 1, and foo always returns p * 3.

To fix this, make i a parameter of foo, or use a multiprocessing-specific memory-sharing object.
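As a rough sketch of the first option (passing i as a parameter), something like the following should work. It assumes Python 3, so print is a function and Pool can be used as a context manager. The run helper and its agents argument are names made up for this example, and functools.partial is just one way to bind i; passing (p, i) tuples to the worker would do as well.

```python
import functools
import multiprocessing as mp

import numpy as np


def foo(p, i):
    # i arrives as an explicit argument, so the worker no longer depends on
    # a module-level global set while the child re-imports the script.
    return p * i


def run(lower, upper, agents=None, chunksize=5):
    # Hypothetical helper: maps foo over the dataset once per value of i.
    if agents is None:
        # Guard against cpu_count() - 1 being 0 on a single-core machine.
        agents = max(1, mp.cpu_count() - 1)
    dataset = np.linspace(1, 100, 100)
    results = []
    # A single pool is reused for every value of i.
    with mp.Pool(processes=agents) as pool:
        for i in range(lower, upper):
            # partial(foo, i=i) can be pickled because foo is a top-level
            # function; the bound value of i travels with each task.
            task = functools.partial(foo, i=i)
            results.append(pool.map(task, dataset, chunksize))
    return results


if __name__ == '__main__':
    lower, upper = 1, 4
    for i, result in zip(range(lower, upper), run(lower, upper)):
        print(result)
        print(i)
```

Because the value of i is sent along with every task, it no longer matters what i happens to be in the child's own copy of the module.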