I'm pretty new to multiprocessing using Python and I'm trying to understand how to use Pool properly. I have some code which looks like this:
import numpy as np
from multiprocessing.dummy import Pool as ThreadPool
...
pool = ThreadPool(15)
arb = np.arange(0,len(np.concatenate((P,V),axis=0)),1)
F = pool.map(tttt,arb)
pool.close()
pool.join()
NT = 1000
for test in range(0,NT):
    (P,V) = Dynamics(P,V,F)
    pool = ThreadPool(15)
    F = pool.map(tttt,arb)
    pool.close()
    pool.join()
...
tttt and Dynamics are two functions that are previously defined. I want to use Pool to calculate a lot of values simultaneously using tttt, but I also want to update the values that I use for those calculations (tttt depends on P and V, although not explicitly).
Do I have to create and close the pool twice as I am doing right now or can I do it just once?
It seems you would like to use a pool of processes on each iteration of a for loop. You've made things more complicated than you need to for using Pool.map, but your calls to .join() and .close() suggest you'd rather be using Pool.map_async. Here's a simple example:
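import numpy as np
from multiprocessing import Pool
from time import sleep

def print_square(x):
    sleep(.01)
    print(x**2)

if __name__=='__main__':
    for k in range(10):
        pool = Pool(3)
        arb = np.arange(0,10)
        # map_async returns immediately; the work runs in the background
        pool.map_async(print_square,arb)
        pool.close()  # no more tasks will be submitted to this pool
        pool.join()   # wait for the submitted tasks to finish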
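On the question of whether the pool has to be created and closed on every iteration: as far as I can tell, it does not. Pool.map blocks until all results are available, so a single pool can be reused across iterations and closed once at the end. Here is a minimal sketch under that assumption, using the thread-based pool from the question. The names compute_force, update_state, run and the toy state dictionary are hypothetical stand-ins for tttt, Dynamics, P and V, not the original code.

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-based pool, as in the question

# Hypothetical stand-in for the question's P and V arrays.
state = {"P": [1.0, 2.0, 3.0], "V": [0.0, 0.0, 0.0]}

def compute_force(i):
    # Reads the shared state implicitly, like tttt in the question.
    return -state["P"][i]

def update_state(forces, dt=0.1):
    # Toy update step standing in for Dynamics.
    state["V"] = [v + f * dt for v, f in zip(state["V"], forces)]
    state["P"] = [p + v * dt for p, v in zip(state["P"], state["V"])]

def run(n_steps=5, n_workers=4):
    indices = range(len(state["P"]))
    pool = ThreadPool(n_workers)  # create the pool once
    try:
        # map blocks until every result is available
        forces = pool.map(compute_force, indices)
        for _ in range(n_steps):
            update_state(forces)
            # reuse the same pool on each iteration
            forces = pool.map(compute_force, indices)
    finally:
        pool.close()  # close and join once, after the loop
        pool.join()
    return forces

if __name__ == "__main__":
    print(run())
```

This works here because threads share memory with the main program, so the workers see the updated state. With a process-based multiprocessing.Pool, worker processes would not see changes made to globals in the parent after the pool was created, so the current values would need to be passed to the worker function explicitly.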
You should generally include a minimal, complete, verifiable example. Your example couldn't be run. Worse, it contained lots of extraneous domain-specific code (e.g. P, V, Dynamics) which discourages others from trying to run your example.
State what the observed behavior of your code is (e.g. wrong output, run time error, etc.) and the desired behavior.
It's confusing to import Pool as ThreadPool, since threads and processes are different and yet have very similar APIs.
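To make the thread/process distinction concrete, here is a small sketch showing that multiprocessing.dummy.Pool runs its workers as threads inside the current process, even though it exposes the same map interface as multiprocessing.Pool. The helper names where_am_i and worker_pids are made up for illustration.

```python
import os
import threading
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed pool

def where_am_i(_):
    # Report which process and thread ran this task.
    return os.getpid(), threading.current_thread().name

def worker_pids(n_tasks=8, n_workers=4):
    pool = ThreadPool(n_workers)
    try:
        results = pool.map(where_am_i, range(n_tasks))
    finally:
        pool.close()
        pool.join()
    # Collect the distinct process IDs that handled the tasks.
    return {pid for pid, _ in results}

if __name__ == "__main__":
    # Every task reports the parent's PID because the workers are threads.
    print(worker_pids() == {os.getpid()})
```

Swapping the import for multiprocessing.Pool would leave the map call unchanged but run the tasks in separate worker processes, which is why the two are easy to mix up.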