This code shows the structure of what I am trying to do.
import multiprocessing
from foo import really_expensive_to_compute_object
## Create a really complicated object that is *hard* to initialise.
T = really_expensive_to_compute_object(10)
def f(x):
    return T.cheap_calculation(x)
P = multiprocessing.Pool(processes=64)
results = P.map(f, range(1000000))
print(results)
The problem is that each process starts by spending a lot of time recalculating T instead of using the original T that was computed once. Is there a way to prevent this? T has a fast (deep) copy method, so can I get Python to use that instead of recalculating?
Why not have f take a T parameter instead of referencing the global, and do the copies yourself?
import multiprocessing, copy
from foo import really_expensive_to_compute_object
## Create a really complicated object that is *hard* to initialise.
T = really_expensive_to_compute_object(10)
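def f(args):
    # Pool.map passes exactly one argument per call, so unpack the (t, x) pair here.
    t, x = args
    return t.cheap_calculation(x)
P = multiprocessing.Pool(processes=64)
# Pair each input with its own deep copy of T; map accepts only one iterable.
results = P.map(f, ((copy.deepcopy(T), x) for x in range(1000000)))
print(results)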
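If one copy per task turns out to be too costly, a different technique (not what the code above does) is to give each worker process its own T once, using the `initializer` and `initargs` arguments of `multiprocessing.Pool`. The following is only a minimal sketch under some assumptions: `ExpensiveObject` is a hypothetical stand-in for `really_expensive_to_compute_object`, `init_worker` and `_worker_T` are names made up for illustration, and T is assumed to be picklable on platforms where worker processes are spawned rather than forked.

```python
import multiprocessing


class ExpensiveObject(object):
    """Hypothetical stand-in for really_expensive_to_compute_object."""

    def __init__(self, n):
        # Pretend that building this table is the slow part.
        self.table = [i * i for i in range(n)]

    def cheap_calculation(self, x):
        return self.table[x % len(self.table)] + x


# Set once in each worker process by init_worker (hypothetical name).
_worker_T = None


def init_worker(t):
    """Runs once per worker process and stores that worker's own T."""
    global _worker_T
    _worker_T = t


def f(x):
    # Uses the per-worker object instead of receiving a copy with every task.
    return _worker_T.cheap_calculation(x)


if __name__ == "__main__":
    T = ExpensiveObject(10)  # built once, in the parent process
    pool = multiprocessing.Pool(
        processes=4, initializer=init_worker, initargs=(T,)
    )
    results = pool.map(f, range(20))
    pool.close()
    pool.join()
    print(results)
```

With this layout T reaches each worker once, at startup, rather than once per input, so the number of copies is bounded by the pool size instead of by the length of the input range.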