I have some code I've just started trying to speed up in Python 3.5, and I am trying to do this with the multiprocessing module. Here is a minimal example to demonstrate what I'm trying to do.
Serially, the code is more straightforward. The Momma_Serial class has a list of Baby objects inside of it. Occasionally, we want to call the Baby.evolve() method on each of these. In practice, there are going to be a lot of these Baby objects (only 100 in this example). This was the original motivation for seeking parallelism.
What complicates this is that the top level of the program decides how each of the many Baby objects is evolved, by passing in a function, pass_this_func(). This function is an argument to Momma_Serial.evolve_all_elems() and is passed along to all of the Baby objects inside the Momma object.
class Baby:
    def __init__(self, lol):
        self.lol = lol

    def evolve(self, f):
        self.lol = f(self.lol)

def pass_this_func(thing):
    return 2 * thing

class Momma_Serial:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func):
        for baby in self.my_list:
            baby.evolve(the_func)

momma1 = Momma_Serial(100)
[baby.lol for baby in momma1.my_list]
momma1.evolve_all_elems(pass_this_func)
[baby.lol for baby in momma1.my_list]
This works as it should. But it's slow. Here's my attempt at re-writing the Momma class using the multiprocessing module.
import multiprocessing as mp

class Momma_MP:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func):
        num_workers = 2

        def f(my_obj):
            my_obj.evolve(the_func)

        with mp.Pool(num_workers) as pool:
            pool.map(f, self.my_list)
Then I try to run it:
momma2 = Momma_MP(100)
[baby.lol for baby in momma2.my_list]
momma2.evolve_all_elems(pass_this_func) #error comes here
# [baby.lol for baby in momma2.my_list]
But I get the error:
AttributeError: Can't pickle local object 'Momma_MP.evolve_all_elems.<locals>.f'
An answer to this Stack Overflow question states "functions are only picklable if they are defined at the top-level of a module." This statement makes it seem like the only way to accomplish this is by defining a function outside of the Momma_MP class. But I really don't want to do that, because it would raise a lot more issues for my code.
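The restriction quoted above can be checked directly with the pickle module, which multiprocessing relies on to send the mapped function to its workers. The following is only a minimal sketch; double, make_local_function and is_picklable are hypothetical names introduced for illustration.

```python
import pickle


def double(thing):
    # Module-level function: pickle stores only its importable name.
    return 2 * thing


def make_local_function():
    # f is defined inside another function, like f inside evolve_all_elems().
    def f(thing):
        return 2 * thing
    return f


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError, TypeError):
        return False


if __name__ == "__main__":
    # When run as a script, the module-level function is expected to pickle,
    # while the local function is expected to be rejected.
    print("module-level:", is_picklable(double))
    print("local:", is_picklable(make_local_function()))
```

The local function is rejected because pickle records a function by name and has no way to look up a name that only exists inside another function's body.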
(edited a bit)
Is there any workaround? Assume that I cannot define the mapped function outside of the class. Also assume Momma() is not being instantiated in the __main__ top-level script environment. Also, I don't want to deviate too much from this program design, because I want all these Baby() instances to be abstracted away; I don't want the places/programs that instantiate or interact with instances of Momma() to have to worry or know about anything to do with the Baby() class. These extra restrictions make the problem slightly different from the situation here.
By the way, the following doesn't throw an error, but there might be some copying going on, because nothing happens to the constituent Baby objects.
def outside_f(obj):
    obj.evolve(pass_this_func)

class Momma_MP:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func):
        num_workers = 2
        with mp.Pool(num_workers) as pool:
            pool.map(outside_f, self.my_list)

momma2 = Momma_MP(100)
[baby.lol for baby in momma2.my_list]
momma2.evolve_all_elems(pass_this_func)
[baby.lol for baby in momma2.my_list]  # no change here?
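A likely explanation for the unchanged values: pool.map() pickles each Baby and sends it to a worker process, so evolve() runs on a copy and the objects held by the parent are never touched. Below is a minimal sketch of one way around that, assuming it is acceptable for the worker to return the modified copy and for the parent to replace its list with the returned objects. evolve_and_return is a hypothetical helper name, and it still has to live at module level.

```python
import functools
import multiprocessing as mp


class Baby:
    def __init__(self, lol):
        self.lol = lol

    def evolve(self, f):
        self.lol = f(self.lol)


def pass_this_func(thing):
    return 2 * thing


def evolve_and_return(baby, func):
    # Hypothetical helper: runs in the worker on a copy of the Baby,
    # then returns that copy so it is sent back to the parent.
    baby.evolve(func)
    return baby


class Momma_MP:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func, num_workers=2):
        # partial() of a module-level function stays picklable as long as
        # the_func itself is picklable (e.g. also defined at module level).
        worker = functools.partial(evolve_and_return, func=the_func)
        with mp.Pool(num_workers) as pool:
            # Replace the old objects with the evolved copies, in order.
            self.my_list = pool.map(worker, self.my_list)
```

Calling `Momma_MP(100).evolve_all_elems(pass_this_func)` under an `if __name__ == "__main__":` guard should then leave the doubled values in my_list, at the cost of shipping every Baby to a worker and back.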
I'll try to give an answer that isn't covered elsewhere that I could find (see my comments above). I'm going to assume you have different kinds of Mommas that have different f() functions.
You could make a single function, evolver():
def evolver(baby):
    momma = baby.momma
    momma.evolve(baby)
You would need to assign self.momma in the __init__() of the Baby, passing the Momma instance to the Baby:
class Baby:
    def __init__(self, lol, momma):
        self.lol = lol
        self.momma = momma
Now you would derive from Momma and override the evolve() method to specialize how a baby is evolved.
So now when you call pool.map(evolver, babies), it would pass each baby to evolver(), which would then ask the momma to evolve() the baby.
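Put together, the design described above might look roughly like the sketch below. Several parts are assumptions that go beyond the answer itself: DoublingMomma is a hypothetical subclass, evolver() returns the baby so the parent can see the worker's changes, __getstate__() drops the baby list so each task does not carry every other Baby with it, and the parent re-attaches itself as momma afterwards.

```python
import multiprocessing as mp


class Baby:
    def __init__(self, lol, momma):
        self.lol = lol
        self.momma = momma


def evolver(baby):
    # Module-level, so worker processes can find it by name.
    baby.momma.evolve(baby)
    return baby  # assumption: send the evolved copy back to the parent


class Momma:
    def __init__(self, num):
        self.my_list = [Baby(i, self) for i in range(num)]

    def evolve(self, baby):
        raise NotImplementedError("subclasses decide how a baby evolves")

    def __getstate__(self):
        # Assumption: leave the baby list out of the pickled state, otherwise
        # every Baby sent to a worker would drag the whole list along.
        state = self.__dict__.copy()
        state["my_list"] = []
        return state

    def evolve_all_elems(self, num_workers=2):
        with mp.Pool(num_workers) as pool:
            evolved = pool.map(evolver, self.my_list)
        for baby in evolved:
            baby.momma = self  # returned copies point at a copied momma
        self.my_list = evolved


class DoublingMomma(Momma):
    # Hypothetical specialization: this momma doubles each baby's value.
    def evolve(self, baby):
        baby.lol = 2 * baby.lol
```

Code that uses DoublingMomma only calls evolve_all_elems() and never has to touch Baby or evolver() directly, which is the abstraction the question asks for.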
An answer I linked to above says you could also do the following:
class Momma:
    evolver = staticmethod(evolver)
...to put the module-level function into the class.
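A small sketch of what that assignment does, assuming evolver() stays defined at module level under the same name: looking the attribute up on the class hands back the very same function object, so pickle can still refer to it by its module-level name.

```python
def evolver(baby):
    momma = baby.momma
    momma.evolve(baby)


class Momma:
    # Expose the module-level function as an attribute of the class.
    evolver = staticmethod(evolver)


if __name__ == "__main__":
    print(Momma.evolver is evolver)    # True: same function object
    print(Momma.evolver.__qualname__)  # evolver (no class prefix)
```

Inside a method of Momma, `pool.map(self.evolver, self.my_list)` would therefore pass the same function as `pool.map(evolver, self.my_list)`.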