I have been reading for a few hours about how globals=globals() works in Parallel Python, and I am still a bit confused; maybe you can help me. My code can basically be summarized as:
import pp

class Foo(object):
    def __init__(self):
        self.h = 2

def f():
    foo = Foo()
    return foo.h

ppservers = ()
job_server = pp.Server(ppservers=ppservers)
#print globals()
g = job_server.submit(f, (), globals=globals())
r = g()
The output is "global name 'Foo' is not defined", as a few others have encountered before me. I know that the globals argument is used to transfer functions and classes to the server in a simplified way, so I expected it to pass the Foo class, since it is among the global variables before the job_server.submit call is executed (as you can see by uncommenting the print globals() line).
What am I missing about PP that I should know? Is there a way to make this code run without too many changes?
Thanks for reading; I hope some of you understand how this works!
(Remark: I do not want to create an instance of Foo outside of the parallelized jobs, simply because initializing an instance triggers the whole program, which lives inside the __init__(self) method... bad idea, I know...)
This issue comes down to whether pp can track down and associate the code dependencies correctly. pp inspects the first object passed to submit (i.e. f) and extracts its source code, which it then passes to the other processes. It also passes along any additional objects supplied in globals. However, pp primarily tracks functions, classes, and modules -- it has trouble with instances and many other objects.
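To make the failure mode concrete, here is a minimal sketch of what happens when a worker receives only source text. It is a simplified model and not pp's actual implementation: the helper run_on_worker and the constants FOO_SOURCE and F_SOURCE are hypothetical names introduced for illustration, and the "worker" is just a fresh namespace in the same process.

```python
# Simplified model of a worker that only receives source text.
# This is NOT how pp is implemented internally; all names are hypothetical.

FOO_SOURCE = '''
class Foo(object):
    def __init__(self):
        self.h = 2
'''

F_SOURCE = '''
def f():
    foo = Foo()
    return foo.h
'''

def run_on_worker(sources, func_name):
    """Execute each source string in a fresh namespace, then call func_name."""
    namespace = {}
    for src in sources:
        exec(src, namespace)
    return namespace[func_name]()

# Shipping only f: the fresh namespace has never seen Foo.
try:
    run_on_worker([F_SOURCE], "f")
    missing = None
except NameError as exc:
    missing = str(exc)

# Shipping Foo's source alongside f resolves the name.
result = run_on_worker([FOO_SOURCE, F_SOURCE], "f")

print(missing)
print(result)
```

In this model, f only works when the source of every name it depends on is delivered along with it, which is the dependency-tracking step described above.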
If you use a fork of pp called ppft (it imports as pp), it should work as expected. ppft uses a better code inspection package (dill.source from dill) to extract the source code from a broader range of Python objects and more complex dependencies.
>>> import pp
>>> class Foo(object):
...     def __init__(self):
...         self.h = 2
...
>>> def f():
...     foo = Foo()
...     return foo.h
...
>>> ppservers = ()
>>> job_server = pp.Server(ppservers=ppservers)
>>> g = job_server.submit(f, (), globals=globals())
>>> r = g()
>>> r
2
Get ppft here: https://github.com/uqfoundation
It's also pip installable.