It seems to work, but is it safe to use self after forking? Or should I always pass arguments to the subprocess as function parameters through args?
import multiprocessing as mp

class C():
    def __init__(self):
        self.v = 'bla'
        p = mp.Process(target=self.worker, args=[])
        #p = mp.Process(target=self.worker, args=(self.v,))
        p.start()
        p.join()

    def worker(self):
        print(self.v)

    #def worker(self, v):
        #print(v)

c = C()
# prints 'bla'
To be more specific, I want to pass manager.Queue() objects; I am not sure whether that makes a difference.
If this were a simple C fork(), self would be the same, since the whole process is copied identically (except for the pid). But Python multiprocessing may be doing something I am not aware of, or there may be a warning somewhere like "don't use it like this, this may change in the future". I did not find anything specifically addressing this question.
My actual worry is that arguments passed in args, especially if they are associated with the multiprocessing module, may be transformed around fork() to avoid some problem or other.
Python 3.6.5
For anything other than the fork start method, both the target and the arguments are sent to the worker process using pickling when Process.start() is called. For the fork start method, the child process is forked at that same point, that is, when Process.start() is called.
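Which of the two cases applies depends on the start method in effect, and the default differs between platforms and Python versions. As a minimal sketch (the helper name describe_start_methods is hypothetical, only the multiprocessing calls are real), you can inspect it like this:

```python
import multiprocessing as mp


def describe_start_methods():
    """Return the default start method and all methods this platform supports."""
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == "__main__":
    default, supported = describe_start_methods()
    print("default:", default)
    print("supported:", supported)

    # A context pins one start method without changing the global default.
    ctx = mp.get_context("spawn")
    print("context method:", ctx.get_start_method())
```

Creating processes through a context (ctx.Process, ctx.Queue and so on) makes it explicit which of the two behaviours you are relying on.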
So when you don't use the fork start method, what you need to worry about is whether your data can be pickled. When it can, there is no reason to avoid using a class instance and self; the whole instance is pickled, because the target (self.worker here) is a bound method that includes a reference to the instance:
>>> import pickle, pickletools
>>> class C:
...     def __init__(self):
...         self.v = 'bla'
...     def worker(self):
...         print(self.v)
...
>>> c = C()
>>> data = pickle.dumps(c.worker)
>>> pickletools.dis(data)
0: \x80 PROTO 4
2: \x95 FRAME 71
11: \x8c SHORT_BINUNICODE 'builtins'
21: \x94 MEMOIZE (as 0)
22: \x8c SHORT_BINUNICODE 'getattr'
31: \x94 MEMOIZE (as 1)
32: \x93 STACK_GLOBAL
33: \x94 MEMOIZE (as 2)
34: \x8c SHORT_BINUNICODE '__main__'
44: \x94 MEMOIZE (as 3)
45: \x8c SHORT_BINUNICODE 'C'
48: \x94 MEMOIZE (as 4)
49: \x93 STACK_GLOBAL
50: \x94 MEMOIZE (as 5)
51: ) EMPTY_TUPLE
52: \x81 NEWOBJ
53: \x94 MEMOIZE (as 6)
54: } EMPTY_DICT
55: \x94 MEMOIZE (as 7)
56: \x8c SHORT_BINUNICODE 'v'
59: \x94 MEMOIZE (as 8)
60: \x8c SHORT_BINUNICODE 'bla'
65: \x94 MEMOIZE (as 9)
66: s SETITEM
67: b BUILD
68: \x8c SHORT_BINUNICODE 'worker'
76: \x94 MEMOIZE (as 10)
77: \x86 TUPLE2
78: \x94 MEMOIZE (as 11)
79: R REDUCE
80: \x94 MEMOIZE (as 12)
81: . STOP
highest protocol among opcodes = 4
In the above stream you can clearly see v, 'bla' and worker named.
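To see what this means in practice, here is a small sketch that round-trips the bound method through pickle in a single process, which roughly imitates what happens to the target under a non-fork start method. The function name roundtrip_bound_method is made up for this illustration, and worker returns the value instead of printing it so the result can be inspected:

```python
import pickle


class C:
    def __init__(self):
        self.v = 'bla'

    def worker(self):
        return self.v


def roundtrip_bound_method():
    c = C()
    # Pickling the bound method pickles the instance it is bound to.
    clone_worker = pickle.loads(pickle.dumps(c.worker))
    clone = clone_worker.__self__   # the unpickled copy of the instance
    clone.v = 'changed'             # only the copy is modified
    return c.v, clone.v, clone is c


if __name__ == "__main__":
    print(roundtrip_bound_method())  # ('bla', 'changed', False)
```

The unpickled method is bound to a separate copy of the instance, so the worker sees the attribute values as they were at pickling time, not the original object itself.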
If you do use the fork start method, then the child process simply has full access to everything that was in memory in the parent process; self is still referencing the same object you had before forking. Your OS takes care of the details there, such as ensuring that file descriptors are independent, and that the child process gets a copy of the memory blocks that are being altered.
Either way, further changes you make to the instance in the child process won't be visible to the parent process, unless you explicitly use data structures designed to be shared.
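The following sketch illustrates that last point with a manager.Queue() stored on the instance, as in the question. It assumes a POSIX platform, because it requests the fork start method explicitly; the names Job and run_job are invented for the example:

```python
import multiprocessing as mp


class Job:
    def __init__(self, queue):
        self.v = 'bla'
        self.queue = queue  # a manager.Queue() proxy

    def worker(self):
        # Runs in the child process.
        self.v = 'changed in child'  # modifies the child's copy only
        self.queue.put(self.v)       # goes through the manager, so the parent can read it


def run_job():
    ctx = mp.get_context("fork")  # assumption: fork is available (not on Windows)
    with ctx.Manager() as manager:
        job = Job(manager.Queue())
        p = ctx.Process(target=job.worker)
        p.start()
        p.join()
        received = job.queue.get(timeout=5)
        return job.v, received


if __name__ == "__main__":
    print(run_job())  # ('bla', 'changed in child')
```

The attribute assignment stays in the child, while the item put on the manager queue reaches the parent. Passing the queue through args instead of storing it on self should behave the same way here.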