I have a unique problem: my application, which I want to package with pyinstaller, JITs some things based on sys.argv at startup. When you use multiprocessing with freeze_support() on Windows, multiprocessing passes different arguments to the new process in order to initialize it. The original sys.argv is only set back by the time the target function is invoked. How can I get at the original sys.argv before the target function is invoked?
import sys
import multiprocessing

print('ArgV:', sys.argv)

def print_argv():
    print(sys.argv)

if __name__ == '__main__':
    multiprocessing.freeze_support()
    print_argv()

    p = multiprocessing.Process(target=print_argv)
    p.start()
    p.join()
When packaged with pyinstaller and run with --hello=True, this yields:
ArgV: ['scratch.exe', '--hello=True']
['scratch.exe', '--hello=True']
ArgV: ['scratch.exe', '--multiprocessing-fork', 'parent_pid=16096', 'pipe_handle=380']
['scratch.exe', '--hello=True']
I would like some magic code that gives me my original sys.argv, that is, --hello=True, while sys.argv is set to --multiprocessing-fork...
I've never extensively played with freezing executables, but I have several ideas...
Taking a look at multiprocessing.spawn._main(), copying across the original sys.argv happens here:
preparation_data = reduction.pickle.load(from_parent)
prepare(preparation_data)
If you override Process.__new__, you should be able to run code before _bootstrap (which eventually calls run on the process object), but after sys.argv has been received.
import sys
import multiprocessing

print('ArgV:', sys.argv)

def print_argv():
    print(sys.argv)

class myProcess(multiprocessing.Process):
    def __new__(cls, *args, **kwargs):
        if __name__ == "__mp_main__":
            # Runs in the spawned child, after sys.argv has been restored
            # but before the target function is invoked.
            print("hook", sys.argv)
        instance = super(myProcess, cls).__new__(cls)
        instance.__init__(*args, **kwargs)
        return instance

if __name__ == '__main__':
    multiprocessing.freeze_support()
    print_argv()

    p = myProcess(target=print_argv)
    p.start()
    p.join()
Another idea is to hook the unpickling process by overriding __getstate__ and __setstate__.
class myProcess(multiprocessing.Process):
    def __getstate__(self):
        return self.__dict__.copy()

    def __setstate__(self, state):
        print("hook", sys.argv)
        self.__dict__.update(state)
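Here is a minimal sketch of wiring that hook up so the captured argv is actually kept around rather than just printed (the class and attribute names are made up for illustration). In the spawned child, __setstate__ runs after spawn.prepare() has restored the parent's original sys.argv but before run() is called, which makes it a usable point to snapshot it:

```python
import sys
import multiprocessing

class ArgvProcess(multiprocessing.Process):
    """Illustrative subclass: snapshots sys.argv as seen at unpickle time."""

    def __getstate__(self):
        return self.__dict__.copy()

    def __setstate__(self, state):
        self.__dict__.update(state)
        # In the child this runs before run(); by this point prepare() has
        # already put the parent's original sys.argv back in place.
        self._argv_at_unpickle = list(sys.argv)
```

The target (or a run() override) can then read self._argv_at_unpickle instead of consulting sys.argv at import time.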
Finally, you could hook the audit event (Python 3.8+) that pickle raises when it looks up a custom class to unpickle:
class myProcess(multiprocessing.Process):
    pass

def hook(event_name, args):
    if "pickle.find_class" in event_name:
        if args[1] == myProcess.__name__:
            print("hook", sys.argv)

sys.addaudithook(hook)
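You can watch this event fire without spawning any process by doing a plain pickle round trip of some class pickled by reference (here collections.OrderedDict stands in for the Process subclass; the hook and list names are illustrative):

```python
import pickle
import sys
from collections import OrderedDict

seen = []  # argv snapshots recorded by the hook

def _demo_hook(event_name, args):
    # "pickle.find_class" fires with (module, qualname) whenever the
    # unpickler resolves a global reference, here collections.OrderedDict.
    if event_name == "pickle.find_class" and args[1] == "OrderedDict":
        seen.append(list(sys.argv))

sys.addaudithook(_demo_hook)  # audit hooks cannot be removed once added
pickle.loads(pickle.dumps(OrderedDict()))
```

In the multiprocessing case the same event fires in the child while it unpickles the Process object, which is where the check against the subclass name comes in.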
All of these occur roughly at the same time during loading of the new process, and I couldn't say which is the most robust...