I'm fairly new to Python multiprocessing, and I'm trying to write a class that can asynchronously execute functions with attached callbacks.
First of all, let's settle on a common nomenclature for this specific problem:
┬─ process1 (parent)
└─┬─ process2 (child of process1)
  └─── process3 (child of process2)
After reading up a bit on the subject, and following this SO question, I've come up with the following code for the run method:
import multiprocessing

class AsyncProcess:
    def __init__(self, target, callback=None, args=(), kwargs={}):
        self._target = target
        self._callback = callback
        self._args = args
        self._kwargs = kwargs
        self._process = None

    def run(self):
        def wrapper():
            return_value = self._target(*self._args, **self._kwargs)
            if self._callback is not None:
                # Run the callback in its own process (process3)
                process = multiprocessing.Process(target=self._callback,
                                                  args=(return_value,))
                process.start()
                # Drop process3 from process2's children pool so that
                # process2 does not join it on exit
                multiprocessing.process._children.discard(process)

        self._process = multiprocessing.Process(target=wrapper)
        self._process.start()
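For context, this is roughly how I expect to use it; slow_task and on_done are placeholder names I made up, and the example assumes the fork start method (the default on Linux), since wrapper is a closure and cannot be pickled under spawn:

import time

def slow_task(seconds):
    # Placeholder for a long-running job
    time.sleep(seconds)
    return seconds * 2

def on_done(result):
    # Placeholder callback; receives slow_task's return value
    print("callback got", result)

if __name__ == "__main__":
    AsyncProcess(target=slow_task, callback=on_done, args=(3,)).run()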
The AsyncProcess class is bigger than this (it is intended to work as an adapter between multiprocessing.Process and subprocess.Popen for executing both external processes and Python functions in new processes); that's why it is not a subclass of multiprocessing.Process and instead just uses it (in case anyone wondered).
What I'm trying to achieve here is to be able to launch a child process (process3) from within another process (process2) without process2 having to wait for process3, since the child may take way longer to finish than the parent. The daemon attribute of multiprocessing.Process is not useful here: when the parent process (process2) dies, the daemonic child process (process3) is killed too, and I just want to leave it running until it finishes.
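To make that concrete, a minimal sketch of the daemon behaviour (function names are mine):

import multiprocessing
import time

def grandchild():
    time.sleep(5)
    print("grandchild finished")  # never printed: killed with its parent

def child():
    # daemon=True means this process dies as soon as its parent exits
    multiprocessing.Process(target=grandchild, daemon=True).start()

if __name__ == "__main__":
    multiprocessing.Process(target=child).start()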
There are, however, two things I don't like at all about the solution I've come up with:

- It tinkers with the internals of multiprocessing, which I don't like one bit.
- By discarding the child process (process3) from the children pool of its parent (process2), I'm guessing that leaves the poor child orphaned (I don't know exactly what that implies, but it is most certainly not good practice).

The question here is: how can I make the parent not wait for its children without actually killing them and without creating orphan processes? Is there another, more correct or more elegant way of achieving what I'm trying to do?
I was thinking about reassigning the child process (process3) to the parent of the process that spawned it (process1), i.e. the grandparent of the child process (which I know will be alive for sure), but I haven't found a way to actually do that.
Some clarifications:
- Popen does what I want to achieve, but it only does it with external processes, i.e. I cannot execute a Python function with all its context by using Popen (that I know of, anyway).
- Using os.fork had come to mind, but I find the way of distinguishing parent vs. child code a bit cumbersome (handling the PID == 0 and PID != 0 cases, etc.); see the double-fork sketch after this list.
- I've ruled out the threading package since I wanted to manage processes, not threads, and leave thread management to the OS.
- Spawning process3 from process1 directly solves the problem of orphan processes, but then I would have to do active polling in process1 to know when process2 finishes, which is actually not an option (process1 manages a server that cannot be blocked).
- I need process2 to finish as soon as possible in order to get some data from it; that's why I'm not executing the content of process3 directly inside process2.
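For reference, this is the kind of os.fork usage I mean: the classic double fork. A minimal, POSIX-only sketch (spawn_detached is a name I made up); the intermediate child exits immediately, so the grandchild is re-parented to init and nobody has to wait for it:

import os

def spawn_detached(target, *args):
    pid = os.fork()
    if pid > 0:
        # Parent (process2): reap the short-lived intermediate child
        # and carry on without waiting for the real work.
        os.waitpid(pid, 0)
        return
    # Intermediate child: fork the real worker, then exit at once so
    # the worker is re-parented to init (no orphan zombie left behind).
    if os.fork() > 0:
        os._exit(0)
    # Grandchild (process3): do the actual work, then exit.
    target(*args)
    os._exit(0)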
Something I came up with while writing the question:
Since my problem is having to launch process3 from within process2, and launching it from process1 solves the problem but active polling on process1 is not an option, I could also launch process2 and process3 from within process1 at the same time, passing the process2 object to process3 and performing the active polling in process3 with a small interval, to ensure a quick response once process2 finishes.
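Roughly, as a sketch, with one caveat: a started multiprocessing.Process object cannot be pickled and sent to another process, so instead of passing the process2 object itself I'd pass something waitable, such as a multiprocessing.Event (task2 and task3 are hypothetical names):

import multiprocessing

def task2(done):
    # ... process2's real work here ...
    done.set()  # signal completion instead of being polled from outside

def task3(done):
    done.wait()  # blocks only process3; process1 is never blocked
    # ... process3's real work here ...

if __name__ == "__main__":
    done = multiprocessing.Event()  # shared completion flag
    multiprocessing.Process(target=task2, args=(done,)).start()
    multiprocessing.Process(target=task3, args=(done,)).start()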
Does this make any sense, or is it an overcomplicated solution to something that is already solved (and that I don't know of)?
What you want to do (ignoring the question of whether it is a good idea) is not possible with the multiprocessing library without the kind of tinkering with its internals that you want to avoid, precisely because multiprocessing is designed around child processes that do not outlive their parent.
I think the answer is indeed to use subprocess.Popen(), although that will mean forgoing the nice high-level API of the multiprocessing library. No, you can't execute a Python function directly, but you can create a separate script.py that calls the function you want.
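For instance, a minimal sketch of what that could look like from inside process2 (helper.py is a hypothetical script that imports and runs your function):

import subprocess
import sys

# Launch the helper fully detached from process2 (POSIX; on Windows,
# creationflags=subprocess.DETACHED_PROCESS plays a similar role).
subprocess.Popen(
    [sys.executable, "helper.py", "--arg", "value"],
    start_new_session=True,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
# process2 can now exit immediately; helper.py keeps running on its own.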