I am making a library that needs to spawn multiple processes.
I want to be able to know the set of all descendant processes that were spawned during a test. This is useful for terminating well-behaved daemons at the end of a passed test or for debugging deadlocks/hanging processes by getting the stack trace of any processes present after a failing test.
Since some of this requires spawning daemons (fork, fork again, then let the parent die), we cannot find all processes by iterating over the process tree.
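For context, a minimal sketch of that double-fork daemonization in Python; once the intermediate process exits, the daemon is reparented to init and no longer appears under the test process in the tree:

    import os
    import time

    def spawn_daemon():
        # First fork: the test process returns; the child goes on to daemonize.
        if os.fork() > 0:
            return
        os.setsid()  # start a new session, detaching from the controlling terminal
        # Second fork: the intermediate process exits, so the grandchild is
        # reparented to init and drops out of the test's process tree.
        if os.fork() > 0:
            os._exit(0)
        while True:  # stand-in for the daemon's real work
            time.sleep(60)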
Currently my approach is:

- Use os.register_at_fork to append (pid, process start time) to a file from the child after every fork, as sketched below.
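A minimal sketch of this hook; the file path, record format, and reading the start time from /proc are my own illustrative choices:

    import os

    PID_FILE = "/tmp/test_pids.txt"  # hypothetical location for the records

    def _proc_start_time(pid):
        # Field 22 of /proc/<pid>/stat is the process start time in clock
        # ticks since boot; recording it alongside the PID guards against
        # PID reuse.
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
        # The command name (field 2) may contain spaces, so split after the
        # parenthesis that closes it; index 19 is then field 22.
        return int(stat.rsplit(")", 1)[1].split()[19])

    def _record_child():
        # Runs in the child after each fork.
        with open(PID_FILE, "a") as f:
            f.write(f"{os.getpid()} {_proc_start_time(os.getpid())}\n")

    os.register_at_fork(after_in_child=_record_child)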
The downsides of this approach are:

- It only works for processes spawned using multiprocessing or os.fork.
- It does not work when spawning a new Python process using subprocess or a non-Python process.

I am looking for a different way to track child processes that avoids these 2 downsides.
Alternatives I have considered:
Can someone suggest an approach to this problem that avoids the pitfalls and downsides of the ones above? I am only interested in Linux right now, and ideally it shouldn't require a kernel later than 4.15.
Given the constraints from my original post, I used the following approach:

- putenv("PID_DIR", <some tempdir>)
- Override fork and clone with versions which will trace the process start time to $PID_DIR/<pid>. The override is done using plthook and applies to all loaded shared objects. dlopen should also be overridden so the overrides are applied to any other dynamically loaded libraries.
- Ship the __libc_start_main, fork, and clone overrides as LD_PRELOAD.

An initial implementation is available here, used like:
    import process_tracker; process_tracker.install()
    import os

    pid1 = os.fork()
    pid2 = os.fork()
    pid3 = os.fork()
    if pid1 and pid2 and pid3:
        print(process_tracker.children())
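To illustrate why the start time is traced at all, here is a hypothetical reader for $PID_DIR (the per-pid file contents are an assumption, and the actual children() may differ): a PID whose current start time no longer matches the recorded one has been recycled for an unrelated process and is skipped.

    import os

    def read_tracked_children(pid_dir):
        # Assumes each file in $PID_DIR is named <pid> and holds the start
        # time (field 22 of /proc/<pid>/stat) recorded at creation.
        alive = []
        for name in os.listdir(pid_dir):
            pid = int(name)
            with open(os.path.join(pid_dir, name)) as f:
                recorded = f.read().strip()
            try:
                with open(f"/proc/{pid}/stat") as f:
                    current = f.read().rsplit(")", 1)[1].split()[19]
            except FileNotFoundError:
                continue  # the process already exited
            if current == recorded:  # a mismatch means the PID was reused
                alive.append(pid)
        return alive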