python concurrency multiprocessing concurrent.futures

Get process id of spawned process in Python Pebble library?


I am using a ProcessPool from the Pebble library to run a function in a subprocess that is prone to crashing. I'd like to log the process ID of the subprocess that crashed, but from the main process rather than from the child. The reason is that the main process already emits one log line with all the relevant information for a request, and I want the PID included there instead of scattered across multiple log lines. Is there a way to access this process ID? I can't find it in the documentation.

As a workaround, I could call os.getpid() in the subprocess before doing anything else and use IPC to send the PID back to the parent process, but I'd like to avoid this if possible.


Solution

  • The ProcessPool is designed to abstract its inner workings from the user, so it does not expose the processes that run the workers.

    If you need this information only for debugging, I suggest tagging your jobs with unique identifiers and logging both the identifier and the worker PID from the worker processes. That way you can work out which job is causing your function to crash.

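    import logging
    import os

    def function(jobid, *args):
        # Log the job ID together with the PID of the worker running it.
        logging.debug("Job ID %d started on worker %d", jobid, os.getpid())

        ...

    # pool is an existing pebble.ProcessPool; jobid, arg1 and arg2 come from the caller.
    pool.schedule(function, args=(jobid, arg1, arg2))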
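
If the PID has to end up in the parent's own log line, the workaround mentioned in the question (record os.getpid() in the worker and pass it back over IPC) can be sketched roughly as below. To stay runnable with only the standard library, this sketch uses concurrent.futures.ProcessPoolExecutor as a stand-in for Pebble's ProcessPool. The names worker and run_job, the crash flag, the ctx parameter and the Manager dictionary are all illustrative assumptions, not part of Pebble.

```python
import concurrent.futures
import multiprocessing
import os
from concurrent.futures.process import BrokenProcessPool


def worker(jobid, pid_by_job, crash):
    # Record this worker's PID under the job ID before doing any risky work.
    pid_by_job[jobid] = os.getpid()
    if crash:
        os._exit(1)  # simulate a hard crash of the worker process
    return jobid * 2


def run_job(jobid, crash=False, ctx=None):
    """Run one job and return (result, worker_pid, crashed).

    ctx is an optional multiprocessing context (hypothetical parameter,
    only here so the start method can be chosen by the caller).
    """
    ctx = ctx or multiprocessing.get_context()
    with ctx.Manager() as manager:
        pid_by_job = manager.dict()  # shared mapping: job ID -> worker PID
        with concurrent.futures.ProcessPoolExecutor(
            max_workers=1, mp_context=ctx
        ) as pool:
            future = pool.submit(worker, jobid, pid_by_job, crash)
            try:
                result, crashed = future.result(), False
            except BrokenProcessPool:
                # The worker died before returning a result.
                result, crashed = None, True
        # The PID was stored before the crash, so the parent can still read it.
        return result, pid_by_job.get(jobid), crashed
```

Calling run_job(2, crash=True) should return None as the result, the PID of the dead worker, and True, so the parent can put the PID in its single log line. On platforms that start processes with spawn, the call needs to sit under an if __name__ == "__main__": guard. With Pebble, the same worker could presumably be passed to pool.schedule, where a dead worker surfaces as ProcessExpired from future.result() rather than BrokenProcessPool.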