Tags: python-3.x, multiprocessing, python-multiprocessing

Sharing file descriptors in Python multiprocessing


I am trying to use Python's multiprocessing module to spawn a server that receives UDP messages, modifies them a little, and passes them on to a grep process started with the subprocess module. Since the stdin argument of a Popen subprocess accepts a file descriptor, that is what I would like to pass it.

The issue I am having is getting a file descriptor that communicates with the server process and that I can pass to the grep subprocess. I have done this in the past using plain os.fork() and os.pipe(), but I would now like to use multiprocessing with the spawn start method. I have tried taking the write descriptor from os.pipe(), making it inheritable, and passing it as an argument to a new process via multiprocessing.Process. When I try to open it in the other process with os.fdopen(fd, 'wb'), I get an OSError for a bad file descriptor. Here is a snippet of the code I have tested.

import os
import multiprocessing as mp

def _listen_syslog(ip_address, port, write_pipe):
    # under the spawn start method this raises OSError (bad file descriptor)
    f = os.fdopen(write_pipe, 'wb')
    # do stuff like write to the file


def listen_syslog(ip_address, port):
    r, w = os.pipe()
    os.set_inheritable(w, True)
    proc = mp.Process(target=_listen_syslog, args=(ip_address, port, w))
    proc.start()
    # this process doesn't need to write, so close the write end
    os.close(w)
    # this is the descriptor I want to pass to a grep subprocess as stdin
    # a similar scenario has worked before using os.fork()
    return r

Finally, if it's not possible to do this with a pipe created via os.pipe(), can I use multiprocessing.Pipe() and directly use the file descriptors returned by the connection objects' fileno() method? More importantly, is that safe to do as long as I don't use the connection objects for anything else?


Solution

  • I found a solution. I haven't figured out how to make os.pipe() work with the spawn start method, but multiprocessing.Pipe() does work: each connection object exposes its underlying file descriptor through its fileno() method. One caveat: a connection object closes its descriptor when it is garbage collected. If you want to keep using the descriptor after the connection object is no longer referenced, call os.dup() on it first and use the duplicate, or else you will get a bad file descriptor error once the connection object is collected.

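    import os
    import multiprocessing as mp

    def _listen_syslog(ip_address, port, write_pipe):
        # write_pipe is a Connection object; wrap its descriptor in a file object
        f = os.fdopen(write_pipe.fileno(), 'wb')
        # do stuff


    def listen_syslog(ip_address, port):
        # duplex=False: r can only receive, w can only send
        r, w = mp.Pipe(False)
        proc = mp.Process(target=_listen_syslog, args=(ip_address, port, w))
        proc.start()
        # duplicate the descriptor so it stays open after r is garbage collected
        return os.dup(r.fileno())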
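
For anyone who wants to see the os.dup() point in isolation, here is a minimal sketch of the parent-side half only, assuming a Unix-like system (on Windows, fileno() on these connections is not an os-level file descriptor). The names detached_pipe_fds, filter_lines and FILTER_CODE are made up for this illustration, and a small Python one-liner stands in for grep so the example does not depend on grep being installed. No separate writer process is started here; the data is written from the same process just to keep the sketch short.

```python
import multiprocessing as mp
import os
import subprocess
import sys

# Hypothetical stand-in for grep: prints stdin lines containing argv[1].
FILTER_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if sys.argv[1] in line:\n"
    "        sys.stdout.write(line)\n"
)


def detached_pipe_fds():
    """Return (read_fd, write_fd) that outlive the Connection objects."""
    r, w = mp.Pipe(False)  # duplex=False: r receives, w sends
    read_fd = os.dup(r.fileno())
    write_fd = os.dup(w.fileno())
    # Closing the connections closes only their own descriptors,
    # the duplicates stay open.
    r.close()
    w.close()
    return read_fd, write_fd


def filter_lines(lines, needle):
    """Write byte lines into the pipe and filter them in a subprocess."""
    read_fd, write_fd = detached_pipe_fds()
    # Keep the payload small: everything is written before the reader
    # starts, so it has to fit in the kernel's pipe buffer.
    with os.fdopen(write_fd, "wb") as f:
        f.writelines(lines)
    # The write end is closed now, so the reader will see EOF.
    try:
        result = subprocess.run(
            [sys.executable, "-c", FILTER_CODE, needle],
            stdin=read_fd,  # a raw file descriptor is accepted here
            stdout=subprocess.PIPE,
            check=True,
        )
    finally:
        os.close(read_fd)
    return result.stdout
```

Calling filter_lines([b"error: disk full\n", b"info: ok\n"], "error") should hand back only the matching line. In the real setup the argument list would be something like ["grep", needle], and the writing would happen in the spawned process that receives w, as in the snippet above.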