How to run paste <(zcat f1.gz) <(zcat f2.gz) using Python's subprocess?


I am trying to run paste <(zcat f1.gz) <(zcat f2.gz) using subprocess. Here's what I have done so far:

ps1 = subprocess.Popen(('zcat', 'f1.gz'), stdout=subprocess.PIPE)
ps2 = subprocess.Popen(('zcat', 'f2.gz'), stdout=subprocess.PIPE)
ps3 = subprocess.Popen('paste', stdout=subprocess.PIPE, stdin=subprocess.PIPE)

But I am not sure how to provide ps3 with both ps1.stdout and ps2.stdout as inputs. I would appreciate it if you guys can help me with this and let me know if I am on the right track.


Solution

  • My answer is largely inspired by the post "allowing multiple inputs to python subprocess", where the problem is slightly different.

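    Basically, one solution is to use FIFOs (named pipes): each zcat subprocess writes to its own FIFO, and paste reads the FIFOs as if they were regular files. One FIFO per input is needed, because paste only joins lines side by side when it is given separate input streams; feeding both outputs through a single FIFO would just concatenate them.

    import os
    import subprocess
    import tempfile

    cmds = [('zcat', 'f1.gz'), ('zcat', 'f2.gz')]

    #  create one fifo per command for data exchange between processes
    tmpdir = tempfile.mkdtemp()
    fifos = [os.path.join(tmpdir, 'fifo%d' % i) for i in range(len(cmds))]
    for path in fifos:
        os.mkfifo(path)

    #  start the reader first: paste opens each fifo in turn and blocks
    #  until a writer opens the other end
    paste = subprocess.Popen(['paste'] + fifos)

    #  write each command's output to its own fifo; open() returns once
    #  paste has opened the fifo for reading
    writers = []
    for cmd, path in zip(cmds, fifos):
        with open(path, 'wb') as fifo:
            writers.append(subprocess.Popen(cmd, stdout=fifo))

    #  wait until all data has been written and consumed, then clean up
    for p in writers:
        p.wait()
    paste.wait()
    for path in fifos:
        os.unlink(path)
    os.rmdir(tmpdir)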
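
For comparison, bash typically implements <(...) by handing the command a path such as /dev/fd/N that refers to an anonymous pipe. The sketch below imitates that without any named FIFO. It assumes a system that exposes /dev/fd (Linux, for example) and that zcat and paste are on the PATH. The function name paste_gz is made up for illustration and is not part of the original answer.

```python
import subprocess


def paste_gz(path1, path2):
    """Return the output of `paste <(zcat path1) <(zcat path2)` as bytes.

    Hypothetical helper; assumes /dev/fd is available (e.g. on Linux).
    """
    # Start both decompressors, each writing to its own anonymous pipe.
    z1 = subprocess.Popen(['zcat', path1], stdout=subprocess.PIPE)
    z2 = subprocess.Popen(['zcat', path2], stdout=subprocess.PIPE)
    fd1, fd2 = z1.stdout.fileno(), z2.stdout.fileno()
    try:
        # pass_fds keeps the two pipe read ends open in the child under the
        # same descriptor numbers, so paste can open them as /dev/fd/N.
        result = subprocess.run(
            ['paste', '/dev/fd/%d' % fd1, '/dev/fd/%d' % fd2],
            pass_fds=(fd1, fd2),
            stdout=subprocess.PIPE,
            check=True,
        )
    finally:
        # Close the parent's copies of the pipes and reap the children.
        z1.stdout.close()
        z2.stdout.close()
        z1.wait()
        z2.wait()
    return result.stdout
```

Calling paste_gz('f1.gz', 'f2.gz') returns the pasted lines as bytes. Because the whole result is held in memory, very large inputs would be better served by letting paste write straight to a file or to the parent's stdout.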