Tags: python, pandas, ffmpeg, terminal, popen

Capturing terminal output into pandas dataframe without creating external text file


I am using ffmpeg's extract_mvs program to generate some text output. I would run it in the terminal like this:

./extract_mvs input.mp4 > output.txt

I would like to run this command from Python with subprocess.Popen (or another subprocess function) so that, instead of being written to output.txt, the data is passed straight to a pandas DataFrame without creating a text file.

The idea is to automate this for many inputs, so I am trying to avoid generating many .txt files and then having to open() them one by one.

I thought of something like this:

import subprocess
import pandas as pd

cmd = ['./extract_mvs', 'input.mp4']
a = subprocess.Popen(cmd, stdout=subprocess.PIPE)
df = pd.read_csv(a.communicate()[0], sep=',')

But then I get an error: OSError: Expected file path name or file-like object, got <class 'bytes'> type

Can this be fixed so that the subprocess output is read straight into pandas?


Solution

  • I found a workaround, combining part of Keith's answer with the one found here, that decodes the subprocess output to a string and passes it to a pandas DataFrame.

    The final working code is:

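    import sys
    import subprocess
    import pandas as pd

    # StringIO lives in a different module in Python 2 and Python 3
    if sys.version_info[0] < 3:
        from StringIO import StringIO
    else:
        from io import StringIO

    cmd = ['./extract_mvs', 'input.mp4']
    a = subprocess.Popen(cmd, stdout=subprocess.PIPE)

    # communicate() returns (stdout, stderr) as bytes; decode stdout and wrap
    # it in a file-like object, which is what read_csv expects
    b = StringIO(a.communicate()[0].decode('utf-8'))

    df = pd.read_csv(b, sep=",")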
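
On Python 3.5 or newer, the same idea can probably be written a little more compactly with subprocess.run and io.BytesIO, which skips the manual decode step. The following is only a sketch under some assumptions: command_to_dataframe is a hypothetical helper name, the command's output is assumed to be comma-separated as in the code above, and the demo command is a stand-in that prints a few CSV lines through the Python interpreter rather than calling extract_mvs.

```python
import io
import subprocess
import sys

import pandas as pd


def command_to_dataframe(cmd, **read_csv_kwargs):
    """Run cmd and parse its standard output as CSV (hypothetical helper)."""
    # check=True raises CalledProcessError if the command exits with an error
    result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    # result.stdout is bytes; BytesIO makes it file-like so read_csv accepts it
    return pd.read_csv(io.BytesIO(result.stdout), **read_csv_kwargs)


# Stand-in command that prints CSV, used here in place of ./extract_mvs
demo_cmd = [sys.executable, "-c", "print('a,b'); print('1,2'); print('3,4')"]
demo_df = command_to_dataframe(demo_cmd, sep=",")
```

To process several videos without any intermediate .txt files, one option would be to call command_to_dataframe(['./extract_mvs', path], sep=',') for each path and combine the resulting frames with pd.concat.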