
subprocess "TypeError: a bytes-like object is required, not 'str'"


I'm using code from a question that was asked a few years ago, but I believe it is outdated. When I run it, I get the error above. I'm still a novice in Python, so I could not get much clarification from similar questions. Why is this happening?

import subprocess

def getLength(filename):
  result = subprocess.Popen(["ffprobe", filename],
    stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
  return [x for x in result.stdout.readlines() if "Duration" in x]

print(getLength('bell.mp4'))

Traceback

Traceback (most recent call last):
  File "B:\Program Files\ffmpeg\bin\test3.py", line 7, in <module>
    print(getLength('bell.mp4'))
  File "B:\Program Files\ffmpeg\bin\test3.py", line 6, in getLength
    return [x for x in result.stdout.readlines() if "Duration" in x]
  File "B:\Program Files\ffmpeg\bin\test3.py", line 6, in <listcomp>
    return [x for x in result.stdout.readlines() if "Duration" in x]
TypeError: a bytes-like object is required, not 'str'

Solution

  • subprocess returns bytes objects for the stdout and stderr streams by default, so any operation against their contents must also use bytes objects. The test "Duration" in x uses a str object, which is what raises the TypeError. Use a bytes literal instead (note the b prefix):

    return [x for x in result.stdout.readlines() if b"Duration" in x]
    

    or decode your data first, if you know the encoding used (usually the locale default, but you could set LC_ALL or more specific locale environment variables for the subprocess). Here encoding is a placeholder for that codec name:

    return [x for x in result.stdout.read().decode(encoding).splitlines(True)
            if "Duration" in x]
    

    The alternative is to tell subprocess.Popen() to decode the data to Unicode strings by setting the encoding argument (available from Python 3.6) to a suitable codec:

    result = subprocess.Popen(
        ["ffprobe", filename],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        encoding='utf8'
    )
    

    If you set text=True (Python 3.7 and up; in earlier versions this option is called universal_newlines) you also enable decoding, using your system default codec, the same one that is used for open() calls. In this mode, the pipes are line buffered by default.
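
Putting the pieces together, a minimal sketch of the text=True variant might look like the following. It uses subprocess.run() rather than Popen(), and because ffprobe may not be installed everywhere, a child Python process stands in for it. The helper name matching_lines and the sample output line are assumptions made for illustration and are not part of the original code.

```python
import subprocess
import sys


def matching_lines(cmd, needle="Duration"):
    """Run cmd and return the decoded output lines that contain needle."""
    # text=True (Python 3.7+) makes completed.stdout a str instead of bytes,
    # so a plain str needle can be used in the membership test.
    completed = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as in the question
        text=True,
    )
    return [line for line in completed.stdout.splitlines() if needle in line]


# Hypothetical stand-in for ffprobe so the sketch runs anywhere: a child
# Python process that prints one line shaped like a duration report.
fake_probe = [
    sys.executable,
    "-c",
    "print('  Duration: 00:00:05.00, start: 0.000000')",
]
lines = matching_lines(fake_probe)

# The original failure, reproduced without a subprocess: testing a str
# for membership in a bytes object raises TypeError.
try:
    "Duration" in b"Duration: 00:00:05.00"
except TypeError as exc:
    error_message = str(exc)
```

With ffprobe on the PATH, calling matching_lines(["ffprobe", "bell.mp4"]) should behave much like getLength() from the question, except that the returned lines are str objects without trailing newlines.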