I am having issues with non-ASCII characters.
I am using Python 2.7.3:
python -V
Python 2.7.3
I installed pymediainfo (http://pymediainfo.readthedocs.org/en/latest/) with:
easy_install pymediainfo
and imported it as below:
from pymediainfo import MediaInfo
media_info = MediaInfo.parse(os.path.join(path, to, file))
Using this with ASCII-only filenames works fine.
For debugging, I printed the command from the class that runs the mediainfo command, in /usr/local/lib/python2.7/dist-packages/pymediainfo-1.3.5-py2.7.egg/pymediainfo/__init__.py:
ENV_DICT = {
    "PATH": "/usr/local/bin/:/usr/bin/",
    "LD_LIBRARY_PATH": "/usr/local/lib/:/usr/lib/"}
def parse(filename, environment=ENV_DICT):
    command = ["mediainfo", "-f", "--Output=XML", filename]
    print command
    print repr(command)
    fileno_out, fname_out = mkstemp(suffix=".xml", prefix="media-")
    fileno_err, fname_err = mkstemp(suffix=".err", prefix="media-")
    fp_out = os.fdopen(fileno_out, 'r+b')
    fp_err = os.fdopen(fileno_err, 'r+b')
    p = Popen(command, stdout=fp_out, stderr=fp_err, env=environment)
    xml_dom = MediaInfo.parse_xml_data_into_dom(fp_out.read())
    return MediaInfo(xml_dom)
Both print and print repr() display:
['mediainfo', '-f', '--Output=XML', "/mnt/path/Long 73\xc2\xb0 58' W.avi"]
The filename is:
Long 73° 58' W.avi
Looking at a UTF-8 table, \xc2\xb0 corresponds to °.
I am aware this might just be the console not interpreting the encoding as it should, but the output of mediainfo is just:
<?xml version="1.0" encoding="UTF-8"?>
<Mediainfo version="0.7.58">
which means "file not found".
os.path.isfile(os.path.join(path, to, file))
returns True for these files,
and running the same command in bash works too:
mediainfo -f --Output=XML "/path/to/file"
I have googled and searched around and cannot find the answer.
Any ideas?
I used this new test script:
# -*- coding: utf-8 -*-
import sys
import os
import subprocess as sub
root = "/mnt/path"
for rootfldr in sorted(os.listdir(root)):
    if os.path.isfile(os.path.join(root, rootfldr)):
        command = ['mediainfo', '-f', '--Output=XML', rootfldr]
        aa = sub.Popen(command, stdout=sub.PIPE, stderr=sub.PIPE, stdin=sub.PIPE)
        result = aa.communicate()[0]
        print rootfldr
        print result
And the results were fine (some of them had non-ASCII chars).
I then decided to change this line in the pymediainfo parse function:
p = Popen(command, stdout=fp_out, stderr=fp_err, env=environment)
to:
p = Popen(command, stdout=fp_out, stderr=fp_err)
and the problem was solved.
I am guessing that something is missing and/or wrong in:
ENV_DICT = {
    "PATH": "/usr/local/bin/:/usr/bin/",
    "LD_LIBRARY_PATH": "/usr/local/lib/:/usr/lib/"}
The command looks OK. The filename is passed as a bytestring containing text encoded as UTF-8. If your filesystem uses UTF-8, then it is the correct filename:
>>> print "/mnt/path/Long 73\xc2\xb0 58' W.avi".decode('utf-8')
/mnt/path/Long 73° 58' W.avi
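As a quick sanity check of the same point, here is a minimal sketch (written to run on both Python 2.7 and Python 3) that round-trips the filename between bytes and text and prints the encoding Python assumes for filenames. The path is the one from the question; nothing here touches the filesystem.

```python
# -*- coding: utf-8 -*-
import sys

# The byte string exactly as it appeared in the printed command.
raw = b"/mnt/path/Long 73\xc2\xb0 58' W.avi"

# Decoding as UTF-8 yields the degree sign (U+00B0) ...
text = raw.decode("utf-8")

# ... and encoding the text again gives back the identical bytes,
# so no information is lost in either direction.
round_trip = text.encode("utf-8")

# The encoding Python itself assumes for filenames on this system.
fs_encoding = sys.getfilesystemencoding()
print("filesystem encoding: %s" % fs_encoding)
```

If the printed encoding is not UTF-8, the bytes above may not name the file you expect, which would be a different problem from the one described in the question.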
It might be a bug in pymediainfo. Try passing the environment argument explicitly as a workaround, e.g., environment=os.environ.
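A possible explanation for why this helps: the env argument of Popen replaces the child's environment rather than extending it, so a child started with the ENV_DICT from the question gets no locale variables such as LANG. Whether mediainfo actually relies on those variables to interpret the filename is an assumption here, not something confirmed. The sketch below shows the effect using a Python child process instead of mediainfo; child_sees and env_with_locale are hypothetical helper names, not part of pymediainfo.

```python
import os
import subprocess
import sys

# Copied from the pymediainfo 1.3.5 snippet quoted in the question.
ENV_DICT = {
    "PATH": "/usr/local/bin/:/usr/bin/",
    "LD_LIBRARY_PATH": "/usr/local/lib/:/usr/lib/",
}

# Locale-related variables. Which of these mediainfo consults (if any)
# is an assumption, not taken from its documentation.
LOCALE_VARS = ("LANG", "LC_ALL", "LC_CTYPE")


def child_sees(name, env=None):
    """Return variable `name` as seen by a child process ('None' if unset)."""
    code = "import os; print(os.environ.get(%r))" % name
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    return out.decode("utf-8").strip()


def env_with_locale(base_env, parent_env=None):
    """Hypothetical helper: copy base_env and add the parent's locale variables."""
    if parent_env is None:
        parent_env = os.environ
    merged = dict(base_env)
    for key in LOCALE_VARS:
        if key in parent_env:
            merged[key] = parent_env[key]
    return merged


# env=None means the child inherits the parent's environment unchanged.
print("inherited: LANG=%s" % child_sees("LANG"))
# With env=ENV_DICT the child only gets the two variables listed there.
print("ENV_DICT:  LANG=%s" % child_sees("LANG", env=ENV_DICT))
# Merging the locale variables back in restores them for the child.
print("merged:    LANG=%s" % child_sees("LANG", env=env_with_locale(ENV_DICT)))
```

With the parse signature quoted in the question, the merged dictionary could be passed as MediaInfo.parse(filename, environment=env_with_locale(ENV_DICT)). That keeps the library's PATH and LD_LIBRARY_PATH settings, whereas environment=os.environ hands over the whole parent environment.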