How do I specify the encoding when running a Python script as a module?
For example, I want to run my_script.py as python -m my_script -utf8. But there is no such option. Instead, I have to declare the encoding at the top of my_script.py, and that still fails with some Python 2.7 packages.
Consider the following scenario:
my_script.py:
# coding=utf-8
from pyglet.gl import *
$ cd ~/Documents
$ mkdir вафля
$ cd вафля
Create my_script.py there with the code above.
python my_script.py -- works well
python -m my_script -- fails
Workstation: Ubuntu 14.04.3 x64 + Python 2.7.6 x64 (built-in)
Please do not suggest switching to Python 3.4: I have already done that and just want to support both Python 2.7 and 3.4.
Added traceback.
File "my_script.py", line 22, in <module>
from pyglet.gl import *
File "/usr/local/lib/python2.7/dist-packages/pyglet/gl/__init__.py", line 236, in <module>
import pyglet.window
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/__init__.py", line 1817, in <module>
gl._create_shadow_window()
File "/usr/local/lib/python2.7/dist-packages/pyglet/gl/__init__.py", line 205, in _create_shadow_window
_shadow_window = Window(width=1, height=1, visible=False)
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/xlib/__init__.py", line 163, in __init__
super(XlibWindow, self).__init__(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/__init__.py", line 559, in __init__
self._create()
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/xlib/__init__.py", line 353, in _create
self.set_caption(self._caption)
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/xlib/__init__.py", line 511, in set_caption
self._set_text_property('WM_NAME', caption, allow_utf8=False)
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/xlib/__init__.py", line 785, in _set_text_property
buf = create_string_buffer(value.encode('ascii', 'ignore'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 19: ordinal not in range(128)
This appears to be a bug in pyglet. It uses sys.argv[0] as its default window caption, but it expects the caption to be a unicode instance, which it can later encode to ASCII (ignoring non-representable characters). However, in Python 2, sys.argv[0] is a bytestring (a str instance) in some encoding (I'm not sure whether that encoding is specified anywhere or whether it varies from filesystem to filesystem). When you try to encode an already encoded bytestring, Python 2 first tries to decode it to a unicode object using the ascii codec before encoding as requested. That implicit decode is what raises the UnicodeDecodeError in your traceback.
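To make the implicit decode concrete, here is a minimal sketch that emulates the two steps Python 2 performs when encode is called on a bytestring. The helper py2_style_encode and the example path are made up for illustration; they are not part of pyglet or of the standard library.

```python
# -*- coding: utf-8 -*-
# Sketch: emulate what Python 2 does for str.encode() on a bytestring.
# The path below is a made-up example, not the asker's real path.
path_bytes = u"/home/user/Documents/вафля/my_script.py".encode("utf-8")


def py2_style_encode(bytestring, encoding, errors="strict"):
    """Hypothetical helper: implicit strict ASCII decode, then the requested encode."""
    text = bytestring.decode("ascii")      # the implicit step; 'errors' is not applied here
    return text.encode(encoding, errors)   # never reached for non-ASCII input


try:
    py2_style_encode(path_bytes, "ascii", "ignore")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(type(error).__name__)
```

The 'ignore' error handler only applies to the final encode step, which is why it does not protect against the failure in the implicit decode.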
You're seeing this bug only when you use the -m flag because only in that situation (of the ways you tested) is the non-ASCII part of the path included in sys.argv[0]. When you call python my_script.py, sys.argv[0] is "my_script.py". When you use -m, sys.argv[0] is the absolute path to the script file (including the non-ASCII folder).
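If you want to check the difference in sys.argv[0] yourself, a small experiment along these lines should show it. This is only a sketch: the file name show_argv.py and the temporary directory are assumptions made for the demonstration, and it launches the interpreter it is run with.

```python
import os
import subprocess
import sys
import tempfile

# Write a throwaway module that only reports its own sys.argv[0].
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "show_argv.py"), "w") as f:
    f.write("import sys\nprint(sys.argv[0])\n")


def run(args):
    """Run the current interpreter inside workdir and return its stripped output."""
    out = subprocess.check_output([sys.executable] + args, cwd=workdir)
    return out.decode("utf-8").strip()


as_script = run(["show_argv.py"])        # same as: python show_argv.py
as_module = run(["-m", "show_argv"])     # same as: python -m show_argv

print(as_script)
print(as_module)
```

The first line printed is the bare file name, while the second is the full path of the module file, which is where a non-ASCII directory name would end up.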
I'm not sure exactly what a proper fix would be since, as I mentioned above, I'm not sure the encoding used by sys.argv is well specified in Python 2. If you want to fix the issue just for your system, you can probably change these lines in pyglet/window/__init__.py (they should be roughly lines 555-556):
if caption is None:
    caption = sys.argv[0]
To:
if caption is None:
    caption = sys.argv[0].decode("utf8")
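Since you want to support both 2.7 and 3.4, keep in mind that in Python 3 sys.argv[0] is already a text string and has no decode method, so the one-line patch above is specific to Python 2. A version-neutral variant could look roughly like the sketch below. The function default_caption is a hypothetical helper, not pyglet API, and falling back to sys.getfilesystemencoding() (then "utf-8") is my assumption about a reasonable encoding guess.

```python
# -*- coding: utf-8 -*-
import sys


def default_caption(argv0, encoding=None):
    """Hypothetical helper: return argv0 as text, decoding only bytestrings."""
    if isinstance(argv0, bytes):  # on Python 2, bytes is the same type as str
        # Assumption: the filesystem encoding is a sensible guess for argv.
        encoding = encoding or sys.getfilesystemencoding() or "utf-8"
        return argv0.decode(encoding, "replace")
    return argv0  # already text (Python 3 str or Python 2 unicode)


caption = default_caption(sys.argv[0])
# The later ASCII encode no longer triggers an implicit decode.
buf = caption.encode("ascii", "ignore")
```

Using "replace" for the decode means a wrong encoding guess degrades the caption instead of raising an exception.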