python encoding wxpython

Output console in wxpython can't print cyrillic


Even though I have #coding=utf-8 at the top of my .py file and convert Cyrillic strings to utf-8 before passing them to the console, it still gives me:

  File "C:\Python27\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 16: character maps to <undefined>

What else can I do?

This is my to_utf8 function:

def to_utf8(obj):
    if isinstance(obj, dict):
        return dict([(to_utf8(key), to_utf8(value)) for key, value in obj.iteritems()])
    elif isinstance(obj, list):
        return [to_utf8(element) for element in obj]
    elif isinstance(obj, unicode):
        return obj.encode('utf-8')
    else:
        return obj 

Solution

  • You are going the wrong way: you should decode to unicode, not encode to UTF-8. The bytes in your str are evidently UTF-8, but Python does not care what is in a str. From Python's viewpoint, a sequence of UTF-8-encoded Unicode code points is just another sequence of bytes.

    What remains unanswered is why it tries to decode those bytes as cp1252; I don't know the reason.

    If you explicitly decode the UTF-8 bytes yourself before handing them over, it works. Equally, if you write the literal with an explicit u prefix, Python does know what is in the character sequence (it is a unicode object now). Keep in mind that str != unicode != utf8.

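    # -*- coding: utf-8 -*-
    import wx

    # works: decode the UTF-8 bytes to unicode explicitly
    mystr = "СТАЛИНГРАД".decode('utf8')
    # this also works: a unicode literal
    mystr = u"СТАЛИНГРАД"
    # uncomment to make the code fail: a plain byte string (str)
    #mystr = "СТАЛИНГРАД"

    app = wx.App(0)
    frm = wx.Frame(None, -1, mystr)  # mystr is used as the frame title
    frm.Show()
    app.MainLoop()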
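
The difference between the failing and the working variants above can be reproduced without wx. The following is only a sketch for illustration: it assumes the byte string is decoded with cp1252 somewhere along the way (as the traceback in the question suggests) and uses the same example word, so the failing position differs from the one in the question.

```python
# -*- coding: utf-8 -*-
# Illustration only; no wx required. Runs on Python 2.7 and Python 3.

text = u"СТАЛИНГРАД"

# These are the bytes a plain "..." literal holds in a UTF-8 encoded
# Python 2 source file.
raw = text.encode('utf-8')

# The third letter, 'А' (U+0410), is encoded as the two bytes 0xD0 0x90.
assert raw[4:6] == b'\xd0\x90'

# Python's cp1252 codec has no character for byte 0x90, so decoding the
# UTF-8 bytes as cp1252 raises the same kind of error as in the question.
try:
    raw.decode('cp1252')
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start  # index of the offending byte (5 for this word)

# Decoding with the encoding the bytes were actually written in works.
roundtrip = raw.decode('utf-8')
```

So the bytes themselves are fine; they only need to be decoded with the right codec before the GUI toolkit sees them.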
    

    wxPython 3.0 is unicode only and accepts both UTF-8-encoded str and unicode objects.
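
For older wxPython versions, a counterpart to the to_utf8 function from the question that goes in the opposite direction (bytes to unicode) might look like the sketch below. The name to_text and the default encoding are assumptions made for this example; the function only helps if the byte strings really are UTF-8.

```python
# -*- coding: utf-8 -*-

def to_text(obj, encoding='utf-8'):
    """Recursively decode byte strings to text (unicode on Python 2).

    Hypothetical helper mirroring to_utf8 from the question, but decoding
    instead of encoding. Objects that are not byte strings, lists or
    dicts are returned unchanged.
    """
    if isinstance(obj, dict):
        return dict((to_text(key, encoding), to_text(value, encoding))
                    for key, value in obj.items())
    elif isinstance(obj, list):
        return [to_text(element, encoding) for element in obj]
    elif isinstance(obj, bytes):  # on Python 2, bytes is the same type as str
        return obj.decode(encoding)
    else:
        return obj


# Example: UTF-8 bytes for the letters 'С' and 'Т' become unicode text.
decoded = to_text({b'title': [b'\xd0\xa1', b'\xd0\xa2', 1]})
```

The result can then be passed to wx widgets in the same way as the u"..." literal in the example above.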