
How to print unicode strings in a Python 2 shell under Windows?


I'm having problems when trying to print symbols such as €, ≤, Å, Ω, ℃, etc., in Python 2.7.11 under Windows 10. I expected that running this piece of code from IDLE:

print u'\u20AC\u2A7D\u212B\u2126\u2103'

would produce the following output on the screen:

>>> ================================ RESTART ================================
>>> 
€⩽ÅΩ℃
>>>

But it didn't. I obtained a funky string of non-ascii characters instead. After struggling for a while, I finally got the expected output by setting up an environment variable:

PYTHONIOENCODING=UTF-8

So far, so good. My problem is that I am unable to get the same output from the Python shell:

>>> print u'\u20AC\u2A7D\u212B\u2126\u2103'
Ôé¼Ô®¢Ôä½ÔäªÔäâ
>>>
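
The garbled line above is consistent with the UTF-8 bytes of the string being displayed one byte at a time by a console that is still on code page 850. The following is a minimal sketch of that reading, assuming PYTHONIOENCODING=UTF-8 is in effect for the shell and the console code page is 850; the variable names are illustrative only.

```python
text = u'\u20AC\u2A7D\u212B\u2126\u2103'

# With PYTHONIOENCODING=UTF-8, each of these symbols is written as three bytes.
utf8_bytes = text.encode('utf-8')

# A console on code page 850 shows one glyph per byte instead of one per symbol.
garbled = utf8_bytes.decode('cp850')

# The garbled line from the session above, spelled with escapes to stay ASCII-only.
observed = u'\xd4\xe9\xbc\xd4\xae\xa2\xd4\xe4\xbd\xd4\xe4\xaa\xd4\xe4\xe2'

print(garbled == observed)  # True
```

If that assumption holds, Python is emitting correct UTF-8 and the mismatch is on the console side.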

I have unsuccessfully tried a number of workarounds I found in answers to similar questions:

  1. Changed the code page from 850 (which is the default on my system) to 65001 (which corresponds to UTF-8 encoding)

  2. Wrapped sys.stdout to ensure the appropriate encoding

    sys.stdout = codecs.getwriter('utf8')(sys.stdout)
    
  3. Even changed - although it is widely discouraged - the default encoding

    sys.setdefaultencoding("UTF-8")
    

None of the above worked for me.

My question is twofold:

  • Why does running print u'\u20AC\u2A7D\u212B\u2126\u2103' from IDLE produce €⩽ÅΩ℃ (as expected), whereas running the same code from the Python shell produces incorrect output?
  • Does anyone have any tips for printing those symbols correctly from the shell?

Solution

  • Why: IDLE uses tkinter, which wraps the tcl/tk GUI framework. Tcl/tk uses unicode strings, like Python 3, except that it is limited to the first 2**16 characters (the Basic Multilingual Plane, BMP). On Windows, the Python shell runs in the Command Prompt console, which uses code pages that are mostly limited to 256 characters. CP65001 seems to be a fraud; you would be joining the large crowd of people who have failed to get it to work over the last decade. (Search the web for "code page 65001".)

  • Tip: Unless you limit output to characters in a working code page, use IDLE to run the program. IDLE has a -r file startup option; see Help => IDLE Help, section 3.1, Command line usage. I don't normally recommend using IDLE to run already developed programs, but I do on Windows for BMP output.
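
For the first half of that tip, here is a minimal sketch of how one might check whether a string fits a given code page before printing it, and fall back to escapes when it does not. The helper names `printable_in` and `console_safe` are hypothetical, not part of any library, and the code page names are only examples.

```python
def printable_in(text, codepage):
    """Return True if every character of text can be encoded in codepage."""
    try:
        text.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


def console_safe(text, codepage):
    """Replace characters missing from codepage with backslash escapes."""
    return text.encode(codepage, 'backslashreplace').decode(codepage)


symbols = u'\u20AC\u2A7D\u212B\u2126\u2103'

# None of these five symbols exists in code page 850.
print(printable_in(symbols, 'cp850'))

# The euro sign is available in code page 1252.
print(printable_in(u'\u20AC', 'cp1252'))

# ASCII-only text that is safe to print on a cp850 console.
print(console_safe(u'Price: 5 \u20AC', 'cp850'))
```

The escaped form is not pretty, but it avoids both garbled output and a UnicodeEncodeError when the console cannot represent a character.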