Python, Unicode, and the Windows console


When I try to print a string in a Windows console, I sometimes get the error UnicodeEncodeError: 'charmap' codec can't encode character ... I assume this is because the Windows console cannot handle all Unicode characters.

How can I work around this? For example, how can I make the program display a replacement character (such as ?) instead of failing?


Solution

  • Note: This answer dates from 2008 and its code uses Python 2 syntax, so use the solution below with care.


    Here is a page that details the problem and a solution (search the page for the text "Wrapping sys.stdout into an instance"):

    PrintFails - Python Wiki

    Here's a code excerpt from that page:

    $ python -c 'import sys, codecs, locale; print sys.stdout.encoding; \
        sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout); \
        line = u"\u0411\n"; print type(line), len(line); \
        sys.stdout.write(line); print line'
      UTF-8
      <type 'unicode'> 2
      Б
      Б
    
    $ python -c 'import sys, codecs, locale; print sys.stdout.encoding; \
        sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout); \
        line = u"\u0411\n"; print type(line), len(line); \
        sys.stdout.write(line); print line' | cat
      None
      <type 'unicode'> 2
      Б
      Б
    

    That page has more information and is well worth a read.
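
The excerpt above is written in Python 2 syntax. As a rough Python 3 sketch of the same wrapping idea, the helper below wraps a binary stream in a codec writer that substitutes ? for characters the target encoding cannot represent, which is the behavior the question asks for. The name replacing_writer and the choice of cp1252 as the demo encoding are assumptions made for illustration and do not come from the wiki page.

```python
import codecs
import io


def replacing_writer(binary_stream, encoding):
    """Wrap a binary stream so unencodable characters are written as '?'.

    Hypothetical helper for illustration; not taken from the wiki page.
    """
    # codecs.getwriter returns the StreamWriter class for the encoding.
    # errors="replace" substitutes '?' instead of raising UnicodeEncodeError.
    return codecs.getwriter(encoding)(binary_stream, errors="replace")


# Demo against an in-memory stream, using cp1252 as a stand-in for a
# console code page that has no Cyrillic letters (an assumed example).
demo = io.BytesIO()
writer = replacing_writer(demo, "cp1252")
writer.write("\u0411 ok\n")
print(demo.getvalue())  # b'? ok\n'

# For real console output, one possible use (assuming sys.stdout exposes
# a .buffer attribute and a non-None .encoding) would be:
#   sys.stdout = replacing_writer(sys.stdout.buffer, sys.stdout.encoding)
```

On Python 3.7 and later, calling sys.stdout.reconfigure(errors="replace") changes the error handler of the existing stream in place, which avoids replacing sys.stdout altogether.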