Linux/Python: encoding a unicode string for print


I have a fairly large Python 2.6 application with lots of print statements sprinkled about. I'm using unicode strings throughout, and it usually works great. However, if I redirect the output of the application (like "myapp.py >output.txt"), I occasionally get errors such as this:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa1' in position 0: ordinal not in range(128)

I guess the same issue comes up if someone has set their locale to ASCII. I understand the reason for this error perfectly well: there are characters in my unicode strings that cannot be encoded in ASCII. Fair enough. But I'd like my Python program to make a best effort to print something understandable, maybe skipping the offending characters or replacing them with their Unicode code points.

This problem must be common... What is the best practice for handling this problem? I'd prefer a solution that allows me to keep using plain old "print", but I can modify all occurrences if necessary.

PS: I have now solved this problem. The solution was neither of the answers given. I used the method described at http://wiki.python.org/moin/PrintFails, which ChrisJ pointed out in one of the comments. That is, I replace sys.stdout with a wrapper that encodes unicode strings with the correct arguments. Works very well.


Solution

  • I replaced sys.stdout with a wrapper that encodes unicode strings with an explicit encoding and error handler, following the method described at http://wiki.python.org/moin/PrintFails (pointed out by ChrisJ in one of the comments). With that in place, plain print statements work very well, including when the output is redirected.
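
For illustration, here is a minimal sketch of what such a wrapper could look like, assuming the standard codecs.getwriter approach. The helper name wrap_stream and the UTF-8 fallback are assumptions made for this example, not something taken from the original post. The demo writes to an in-memory stream forced to ASCII so the effect of each error handler is visible.

```python
# -*- coding: utf-8 -*-
import codecs
import io


def wrap_stream(byte_stream, encoding=None, errors="backslashreplace"):
    """Wrap a byte stream so unicode text is encoded before it is written.

    Hypothetical helper for illustration only.
    """
    # Python 2 reports sys.stdout.encoding as None when output is redirected,
    # so fall back to UTF-8 here (an assumption; choose what suits the consumer).
    encoding = encoding or getattr(byte_stream, "encoding", None) or "utf-8"
    # codecs.getwriter returns a StreamWriter class for the given encoding;
    # the second constructor argument selects the error handler.
    return codecs.getwriter(encoding)(byte_stream, errors)


# Show how each error handler treats a character that ASCII cannot represent.
text = u"\xa1Hola"
results = {}
for handler in ("ignore", "replace", "backslashreplace", "xmlcharrefreplace"):
    buf = io.BytesIO()
    wrap_stream(buf, encoding="ascii", errors=handler).write(text)
    results[handler] = buf.getvalue()

# results["ignore"]            == b"Hola"        (character skipped)
# results["replace"]           == b"?Hola"       (replaced with "?")
# results["backslashreplace"]  == b"\\xa1Hola"   (escaped code point)
# results["xmlcharrefreplace"] == b"&#161;Hola"  (XML character reference)
```

In a Python 2 program, a wrapper like this would typically be installed once at startup with sys.stdout = wrap_stream(sys.stdout), after which print statements with unicode arguments go through it. Byte strings that contain non-ASCII data can still fail with this kind of wrapper, because Python 2 decodes them as ASCII before encoding.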