python, text, lotus-notes, python-unicode

Read a Unicode text file using Python


When I read the file, the text comes out as odd special characters that I don't understand. How do I get Python to read the file the same way Notepad++ does?

The text file came from exporting an email from Lotus Notes 9 as Unicode text.


Solution

  • "Unicode" text on Windows generally means UTF-16LE with a byte order mark (BOM). On Python 2.x, open the file with codecs.open(filename, encoding='utf-16'), as described in the Unicode HOWTO section on reading Unicode data. On Python 3.x, the built-in open(filename, encoding='utf-16') is enough. In both cases the 'utf-16' codec uses the BOM to detect the byte order and removes it from the decoded text.

    Writing it out again will depend on what encoding you're trying to write to.
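
A minimal Python 3 sketch of the read-then-rewrite round trip described above. It assumes the export really is UTF-16 with a BOM, and that UTF-8 is the encoding you want to write. The helper names, the file names, and the sample text are made up for illustration; the script builds its own sample file so it can run on its own.

```python
import os
import tempfile


def read_utf16_text(path):
    """Read a UTF-16 text file; the codec uses the BOM to pick the byte order."""
    with open(path, encoding="utf-16") as f:
        return f.read()


def write_utf8_text(path, text):
    """Write text back out as UTF-8 (an assumed target encoding)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Build a stand-in for the Lotus Notes export: BOM (FF FE) + UTF-16LE bytes.
sample = "Subject: caf\u00e9 invoice, \u20ac5\n"
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "export.txt")        # hypothetical file name
dst = os.path.join(tmp_dir, "export_utf8.txt")   # hypothetical file name

with open(src, "wb") as f:
    f.write(b"\xff\xfe" + sample.encode("utf-16-le"))

text = read_utf16_text(src)   # decoded str, BOM already removed
write_utf8_text(dst, text)
```

If the decoded text still looks wrong, the file is probably not UTF-16; checking the first few bytes in binary mode (open(path, 'rb').read(4)) shows whether a BOM is present.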