python windows utf-16

Bug with Python UTF-16 output and Windows line endings?


With this code:

test.py

import sys
import codecs

sys.stdout = codecs.getwriter('utf-16')(sys.stdout)

print "test1"
print "test2"

Then I run it as:

test.py > test.txt

In Python 2.6 on Windows 2000, I'm finding that each newline is output as the byte sequence \x0D\x0A\x00, which is wrong for UTF-16: every code unit should be two bytes, so a little-endian CRLF would be \x0D\x00\x0A\x00.

Am I missing something, or is this a bug?


Solution

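  • On Windows, stdout is in text mode by default, so each \x0A byte is expanded to \x0D\x0A after the text has already been encoded as UTF-16. Put stdout into binary mode and expand the newlines yourself before encoding. Try this: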

    import sys
    import codecs
    
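    if sys.platform == "win32":
        # Binary mode stops the C runtime from expanding \x0A bytes
        # to \x0D\x0A after the text has been encoded.
        import os, msvcrt
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)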
    
    class CRLFWrapper(object):
        def __init__(self, output):
            self.output = output
    
        def write(self, s):
            self.output.write(s.replace("\n", "\r\n"))
    
        def __getattr__(self, key):
            # Delegate everything else (flush, close, ...) to the wrapped stream.
            return getattr(self.output, key)
    
    # Newlines are expanded first, then the text is encoded as UTF-16.
    sys.stdout = CRLFWrapper(codecs.getwriter('utf-16')(sys.stdout))
    print "test1"
    print "test2"
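
To see where the stray byte comes from, the two orderings can be compared on an in-memory stream. This is only a minimal sketch under a few assumptions: it is written for Python 3 rather than the Python 2.6 of the question, it uses 'utf-16-le' instead of 'utf-16' so that no byte order mark gets in the way, `io.BytesIO` stands in for a binary-mode stdout, and `simulate_text_mode` is a hypothetical helper that imitates what the Windows text-mode write does to the bytes.

```python
import codecs
import io


def simulate_text_mode(data):
    """Hypothetical helper: imitate a Windows text-mode write,
    which expands every 0x0A byte to 0x0D 0x0A."""
    return data.replace(b"\n", b"\r\n")


class CRLFWrapper(object):
    """Same idea as the wrapper above: expand newlines before encoding."""

    def __init__(self, output):
        self.output = output

    def write(self, s):
        self.output.write(s.replace("\n", "\r\n"))


# Broken order: encode first, then let text mode touch the encoded bytes.
broken = simulate_text_mode("test1\n".encode("utf-16-le"))

# Fixed order: expand the newline in the text, then encode to a binary stream.
raw = io.BytesIO()  # assumption: stands in for stdout in binary mode
out = CRLFWrapper(codecs.getwriter("utf-16-le")(raw))
out.write("test1\n")
fixed = raw.getvalue()

print(repr(broken[-3:]))  # b'\r\n\x00'
print(repr(fixed[-4:]))   # b'\r\x00\n\x00'
```

The broken result has an odd number of bytes, so every character after the first newline is misaligned when the file is read back as UTF-16. In the fixed result, the carriage return and the line feed each occupy a full two-byte code unit.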