Writing unicode with python - what is wrong with this character

With python 2.7 I am reading as unicode and writing as utf-16-le. Most characters are correctly interpreted. But some are not, for example, u'\u810a', also known as unichr(33034). The following code code does not write correctly:

import codecs
with open('temp.txt','w') as temp:
    temp.write(codecs.BOM_UTF16_LE)     
    text = unichr(33034)  # text = u'\u810a'
    temp.write(text.encode('utf-16-le'))

But either of these things, when replaced above, make the code work.

unichr(33033) and unichr(33035) work correctly.
'utf-8' encoding (without BOM, byte-order mark).

How can I recognize characters that won't write correctly, and how can I write a 'utf-16-le' encoded file with BOM that either prints these characters or some replacement?

Solution

You are opening the file in text mode, which means that line-break characters/bytes will be translated to the local convention. Unfortunately the character you are trying to write includes a byte, 0A, that is interpreted as a line break and does not make it to the file correctly.

Open the file in binary mode instead:

open('temp.txt','wb')