I'm trying to construct and print a Unicode string with Python 3.x. So, for example, the following works fine:
a = '\u0394'
print(a)
Δ
But if I try to construct this by concatenating two strings, I run into problems:
a = '\u'
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
a = '\\u'
b = '0394'
c = a + b
print(c)
\u0394
What am I missing here?
\uhhhh is an escape sequence, a notation used in string literals. Python resolves it when it parses the literal, so you can't construct that notation from parts at runtime, at least not directly like that.
Generally, you'd use the chr() function to produce individual characters from an integer instead:
>>> chr(int('0394', 16))
'Δ'
Here, int('0394', 16) first interprets the hex string 0394 as an integer in base 16, and chr() then returns the character with that code point.
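As a minimal sketch of the same idea wrapped for reuse: the helper name char_from_hex and the sample list of code points are my own, purely for illustration, not anything from the standard library.

```python
def char_from_hex(hex_digits: str) -> str:
    """Return the character whose code point is given as a hex string.

    char_from_hex is a hypothetical helper name used for illustration.
    """
    # int(..., 16) parses the hex digits; chr() maps the integer to a character.
    return chr(int(hex_digits, 16))


delta = char_from_hex('0394')

# Build a longer string from several hex code points (illustrative values).
word = ''.join(char_from_hex(h) for h in ['0394', '03B5', '03BB', '03C4', '03B1'])

print(delta)
print(word)
```

Because chr() accepts any code point up to 0x10FFFF, this should also work for characters outside the four-digit \uhhhh range, such as '1F600'.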
If you must start from the Python string literal escape notation as text, use codecs.decode() with the unicode_escape codec:
>>> import codecs
>>> r'\u' + '0394'
'\\u0394'
>>> codecs.decode(r'\u' + '0394', 'unicode_escape')
'Δ'
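One caveat, as far as I know: unicode_escape can mangle non-ASCII characters that are already present in the text. If that matters, a possible alternative is to expand only the \uhhhh sequences yourself. The sketch below assumes four-digit escapes only and does not combine surrogate pairs; the names U_ESCAPE and expand_u_escapes are hypothetical.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
# U_ESCAPE and expand_u_escapes are illustrative names, not a library API.
U_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')


def expand_u_escapes(text: str) -> str:
    """Replace each literal \\uhhhh sequence in text with its character."""
    return U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


raw = '\\u' + '0394' + ' is delta'
print(expand_u_escapes(raw))
```

Everything that is not a \uhhhh sequence, including existing non-ASCII characters, is left untouched by this approach.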