
Translating Unicode characters from input


I have a string containing Unicode escape sequences that I need to decode. When I hardcode the string in Python, it works. However, when I read it through input(), the escapes are not translated. For example,

input_0 = input() #f\u00eate
print(input_0) # prints f\u00eate
word = "f\u00eate"
print(word) # prints fête

How could I turn the Unicode parts of the string from the input into regular characters? I have tried using str(word) too.


Solution

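  • What you get from input() is an ordinary str whose contents are exactly the characters that were typed. Escape sequences are only interpreted in string literals in source code, so in the input \u00ea is not one character but six literal characters: a backslash, u, and four hex digits.

    To interpret the escapes, encode the string with "raw-unicode-escape" and then decode the resulting bytes with "unicode-escape":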

    input_0 = input()  # f\u00eate
    print(input_0.encode("raw-unicode-escape").decode("unicode-escape"))
    

    Explanation for these two encodings: https://docs.python.org/3/library/codecs.html#text-encodings
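
As a minimal sketch, the same round trip can be wrapped in a function and checked without typing anything at a prompt. The names decode_unicode_escapes and decode_u_escapes_only are hypothetical helpers, not part of the answer above. The second one is an alternative that assumes you only want \uXXXX sequences translated, since "unicode-escape" also interprets other escapes such as \n and \t.

```python
import re


def decode_unicode_escapes(text: str) -> str:
    # Hypothetical helper: the encode/decode round trip from the answer.
    # Interprets every backslash escape the "unicode-escape" codec knows.
    return text.encode("raw-unicode-escape").decode("unicode-escape")


def decode_u_escapes_only(text: str) -> str:
    # Hypothetical alternative: replace only \uXXXX (four hex digits)
    # and leave every other backslash sequence untouched.
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


# "f\\u00eate" in source code is the same text a user would type as f\u00eate
typed = "f\\u00eate"
print(len(typed))                     # 9: the escape is still six separate characters
print(decode_unicode_escapes(typed))  # fête
print(decode_u_escapes_only(typed))   # fête
```

The two helpers differ on input such as a\nb typed literally: the codec version turns the \n into a real newline, while the regex version leaves it as a backslash followed by n.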