python python-3.x unicode hebrew

How to get the Unicode character from a code point variable?


I have a variable which stores the string "u05e2" (the value keeps changing because I set it within a loop). I want to print the Hebrew letter with that Unicode code point. I tried the following, but it didn't work:

>>> a = 'u05e2'
>>> print(u'\{}'.format(a))

I got \u05e2 instead of ע (in this case).

I also tried to do:

>>> a = 'u05e2'
>>> b = '\\' + a
>>> print(u'{}'.format(b))

Neither one worked. How can I fix this?

Thanks in advance!


Solution

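  • If the value is written as a literal, all you need is a \ before u05e2, so that Python reads it as a Unicode escape sequence. In Python 3 every str is already Unicode, so the u prefix on the format string is optional.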

    a = '\u05e2'
    print(u'{}'.format(a))
    
    #Output
    ע
    

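    The other approach, putting the \ in the format string, does not work because escape sequences are resolved only once, when Python parses a string literal. In u'\{}' the \{ is not a recognized escape, so the backslash is kept as a literal character, and format() then simply pastes u05e2 after it. The result is the six-character text \u05e2, not the single character ע.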

    a = 'u05e2'
    print(u'\{}'.format(a))
    
    #Output
    \u05e2
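
The difference can be checked directly. The short sketch below compares a literal that contains the escape with a string assembled at run time; the variable names literal and built are illustrative only and do not come from the question.

```python
# An escape sequence in a literal is resolved when the source is parsed.
literal = '\u05e2'        # one character: HEBREW LETTER AYIN

# Joining a backslash and text at run time never triggers that step.
built = '\\' + 'u05e2'    # six characters: \ u 0 5 e 2

print(len(literal))       # 1
print(len(built))         # 6
print(literal == built)   # False
```

The two strings print differently because they hold different characters, not because print() treats them differently.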
    

    One way to check whether a string really holds a single Unicode character is the built-in ord() function. It returns the Unicode code point (an integer) of the character passed to it, and it accepts only a string of length one.

    a = '\u05e2'
    print(ord(a)) #1506, the Unicode code point for the Unicode string stored in a
    

    To print the character for that code point (1506), use the c presentation type in the format specification, which converts an integer to the corresponding Unicode character. This is explained in the Python docs.

    print('{0:c}'.format(1506))
    
    #Output
    ע
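
For the situation in the question, where the variable holds the text 'u05e2' at run time, one possible approach is to parse the hex digits and pass the resulting integer to the built-in chr(). This is a minimal sketch: the helper name char_from_code is made up for illustration, and it assumes the value always has the form u followed by hexadecimal digits.

```python
def char_from_code(code):
    """Return the character for a string such as 'u05e2' (hypothetical helper)."""
    # Drop the leading 'u' and read the rest as a base-16 integer.
    code_point = int(code[1:], 16)
    # chr() is the inverse of ord(): it maps a code point to a character.
    return chr(code_point)


# The value changes inside a loop, as in the question.
for a in ['u05e2', 'u05d0', 'u05d1']:
    print(char_from_code(a))
```

If the input might not match the assumed form, int() raises ValueError, which the caller could catch.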
    

    If we pass the string 'u05e2' to ord(), we get an error. Without the backslash it is not an escape sequence but five separate characters (u, 0, 5, e, 2), so it does not represent a single character.

    a = 'u05e2'
    print(ord(a))
    
    #Error
    TypeError: ord() expected a character, but string of length 5 found
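
Since 'u05e2' is ordinary five-character text, adding the backslash at run time only helps if the escape is then decoded explicitly. One way to do that, sketched here under the assumption that the value is plain ASCII and comes from a trusted source, is the unicode_escape codec:

```python
a = 'u05e2'

# Build the six-character text \u05e2, then ask the codec to interpret it.
escaped = '\\' + a
char = escaped.encode('ascii').decode('unicode_escape')

print(char)       # ע
print(ord(char))  # 1506
```

This mirrors the second attempt in the question; the missing piece there was the decoding step.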