Showing text representation of Unicode symbol in Python 3


I can do this in my IPython notebook:

print(u"\u2605")

But how do I go the other way, from the symbol to its Unicode escape string? Encoding it as UTF-8 or UTF-16 only gives me the encoded bytes. For example:

print('★'.encode('utf-16'))

b'\xff\xfe\x05&'


Solution

  • You can use unicode-escape encoding:

    >>> '★'.encode('unicode-escape')
    b'\\u2605'
    >>> print('★'.encode('unicode-escape').decode())
    \u2605
    

    Or use ord if you just want to know the code point:

    >>> ord('★')
    9733
    >>> hex(ord('★'))  # as hexadecimal
    '0x2605'
    >>> print(r'\u%04x' % ord('★'))  # zero-padded to four hex digits
    \u2605
    

    UPDATE

    You can also use ascii:

    >>> print(ascii('★'))  # NOTE: the result includes the surrounding quotes
    '\u2605'
    >>> print(ascii('★').strip("'"))
    \u2605