Tags: python, python-2.7, unicode, utf-8

How to convert a string to utf-8 in Python


I have a browser which sends UTF-8 characters to my Python server, but when I retrieve them from the query string, Python returns them as ASCII. How can I convert the plain string to UTF-8?

NOTE: The string passed from the web is already UTF-8 encoded; I just want Python to treat it as UTF-8, not ASCII.


Solution

  • In Python 2

    >>> plain_string = "Hi!"
    >>> unicode_string = u"Hi!"
    >>> type(plain_string), type(unicode_string)
    (<type 'str'>, <type 'unicode'>)
    

    ^ This is the difference between a byte string (plain_string, type str) and a unicode string (unicode_string, type unicode).

    >>> s = "Hello!"
    >>> u = unicode(s, "utf-8")
    

    ^ This converts the byte string to unicode, decoding its bytes as UTF-8.

  • In Python 3

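    All strings are unicode, and the unicode function no longer exists. Raw data is held in a separate bytes type, which is converted to a string by calling .decode("utf-8") on it. See the answer from @Noumenon.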
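
For Python 3, the conversion might look roughly like the sketch below. It assumes the server gives you either raw bytes or a percent-encoded query string. The helper name decode_utf8 and the sample values are made up for illustration; they are not from the question.

```python
from urllib.parse import parse_qs


def decode_utf8(raw):
    """Return text for raw: bytes are decoded as UTF-8, str is returned as-is.

    Hypothetical helper, named here only for illustration.
    """
    if isinstance(raw, bytes):
        return raw.decode("utf-8")
    return raw


# "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
raw_bytes = b"caf\xc3\xa9"
text = decode_utf8(raw_bytes)

# parse_qs percent-decodes values as UTF-8 by default (encoding="utf-8").
params = parse_qs("name=caf%C3%A9")
```

Here text is a four-character str, and params maps "name" to a one-element list holding the same string. If the incoming bytes are not valid UTF-8, .decode("utf-8") raises UnicodeDecodeError, so a real handler may want to catch that or pass errors="replace".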