Tags: python, python-3.x, unicode, utf-8, character-encoding

What is the difference between encoding utf-8 and utf8 in Python 3.5?


What is the difference between encoding utf-8 and utf8 (if there is any)?

Given the following example:

u = u'€'
print('utf-8', u.encode('utf-8'))
print('utf8 ', u.encode('utf8'))

It produces the following output:

utf-8 b'\xe2\x82\xac'
utf8  b'\xe2\x82\xac'

Solution

  • There's no difference. See the table of standard encodings in the codecs module documentation. Specifically for 'utf_8', the following are all valid aliases:

    'U8', 'UTF', 'utf8'
    

    Also note the statement in the first paragraph:

    Notice that spelling alternatives that only differ in case or use a hyphen instead of an underscore are also valid aliases; therefore, e.g. 'utf-8' is a valid alias for the 'utf_8' codec.
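
One way to check this yourself is to ask Python which codec each spelling resolves to. The sketch below is only an illustration: the list of spellings is my own choice (the aliases listed above plus a few hyphen/case variants), and it uses the standard library's codecs.lookup(), which returns a CodecInfo object whose name attribute identifies the codec that was found.

```python
import codecs

# Spellings to compare: the aliases listed above plus hyphen/case variants.
# This list is an illustrative choice, not an exhaustive set of aliases.
spellings = ['utf-8', 'utf8', 'utf_8', 'UTF-8', 'U8', 'UTF']

# codecs.lookup() resolves a spelling to a CodecInfo object;
# collecting the .name values in a set shows whether they all agree.
canonical_names = {codecs.lookup(s).name for s in spellings}

# Encode the same character with every spelling and collect the results.
encoded = {'€'.encode(s) for s in spellings}

print(canonical_names)  # a single name means every spelling hit the same codec
print(encoded)          # a single bytes value means the output is identical
```

If both sets contain exactly one element, every spelling resolved to the same codec and produced the same bytes, which is what the question's output already suggests for 'utf-8' and 'utf8'.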