
Unable to decode bytearray in Python 3 but possible in Python 2


I am trying to print a bytearray as a string of ascii characters in Python 3.

I have a bytearray which I have tried to print using both Python 2 and Python 3. In Python 2 the bytearray is printed to the console, with the non-ASCII bytes shown as "?". However, when I try to decode it as ASCII in Python 3, I get an error:

Python2:

print(bytearray(b"\x0e6G\xe8Y-5QJ\x08\x12CX%6\xed=\xe6s@Y\x00\x1e?S\\\xe6\'\x102"))

# 6G?Y-5QCX%6?=?s@Y?S\?'2

Python3:

print(bytearray(b"\x0e6G\xe8Y-5QJ\x08\x12CX%6\xed=\xe6s@Y\x00\x1e?S\\\xe6\'\x102").decode("ascii"))

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 3: ordinal not in range(128)

How do I achieve the same behaviour in Python 3 as in Python 2? Does print in Python 2 do something other than simply decode the byte array as ASCII?


Solution

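  • ascii is a 7-bit codec, so it cannot decode bytes above 0x7f such as 0xe8. Use an 8-bit codec such as iso-8859-15 instead. Which 8-bit codec you choose depends on how you want the high-bit bytes mapped to characters. Python 2's print does not decode at all: it writes the raw bytes of the bytearray to the terminal, and the terminal decides how to display the ones it cannot render (here, apparently as "?").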

    >>> print(bytearray(b"\x0e6G\xe8Y-5QJ\x08\x12CX%6\xed=\xe6s@Y\x00\x1e?S\\\xe6\'\x102").decode("iso-8859-15"))
    6GèY-5QCX%6í=æs@Y?S\æ'2
    >>> print(bytearray(b"\x0e6G\xe8Y-5QJ\x08\x12CX%6\xed=\xe6s@Y\x00\x1e?S\\\xe6\'\x102").decode("iso-8859-15").encode("iso-8859-15") == bytearray(b"\x0e6G\xe8Y-5QJ\x08\x12CX%6\xed=\xe6s@Y\x00\x1e?S\\\xe6\'\x102"))
    True
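
If the goal is output that looks like the Python 2 result rather than a particular character mapping, a few alternatives can be sketched as follows. This is only an illustration under stated assumptions: the names data, replaced, latin, question_marks and write_raw are made up for this example, and which option fits depends on what the bytes actually represent.

```python
import sys

data = bytearray(b"\x0e6G\xe8Y-5QJ\x08\x12CX%6\xed=\xe6s@Y\x00\x1e?S\\\xe6\'\x102")

# Option 1: keep the ascii codec but substitute U+FFFD for undecodable bytes.
replaced = data.decode("ascii", errors="replace")

# Option 2: latin-1 maps every byte 0x00-0xFF to the code point with the
# same value, so decoding never fails and the round trip is lossless.
latin = data.decode("latin-1")
assert latin.encode("latin-1") == data

# Option 3: show a literal "?" for every non-ASCII byte (lossy, display only).
question_marks = "".join(chr(b) if b < 0x80 else "?" for b in data)


def write_raw(raw):
    """Hypothetical helper: send the bytes to stdout without decoding them,
    which is roughly what Python 2's print did with a bytearray."""
    sys.stdout.buffer.write(bytes(raw) + b"\n")
    sys.stdout.buffer.flush()
```

Calling write_raw(data) in a normal interpreter session leaves the rendering of the high bytes to the terminal, so the result may differ between systems. Options 1 and 3 discard information, so they are only suitable for display.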