In Python 2, I can produce these hex bytes, represented as a string, all day:
'\x00\xaa\xff'
>>> '00'.decode('hex') + 'aa'.decode('hex') + 'ff'.decode('hex')
'\x00\xaa\xff'
Similarly, I can do this in Python 3:
>>> bytes.fromhex('00') + bytes.fromhex('aa') + bytes.fromhex('ff')
b'\x00\xaa\xff'
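As a side note, the three calls can be collapsed into one in Python 3. This is a small sketch using only the standard library; the variable names are just for illustration.

```python
import binascii

# bytes.fromhex accepts the whole hex string at once
one_call = bytes.fromhex('00aaff')

# binascii.unhexlify is an equivalent standard-library route
same = binascii.unhexlify('00aaff')

# bytes.hex() goes back the other way (Python 3.5+)
back = one_call.hex()

print(one_call, same, back)
```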
According to py2->py3 changes here
Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however encoded Unicode is represented as binary data.
So in the Py2 version the output is a string, while in the Py3 version it is binary data of type bytes.
But I really need a string version!
According to the aforementioned doc:
As the str and bytes types cannot be mixed, you must always explicitly convert between them. Use str.encode() to go from str to bytes, and bytes.decode() to go from bytes to str. You can also use bytes(s, encoding=...) and str(b, encoding=...), respectively.
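A minimal sketch of the conversions the doc describes, using UTF-8 and a sample string chosen purely for illustration:

```python
text = 'héllo'

# str -> bytes
data = text.encode('utf-8')

# bytes -> str
decoded = data.decode('utf-8')

# The constructor forms mentioned in the doc do the same thing
data_via_ctor = bytes(text, encoding='utf-8')
text_via_ctor = str(data, encoding='utf-8')

print(data, decoded)
```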
Ok, so now I have to decode this binary data of type bytes…
>>> b'\x00\xaa\xff'.decode()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xaa in position 1: invalid start byte
Oops! I don’t care about UTF-8 encodings here.
Can I just get a dummy pass-through codec?
PS: Why do I need '\x00\xaa\xff' instead of b'\x00\xaa\xff'?
Because I am taking this string and passing it into a CRC function written in pure Python:
crc16pure.crc16xmodem('\x00\xaa\xff')
This function expects to iterate through a string composed of bytes.
If I give the function b'\x00\xaa\xff', iterating over it yields integers rather than one-character strings, which this function apparently does not handle.
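For what it's worth, a bytes object is iterable in Python 3; the difference is in what the iteration yields. The sketch below assumes the CRC code calls something like ord() on each element, which is a guess about how such Python 2 era code is typically written, not something confirmed from the crc16pure source.

```python
data = b'\x00\xaa\xff'
text = '\x00\xaa\xff'

# Iterating bytes yields integers in Python 3
from_bytes = list(data)

# Iterating str yields one-character strings
from_str = list(text)

# ord() works on one-character strings
from_ord = [ord(c) for c in text]

# ord() on an int raises TypeError, so ord-based code breaks on bytes
try:
    [ord(b) for b in data]
    ord_on_bytes_failed = False
except TypeError:
    ord_on_bytes_failed = True

print(from_bytes, from_str, ord_on_bytes_failed)
```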
The question: Can I just get a dummy pass-through codec?
The answer: Yes, use iso-8859-1
In Python 3, the following doesn't work:
b'\x00\xaa\xff'.decode()
The default codec, 'utf-8', can't decode byte 0xaa.
As long as you don't care about the character set (as in, which character you see when you print()) and just want a string of 8-bit chars like what you would get in Python 2, use the 8-bit codec iso-8859-1:
b'\x00\xaa\xff'.decode('iso-8859-1')
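The reason this works as a pass-through is that iso-8859-1 maps each byte value 0..255 to the code point with the same number, so the conversion is lossless in both directions. Here is a rough sketch; toy_checksum is a hypothetical stand-in for a function that iterates a string with ord(), not the real crc16xmodem.

```python
def toy_checksum(s):
    """Hypothetical ord()-based routine, standing in for Python 2 style code."""
    total = 0
    for ch in s:
        total = (total + ord(ch)) & 0xFFFF
    return total

raw = b'\x00\xaa\xff'

# Each byte becomes the character with the same code point
text = raw.decode('iso-8859-1')

# The string can now be fed to ord()-based code
checksum = toy_checksum(text)

# Encoding with the same codec recovers the original bytes
round_trip = text.encode('iso-8859-1')

# The mapping is lossless for every possible byte value
all_bytes = bytes(range(256))
all_text = all_bytes.decode('iso-8859-1')
lossless = all_text.encode('iso-8859-1') == all_bytes

print(repr(text), checksum, round_trip, lossless)
```

If the function can be modified, another option is to iterate over the bytes object directly and drop the ord() call, since the elements are already integers.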