So I have a stream of bytes, which I collect into a list like so:
byte_list.append(bytes[0])
This format decodes the bytes into integers (one of the few quirks I am finding about Python is that it decodes bytes into ASCII or integers without my asking).
So after a while I have this list of bytes:
byte_list = [83, 0, 116, 0, 97, 0, 110, 0, 100, 0, 97, 0, 114, 0, 100, 0, 70, 0, 105, 0, 114, 0, 109, 0, 97, 0, 116, 0, 97, 0, 46, 0, 105, 0, 110, 0, 111]
How can I decode this list into its string value? What I thought of was:
for b in byte_list:
    new_list.append(chr(byte_list[b]))
But this does not seem correct. Could someone please offer guidance on how to decode this?
So I have a stream of bytes
And you want text. Looking at the data, it is in UTF-16LE (little-endian) encoding. Decode it:
>>> byte_list = [83, 0, 116, 0, 97, 0, 110, 0, 100, 0, 97, 0, 114, 0, 100, 0, 70, 0, 105, 0, 114, 0, 109, 0, 97, 0, 116,
0, 97, 0, 46, 0, 105, 0, 110, 0, 111, 0]
>>> bytes(byte_list).decode('utf-16le')
'StandardFirmata.ino'
Note that I added a final zero because the data was one byte short of a complete UTF-16 stream. I assume the data was just a sample and not complete. UTF-16 uses either two or four bytes per character, so a valid stream always has an even number of bytes.
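If the bytes really do arrive piecemeal, a two-byte code unit can be split across reads. A minimal sketch of one way to handle that, using the standard library's incremental decoder; the chunks list here is made-up sample data, not the asker's actual stream:

```python
import codecs

# Hypothetical chunks, split so that a 2-byte code unit straddles a boundary.
chunks = [b"S\x00t", b"\x00a\x00", b"n\x00d\x00"]

# The incremental decoder buffers a dangling byte until its partner arrives.
decoder = codecs.getincrementaldecoder("utf-16-le")()
text = "".join(decoder.decode(chunk) for chunk in chunks)

# final=True flushes the decoder and raises if a partial code unit is left over.
text += decoder.decode(b"", final=True)
print(text)
```

This avoids padding the data by hand: an incomplete trailing byte is reported as an error at the end instead of being silently patched.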
If you started with a byte stream, it is a list of values 0-255. It is only displayed, for convenience, as ASCII:
>>> bytes(byte_list)
b'S\x00t\x00a\x00n\x00d\x00a\x00r\x00d\x00F\x00i\x00r\x00m\x00a\x00t\x00a\x00.\x00i\x00n\x00o\x00'
In bytes format, you just need to .decode() it into Unicode text.
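For comparison with the loop in the question, here is a small sketch, assuming the sample data with the trailing zero restored. The original loop indexes the list by byte value (byte_list[83] on the first pass), which raises IndexError on a list this short. Even with that fixed, chr() on each value treats every byte as its own character. Collecting into a bytearray (the name buffer is just for illustration) keeps the data in a form that can be decoded directly:

```python
# Sample values from the question, with the trailing zero restored.
byte_list = [83, 0, 116, 0, 97, 0, 110, 0, 100, 0, 97, 0, 114, 0, 100, 0,
             70, 0, 105, 0, 114, 0, 109, 0, 97, 0, 116, 0, 97, 0, 46, 0,
             105, 0, 110, 0, 111, 0]

# chr() per byte: every zero byte becomes a NUL character in the result.
per_byte = "".join(chr(b) for b in byte_list)
print(len(per_byte))

# Collect into a bytearray instead of a list, then decode in one step.
buffer = bytearray()
for value in byte_list:
    buffer.append(value)
decoded = buffer.decode("utf-16-le")
print(decoded)
```

The per-byte string is twice as long as the decoded one because the NULs are still embedded in it.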