I want to decode a series of strings of variable length which have been encoded in UTF16-BE preceded by a two-bytes long big-endian integer indicating the half the byte-length of the following string. e.g:
Length String (encoded) Length String (encoded) ...
\x00\x05 \x00H\x00e\x00l\x00l\x00o \x00\x06 \x00W\x00o\x00r\x00l\x00d\x00! ...
All these strings and their length headers are concatenated in one big bytestring
.
I have the encoded bytestring as bytes
object in memory. I would like to have an iterable function which would yield strings until it reaches the end of the ByteString
.
Not a huge improvement, but your code can be streamlined a bit.
def decode_strings(byte_string: ByteString) -> Generator[str]:
with io.BytesIO(byte_string) as stream:
while (s := stream.read(2)):
length = int.from_bytes(s, byteorder="big")
yield bytes.decode(stream.read(length), encoding="utf_16_be")