
Python: How can I append literal bytes to a string with no decoding?


In Python, a string can contain arbitrary byte values via "\x??" escapes. These values don't have to map to a character in any particular encoding. For example, we can write "\xa0", even though 0xa0 on its own is not a valid UTF-8 byte sequence.

However, if I have a bytes object, such as b'\xa0', I can't append it to a string without decoding it. What if I just want to append it literally, as if I had written "\xa0"?

How can I append a series of bytes to a string without decoding them at all, just like "\x" escape chars? Is there a "literal decoding" or "no decoding" option to decode()? If not, is there another way to do this?


Solution

  • First, consider whether storing these in a string is truly the best choice for your use case. Storing them as bytes/bytearray is usually the more idiomatic option.

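    However, if you have considered this and still decide to proceed, pass "latin1" as the encoding argument to bytes.decode. Latin-1 maps each byte value 0x00-0xFF to the Unicode code point with the same number, so every byte decodes successfully and becomes the character "\x??" with the matching value. Encoding the result with "latin1" recovers the original bytes.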
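
A minimal sketch of the latin1 approach, assuming Python 3. The variable names (`raw`, `text`, `combined`) are made up for illustration and do not come from the question:

```python
# Illustrative values only: any bytes object works here.
raw = b"\xa0\xff"
text = "abc"

# "latin1" maps each byte 0xNN to the code point U+00NN, so decoding
# never raises and nothing is reinterpreted as a multi-byte sequence.
combined = text + raw.decode("latin1")

# The appended characters equal the corresponding "\x??" escapes.
print(combined == "abc\xa0\xff")

# Encoding with the same codec gives the original bytes back.
print(combined.encode("latin1") == b"abc" + raw)
```

Note that the round trip only holds while the string contains code points up to U+00FF. If other text with higher code points is mixed in, `encode("latin1")` will raise `UnicodeEncodeError`, which is one more reason to prefer bytes/bytearray where possible.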