import base64
base64.b64encode(b'bytes required')
>>>b'Ynl0ZXMgcmVxdWlyZWQ='
If I understand correctly, base64 is a bytes <----> string notation. Then why doesn't it give me string 'Ynl0ZXMgcmVxdWlyZWQ='
directly?
Or does it expect me to do some decoding furthermore?
Like b'Ynl0ZXMgcmVxdWlyZWQ='.decode('ascii')
or
b'Ynl0ZXMgcmVxdWlyZWQ='.decode('utf-8')
? But they result in the same thing.
You are correct that base64
is meant to be a textual representation of binary data.
However, you are neglecting constraints on the actual implementation side of things.
import sys
>>> sys.getsizeof("Hello World")
60
>>> sys.getsizeof("Hello World".encode("utf-8"))
44
str
objects simply take up more system resources than bytes
. This overhead can lead to non-trivial degradation in performance when working with larger bodies of base64
encoded data.
I also suspect that since the original python module was ported from python2.7 (which did not distinguish between str
and bytes
), that this might also just be a legacy part of the language.