Looking at the types str and bytes in Python, it turns out they are very similar. The only differences in their attributes are:
>>> set(dir(bytes)) - set(dir(str))
{'hex', 'fromhex', 'decode'}
>>> set(dir(str)) - set(dir(bytes))
{'isidentifier', 'encode', 'isdecimal', 'isnumeric', 'casefold', 'format', 'isprintable', 'format_map'}
After checking the Python documentation, I concluded that these differences should not matter for their relation to the abstract base class collections.abc.ByteString. However, bytes is regarded as a subclass while str is not:
>>> import collections.abc
>>> issubclass(bytes, collections.abc.ByteString)
True
>>> issubclass(str, collections.abc.ByteString)
False
While the observed behaviour is useful for telling these types apart, I do not understand why Python behaves that way. In my understanding of Python's duck typing concept, both str and bytes should be regarded as subclasses, as long as they provide the relevant attributes.
A str isn't a string of bytes. ByteString's meaning isn't captured by its methods alone, and str does not fit the meaning of ByteString. (The ABC mostly exists as a way to bundle bytes and bytearray for isinstance checks, hence the "This unifies bytes and bytearray." in its docstring.)
You might wonder why issubclass doesn't automatically consider str a ByteString subclass anyway based on its methods. Unless an ABC specifically implements __subclasshook__ to check for methods, issubclass will not automatically consider a class a subclass of that ABC based on the presence of any particular methods. bytes and bytearray are subclasses of ByteString because they are explicitly registered as subclasses with register.
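The two mechanisms can be compared side by side with a minimal sketch. Both ABCs below (RegisteredOnly and HasDecode) are made up for illustration and are not part of the standard library: the first relies on explicit registration, the way ByteString does, and the second opts in to a method-based check through __subclasshook__.

```python
from abc import ABC


class RegisteredOnly(ABC):
    """Hypothetical ABC: membership comes only from explicit registration."""


# Explicit registration, analogous to how bytes and bytearray
# are tied to ByteString.
RegisteredOnly.register(bytes)
RegisteredOnly.register(bytearray)


class HasDecode(ABC):
    """Hypothetical ABC: any class that defines decode() counts as a subclass."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is HasDecode:
            # Look for a decode attribute anywhere in the class hierarchy.
            if any("decode" in B.__dict__ for B in C.__mro__):
                return True
        # Fall back to the normal subclass and registry lookup.
        return NotImplemented


print(issubclass(bytes, RegisteredOnly))  # True: registered above
print(issubclass(str, RegisteredOnly))    # False: never registered
print(issubclass(bytes, HasDecode))       # True: bytes defines decode()
print(issubclass(str, HasDecode))         # False: str has no decode()
```

Without the hook, the presence or absence of methods plays no part in the result, which is why str is rejected by ByteString no matter how much of its interface overlaps with bytes.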