
Is there a Pythonic way to distinguish Sequence objects like tuple and list from Sequence objects like bytes and str?


I have a function like this one

def print_stuff(items):
    if isinstance(items, (str, bytes)):
        items = (items,)
    for item in items:
        print(item)

that can be called as follows:

In [37]: print_stuff(('a', 'b'))
a
b

In [38]: print_stuff('a')
a

I don't like writing isinstance(items, (str, bytes)). I would prefer to write isinstance(items, collections.abc.MAGIC),

where MAGIC is an ABC covering all the sequence types that can contain other sequence objects, such as

  • tuple
  • list
  • numpy.ndarray
  • some user defined vector class, etc

but not:

  • str
  • bytes
  • some user defined str class for UTF-16, etc

I am afraid this is impossible as tuple and str have the same 7 ABCs :(

In [49]: [v for k, v in vars(collections.abc).items()
    ...:                                   if inspect.isclass(v) and issubclass(tuple, v) ]
Out[49]:
[collections.abc.Hashable,
 collections.abc.Iterable,
 collections.abc.Reversible,
 collections.abc.Sized,
 collections.abc.Container,
 collections.abc.Collection,
 collections.abc.Sequence]

In [50]: [v for k, v in vars(collections.abc).items()
    ...:                                   if inspect.isclass(v) and issubclass(list, v) ]
Out[50]:
[collections.abc.Iterable,
 collections.abc.Reversible,
 collections.abc.Sized,
 collections.abc.Container,
 collections.abc.Collection,
 collections.abc.Sequence,
 collections.abc.MutableSequence]

In [51]: [v for k, v in vars(collections.abc).items()
    ...:                                   if inspect.isclass(v) and issubclass(str, v) ]
Out[51]:
[collections.abc.Hashable,
 collections.abc.Iterable,
 collections.abc.Reversible,
 collections.abc.Sized,
 collections.abc.Container,
 collections.abc.Collection,
 collections.abc.Sequence]

Solution

  • Good question.

    • There is (currently) no ABC that distinguishes a string from a tuple or other immutable sequence. Since Python 3 has just one string type, the most Pythonic solution is indeed a plain isinstance(x, str) check.
    • Byte sequence types such as bytes and bytearray can be distinguished using the collections.abc.ByteString ABC (note that this ABC is deprecated as of Python 3.12).

    Of course, you could also define your own ABC that registers both str and ByteString, or even give it a __subclasshook__ that checks classes for a string-specific method such as capitalize.
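
The custom-ABC idea can be sketched roughly as follows. This is only an illustration under stated assumptions: StringLike and as_items are hypothetical names (neither exists in the standard library), and using the presence of a capitalize method as the test for "string-like" is a heuristic, not an official protocol.

```python
from abc import ABC


class StringLike(ABC):
    """Hypothetical ABC for text- and byte-string types (not in the stdlib)."""

    @classmethod
    def __subclasshook__(cls, subclass):
        if cls is StringLike:
            # Heuristic (an assumption): any class that defines `capitalize`
            # somewhere in its MRO is treated as string-like.
            if any("capitalize" in vars(base) for base in subclass.__mro__):
                return True
        return NotImplemented


def as_items(items):
    """Wrap a lone string-like object in a tuple; return anything else unchanged."""
    if isinstance(items, StringLike):
        return (items,)
    return items


def print_stuff(items):
    for item in as_items(items):
        print(item)
```

With this in place, print_stuff('a') and print_stuff(('a', 'b')) behave as in the question, and a user-defined string class is picked up automatically as long as it defines capitalize. A class that lacks the method could still opt in explicitly with StringLike.register(SomeClass).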