
How to set __len__ method for an itertools.chain object?


Let's say I'm building an itertools.chain instance as follows:

from itertools import chain

list_1 = list(range(5, 15))
list_2 = list(range(20, 30))
chained = chain(list_1, list_2)

Since I already know the lengths of the lists contained in chained, I can easily compute the length of chained. How can I add a __len__ method to chained?

I tried this:

full_len = len(list_1) + len(list_2)
setattr(chained, '__len__', lambda: full_len)

but it fails with the error

AttributeError: 'itertools.chain' object has no attribute '__len__'

Edit: I need this to display the progress of a long process with tqdm, which relies on the __len__ method to show the progress bar.


Solution

  • You could subclass itertools.chain and override __new__. Taking your example, we could write:

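    import itertools

    class Chain(itertools.chain):
        def __new__(cls, *args):
            obj = super().__new__(cls, *args)
            # Keep the input iterables so their lengths can be summed later.
            obj.args = args
            return obj

        def __len__(self) -> int:
            # Requires every input to support len().
            return sum(map(len, self.args))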
    
    >>> chained = Chain([1], [2, 3])
    >>> len(chained)
    3
    

    Note, however, that reporting a length for this object is somewhat awkward: a chain is an iterator, so it can only be consumed once and does not store its items. len(chained) keeps returning the original total even after the chain has been partially or fully consumed.

    What you probably want is a simple helper that chains the inputs but returns a list, which supports both repeated iteration and len.

    def chain_list(*args):
        return list(itertools.chain(*args))
    

    That might become pretty expensive depending on the iterables provided (say, range(1, 1000000000)). In that case, you should probably define your own class that implements methods such as __iter__ and __len__, potentially using itertools.chain under the hood but not subclassing it directly.
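
As a rough illustration of that last suggestion, here is a minimal sketch of such a class. The name SizedChain is hypothetical, and the sketch assumes that every input supports len() and can be iterated more than once (lists, tuples, ranges); it would not work with generators.

```python
from itertools import chain


class SizedChain:
    """Hypothetical helper: a sized, re-iterable view over several iterables."""

    def __init__(self, *iterables):
        # Keep references to the inputs instead of copying their items.
        self._iterables = iterables

    def __iter__(self):
        # Build a fresh chain on every call so the object can be looped over again.
        return chain(*self._iterables)

    def __len__(self):
        # Assumes every input supports len().
        return sum(len(it) for it in self._iterables)


chained = SizedChain(range(5, 15), range(20, 30))
print(len(chained))              # 20
print(list(chained)[:3])         # [5, 6, 7]
print(sum(1 for _ in chained))   # 20, a second pass still sees every item
```

Because nothing is copied, wrapping something like range(1, 1000000000) stays cheap, and since the object defines __len__, it should suit tools such as tqdm that call len() on the iterable they are given.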