Tags: python, list, object, casting, duck-typing

How exactly does Python convert an arbitrary object into a list?


The documentation says:

The constructor builds a list whose items are the same and in the same order as iterable's items. iterable may be either a sequence, a container that supports iteration, or an iterator object. If iterable is already a list, a copy is made and returned, similar to iterable[:]...
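
As a rough illustration of the cases that paragraph lists, here is a minimal sketch. The variable names are made up for this example, and it only exercises built-in types, not a custom class.

```python
# Already a list: list() returns a new list object with the same items.
original = [1, 2, 3]
copied = list(original)

# A sequence (here a str): items come out in index order.
from_sequence = list('abc')

# A container that supports iteration (here a dict): iterating yields its keys.
from_container = list({'x': 1})

# An iterator object: list() consumes it until it is exhausted.
from_iterator = list(iter((4, 5)))

print(copied, copied is original)
print(from_sequence, from_container, from_iterator)
```

Note that the copy is shallow: copied is a distinct list object, but it holds references to the same items as original.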

But if I have an object a of my class A that implements __iter__, __len__ and __getitem__, which of these interfaces does list(a) use to iterate over my object, and what is the logic behind that choice?

My quick experiment confuses me:

class A(object):
    def __iter__(self):
        # Plain method: the body runs as soon as iteration is requested.
        print('__iter__ was called')
        return iter([1, 2, 3])
    def __len__(self):
        print('__len__ was called')
        return 3
    def __getitem__(self, index):
        print('__getitem__(%i) was called' % index)
        return index + 1

a = A()
list(a)

Output:

__iter__ was called
__len__ was called
[1, 2, 3]

A.__iter__ was called first, OK. But why was A.__len__ called after that? And why was A.__getitem__ not called at all?

Then I turned __iter__ into a generator, and this changed the order of the magic method calls!

class B(object):
    def __iter__(self):
        # Generator function: the body runs only when the first item is requested.
        print('__iter__ was called')
        yield 1
        yield 2
        yield 3
    def __len__(self):
        print('__len__ was called')
        return 3
    def __getitem__(self, index):
        print('__getitem__(%i) was called' % index)
        return index + 1

b = B()
list(b)

Output:

__len__ was called
__iter__ was called
[1, 2, 3]

Why was B.__len__ called first this time? And why was B.__getitem__ still not called, with the conversion done through B.__iter__ instead?

What confuses me most is why the order of the __len__ and __iter__ calls differs between A and B.


Solution

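  • The call order didn't change. __iter__ is still called first, but when __iter__ is a generator function, calling it only creates a generator object; the function body does not run yet. The print happens only once next is called on that generator for the first time, which is after __len__ has already been called.

    The call to __len__ is an implementation detail. Python wants a hint for how much space to allocate for the list, so it calls _PyObject_LengthHint on your object, which uses len if the object supports it. Calling len on an object is expected to be fast and free of visible side effects.

    __getitem__ is not called in either case because both classes define __iter__. Python falls back to indexing through __getitem__ only for objects that have no __iter__.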
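
The points above can be checked without relying on print order. The following is a minimal sketch, assuming Python 3.4 or later, where operator.length_hint exposes the same "use len if available" logic at the Python level. The class names Eager, Lazy, Sized and IndexOnly and the calls list are hypothetical and exist only for this illustration.

```python
import operator

calls = []  # records which method bodies actually ran, in order


class Eager(object):
    """__iter__ is a plain method: its body runs as soon as iter() is called."""

    def __iter__(self):
        calls.append('Eager.__iter__')
        return iter([1, 2, 3])


class Lazy(object):
    """__iter__ is a generator function: its body is deferred until next()."""

    def __iter__(self):
        calls.append('Lazy.__iter__')
        yield 1
        yield 2
        yield 3


class Sized(object):
    """Defines only __len__, which length_hint() uses when it is available."""

    def __len__(self):
        return 3


class IndexOnly(object):
    """No __iter__: iteration falls back to __getitem__(0), (1), ... until IndexError."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index + 1


iter(Eager())
eager_calls = list(calls)             # the body ran during iter()

calls.clear()
gen = iter(Lazy())                    # Lazy.__iter__ is called, but only builds a generator
lazy_calls_before_next = list(calls)  # nothing recorded yet
first = next(gen)                     # now the body runs up to the first yield
lazy_calls_after_next = list(calls)

hint = operator.length_hint(Sized())  # falls through to Sized.__len__
from_index_only = list(IndexOnly())   # built through __getitem__ alone

print(eager_calls, lazy_calls_before_next, lazy_calls_after_next)
print(first, hint, from_index_only)
```

The sketch deliberately does not assert when list() itself asks for the length, since that is the implementation detail described above and may differ between interpreters.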