What's the advantage of using yield in __iter__()?


What is the advantage of using a generator (yield) inside an __iter__() method? After reading through the Python Cookbook, I understand that "If you want a generator to expose extra state to the user, don’t forget that you can easily implement it as a class, putting the generator function code in the __iter__() method."

import io

class playyield:
    def __init__(self,fp):
        self.completefp = fp

    def __iter__(self):
        for line in self.completefp:
            if 'python' in line:
                yield line

if __name__ == '__main__':
    with io.open(r'K:\Data\somefile.txt', 'r') as fp:
        playyieldobj = playyield(fp)
        for i in playyieldobj:
            print(i)

Questions:

  1. What does "extra state" mean here?
  2. What is the advantage of using yield inside __iter__() instead of using a separate generator function?

Solution

  • Without generator functions, following best practices means writing a separate iterator class, something like this:

    In [7]: class IterableContainer:
       ...:     def __init__(self, data=(1,2,3,4,5)):
       ...:         self.data = data
       ...:     def __iter__(self):
       ...:         return IterableContainerIterator(self.data)
       ...:
    
    In [8]: class IterableContainerIterator:
       ...:     def __init__(self, data):
       ...:         self.data = data
       ...:         self._pos = 0
       ...:     def __iter__(self):
       ...:         return self
       ...:     def __next__(self):
       ...:         try:
       ...:             item = self.data[self._pos]
       ...:         except IndexError:
       ...:             raise StopIteration
       ...:         self._pos += 1
       ...:         return item
       ...:
    
    In [9]: container = IterableContainer()
    
    In [10]: for x in container:
        ...:     print(x)
        ...:
    1
    2
    3
    4
    5
    

    Of course, the above example is contrived, but hopefully you get the point. With generators, this can simply be:

    In [11]: class IterableContainer:
        ...:     def __init__(self, data=(1,2,3,4,5)):
        ...:         self.data = data
        ...:     def __iter__(self):
        ...:         for x in self.data:
        ...:             yield x
        ...:
        ...:
    
    In [12]: list(IterableContainer())
    Out[12]: [1, 2, 3, 4, 5]
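
One practical consequence of the generator version, sketched here under the assumption that the underlying data can be iterated more than once (a tuple, as above): every call to `iter()` runs `__iter__` again and returns a fresh generator object, so separate passes over the same container do not interfere with each other. The names `first` and `second` are only for illustration.

```python
class IterableContainer:
    def __init__(self, data=(1, 2, 3, 4, 5)):
        self.data = data

    def __iter__(self):
        # Calling this method creates a new generator object each time.
        for x in self.data:
            yield x


container = IterableContainer()
first = iter(container)
second = iter(container)

print(next(first), next(first))  # advance the first iterator twice
print(next(second))              # the second iterator still starts at the beginning
print(list(container))           # a full pass is unaffected by the partial ones
```

The position of each pass lives inside its own generator object, which is the bookkeeping that the hand-written iterator class keeps in `_pos`.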
    

    As for state, it's exactly that: objects can have state, e.g. attributes, and you can read or manipulate that state at runtime. You could do something like the following, although I would say it is highly inadvisable:

    In [19]: class IterableContainerIterator:
        ...:     def __init__(self, data):
        ...:         self.data = data
        ...:         self._pos = 0
        ...:     def __iter__(self):
        ...:         return self
        ...:     def __next__(self):
        ...:         try:
        ...:             item = self.data[self._pos]
        ...:         except IndexError:
        ...:             raise StopIteration
        ...:         self._pos += 1
        ...:         return item
        ...:     def rewind(self):
        ...:         self._pos = 0  # go back to the first item
        ...:
    
    In [20]: class IterableContainer:
        ...:     def __init__(self, data=(1,2,3,4,5)):
        ...:         self.data = data
        ...:     def __iter__(self):
        ...:         return IterableContainerIterator(self.data)
        ...:
    
    In [21]: container = IterableContainer()
    
    In [22]: it = iter(container)
    
    In [23]: next(it)
    Out[23]: 1
    
    In [24]: next(it)
    Out[24]: 2
    
    In [25]: it.rewind()
    
    In [26]: next(it)
    Out[26]: 1
    
    In [27]: next(it)
    Out[27]: 2
    
    In [28]: next(it)
    Out[28]: 3
    
    In [29]: next(it)
    Out[29]: 4
    
    In [30]: next(it)
    Out[30]: 5
    
    In [31]: it.rewind()
    
    In [32]: next(it)
    Out[32]: 1
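
To tie this back to the Cookbook quote in the question, here is a minimal sketch of a class whose `__iter__` is a generator and which exposes extra state as ordinary attributes. The class name `MatchingLines` and the attributes `lines_seen` and `matches` are hypothetical, chosen only for illustration; the filter mirrors the `'python' in line` check from the question, and `io.StringIO` stands in for a real file.

```python
import io


class MatchingLines:
    """Iterate over lines containing a keyword, tracking progress as attributes."""

    def __init__(self, lines, keyword='python'):
        self.lines = lines
        self.keyword = keyword
        self.lines_seen = 0   # extra state: lines read so far
        self.matches = 0      # extra state: lines yielded so far

    def __iter__(self):
        for line in self.lines:
            self.lines_seen += 1
            if self.keyword in line:
                self.matches += 1
                yield line


# io.StringIO stands in for a real file so the sketch runs anywhere.
fp = io.StringIO("python rocks\nsomething else\nmore python\n")
matcher = MatchingLines(fp)
for line in matcher:
    # The caller can inspect the extra state while the loop is running.
    print(matcher.lines_seen, line, end='')

print(matcher.lines_seen, matcher.matches)
```

With a standalone generator function, the caller only ever sees the yielded values. Putting the same `yield` loop inside `__iter__` keeps the short generator syntax while giving the caller an object to query.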