Search code examples
pythoniterator

Is the File-Object iterator "broken?"


According to the documentation:

Once an iterator’s __next__() method raises StopIteration, it must continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.

However, for file-objects:

>>> f = open('test.txt')
>>> list(f)
['a\n', 'b\n', 'c\n', '\n']
>>> next(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> f.seek(0)
0
>>> next(f)
'a\n'

Are file-object iterators broken? Is this just one of those things that can't be fixed because it would break too much existing code that relies one it?


Solution

  • I think this is, if anything, a docs bug on that paragraph, not a bug in io objects. (And io object’s aren’t the only thing—most trivially, a csv.reader wrapper around a file is just as restartable as a file.)

    If you just use an iterator as an iterator, once it raises it will keep on raising. But if you call methods outside of the iterator protocol, you’re not really using it as an iterator anymore, but as something more than an iterator. And in that case, it seems legal and even idiomatic for the object to be “refillable” if it makes sense. As long as it never refills itself while it’s quacking as an iterator, only when it’s quacking as some other type that goes beyond that.

    In a similar situation in C++, the language committee might well declare that this breaks substitutability and therefore the iterator becomes invalid as an iterator once you call such a method on it, even if the language can’t enforce that. Or come up with a whole new protocol for refillable iterators. (Of course C++ iterators aren’t quite the same thing as Python iterators, but hopefully you get what I mean.)

    But in Python, practicality beats purity. I’m pretty sure Guido intended this behavior from the start, and that an object is allowed to do this and still be considered an iterator, and the core devs continue to intend it, and it’s just that nobody has thought about how to write something sufficiently rigorous to explain it accurately because nobody has asked.

    If you ask by filing a docs bug, I’ll bet that this paragraph gets a footnote, rather than the io and other refillable iterator objects being reclassified as not actually iterators.