
Is readlines() guaranteed to read from the current position rather than the beginning of the file (in all Python implementations)?


Consider:

with open('test.txt', 'w') as f:
    for i in range(5):
        f.write("Line {}\n".format(i))

with open('test.txt', 'r') as f:
    f.readline()
    for line in f.readlines():
        print(line.strip())

This outputs

Line 1
Line 2
Line 3
Line 4

That is, f appears to keep track of an internal position: f.readline() consumes the first line, and f.readlines() then reads all the remaining lines up to the end of the file. Is this behaviour expected and guaranteed from a language point of view?

The only information I found is from docs.python.org,

If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().

which I feel is ambiguous.
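
To make the ambiguity concrete, here is a minimal sketch that checks whether the two spellings mentioned in that quote, list(f) and f.readlines(), agree after one line has already been consumed. The temporary file and the names via_readlines and via_list are made up for illustration; this only shows what one interpreter does and is not evidence of a guarantee.

```python
import os
import tempfile

# Write a small sample file to a temporary location (illustrative data only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(5):
        tmp.write("Line {}\n".format(i))
    path = tmp.name

try:
    # Consume one line, then collect the rest with readlines().
    with open(path) as f:
        f.readline()
        via_readlines = f.readlines()

    # Consume one line, then collect the rest by iterating the file object.
    with open(path) as f:
        f.readline()
        via_list = list(f)
finally:
    os.remove(path)

print(via_readlines == via_list)
print(via_list[0].strip())
```

On the interpreter I would expect to run this on (CPython 3), both approaches return the same four remaining lines, starting at "Line 1".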


Solution

  • There are two reasons to believe that readlines() reading from the current position, rather than from the beginning of the file, is 'guaranteed':

    1. Per the docs, open() returns a file object, which, per the glossary, means something that implements the contract defined in the io module. The io module docs tell us that .readlines() will

      Read and return a list of lines from the stream.

      Note also that the term "stream position" is used frequently throughout the io docs. I suppose I have to admit that the docs don't 100% unambiguously and explicitly say that readlines() will start reading from the current stream position rather than from the beginning of the file (or the middle, or a random position, or a position that varies depending upon the day of the week). However, I think it's fair to say that - given that it's established in the io docs that streams have positions - any interpretation other than reading from the current stream position would be perverse, even if we didn't have any real-life implementations to look at.

    2. It's what CPython does, and CPython is widely understood to be Python's official reference interpreter (as noted in the docs at, for example, https://docs.python.org/devguide/#other-interpreter-implementations).

    Maybe that argument isn't quite as formal or rigorous as an equivalent argument based on the specs of, say, C, C++, or ECMAScript could be. If that troubles you, then too bad, because you're not going to find that level of formality in the Python world. Python's docs are its specification, but they're also documentation meant for ordinary developers working in the language, and as a consequence they don't define behaviour quite as pedantically as the formal standards of other languages tend to. When in doubt, interpret the docs in the most natural way and presume that Python will follow the principle of least astonishment; if that doesn't provide enough certainty, trust the CPython reference interpreter.
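
If you want to observe the "stream position" idea directly, the following is a minimal sketch using io.StringIO as a stand-in for a text file, together with the tell() and seek() methods from the io module. The sample text and variable names are assumptions chosen for illustration, and the sketch only demonstrates what the interpreter you run it on does; it doesn't prove anything about the spec.

```python
import io

# In-memory text stream standing in for a file opened in text mode.
stream = io.StringIO("Line 0\nLine 1\nLine 2\nLine 3\nLine 4\n")

first = stream.readline()        # consumes the first line
pos_after_first = stream.tell()  # the stream position has moved past it
rest = stream.readlines()        # reads from the current position to the end

at_eof = stream.readlines()      # nothing left once the position is at the end

stream.seek(0)                   # rewind to the start of the stream
everything = stream.readlines()  # now all five lines come back

print(first.strip())
print([line.strip() for line in rest])
print(at_eof)
print(len(everything))
```

The point of the sketch is that readlines() returns different results depending on where the stream position is: the remaining lines after a readline(), an empty list at the end of the stream, and every line again after seek(0).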