python python-2.7 python-3.x python-2.6 readlines

What does io.IOBase.readlines(hint) mean in Python?


From the docs:

readlines(hint=-1)
    Read and return a list of lines from the stream. 
    hint can be specified to control the number of lines read: 
      no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

What's the real meaning of hint?

In some environments:

python3 -c 'from io import StringIO;print(StringIO(u"hello\n"*10).readlines(6));import sys;print(sys.version_info[0:3])'
['hello\n', 'hello\n']
(3, 3, 0)

python -c 'from io import StringIO;print(StringIO(u"hello\n"*10).readlines(6));import sys;print(sys.version_info[0:3])'
[u'hello\n', u'hello\n']
(2, 7, 2)

python -c 'from io import StringIO;print(StringIO(u"hello\n"*10).readlines(6));import sys;print(sys.version_info[0:3])'
[u'hello\n']
(2, 6, 6)

Why does it return more than 6 characters?

Someone said that it depends on the buffer size.

But on my machine, I cannot unbuffer text I/O:

>>> import sys
>>> sys.version
'3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 01:25:11) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]'
>>> open('/etc/hosts','r',3).readlines(3)
['##\n', '# Host Database\n']
>>> open('/etc/hosts','r',0).readlines(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: can't have unbuffered text I/O
>>> 
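
A minimal sketch of how the buffer-size theory could still be tested, assuming Python 3: text mode rejects buffering=0, but binary mode accepts it and hands back a raw, unbuffered file object. The temporary file and its contents are made up to stand in for /etc/hosts, and the result of readlines(3) is printed rather than stated here, since it is exactly what is in question.

```python
import os
import tempfile

# Throwaway file standing in for /etc/hosts (contents are made up).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"##\n# Host Database\n#\n")

# Text mode refuses buffering=0 (Python 3 behaviour) ...
try:
    open(path, "r", 0)
    text_error = None
except ValueError as exc:
    text_error = str(exc)

# ... but binary mode accepts it and returns a raw, unbuffered file object.
with open(path, "rb", 0) as raw:
    lines = raw.readlines(3)

os.remove(path)
print(text_error)
print(lines)
```

If the unbuffered binary read also returns more than 3 bytes, the buffer size is unlikely to be the cause.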

Or is it a bug in this method?


Update (2013/02/25):

I checked the source (from Python 2.6/2.7/3.x), but I cannot explain this:

def readlines(self, hint=None):
    """Return a list of lines from the stream.

    hint can be specified to control the number of lines read: no more
    lines will be read if the total size (in bytes/characters) of all
    lines so far exceeds hint.
    """
    if hint is None or hint <= 0:
        return list(self)
    n = 0
    lines = []
    for line in self:
        lines.append(line)
        n += len(line)
        if n >= hint:
            break
    return lines
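
For what it's worth, this pure-Python loop can be traced directly. The sketch below copies it into a standalone function (`readlines_py` is a name made up here, and a plain list of lines stands in for the stream) to show where it stops for a hint of 6.

```python
def readlines_py(stream, hint=None):
    """Standalone copy of the pure-Python loop quoted above (hypothetical helper)."""
    if hint is None or hint <= 0:
        return list(stream)
    n = 0
    lines = []
    for line in stream:
        lines.append(line)
        n += len(line)
        if n >= hint:  # inclusive test: stops as soon as the total reaches hint
            break
    return lines


data = ["hello\n"] * 10  # same data as the one-liners, as a list of lines
result = readlines_py(data, 6)
print(result)  # ['hello\n']
```

This agrees with the 2.6.6 output, where the io module was still implemented in Python. It does not account for the two-line result on 2.7 and 3.3, where io is implemented in C.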

Solution

  • I found a difference between StringIO and BytesIO (but I have no idea why):

    First, check this (Python 2.7/3.3):

    Python 2.7.2 (default, Jun 20 2012, 16:23:33) 
    [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from io import BytesIO,StringIO
    >>> print(BytesIO(b'hello\n'*10).readlines(6))
    ['hello\n']
    >>> print(StringIO(u'hello\n'*10).readlines(6))
    [u'hello\n', u'hello\n']
    >>> 
    

    The C source code used by StringIO and BytesIO is linked here:

    Modules/_io/iobase.c#l591

    while (1) {
        PyObject *line = PyIter_Next(self);
        if (line == NULL) {
            if (PyErr_Occurred()) {
                Py_DECREF(result);
                return NULL;
            }
            else
                break; /* StopIteration raised */
        }

        if (PyList_Append(result, line) < 0) {
            Py_DECREF(line);
            Py_DECREF(result);
            return NULL;
        }
        length += PyObject_Size(line);
        Py_DECREF(line);

        if (length > hint)
            break;
    }
    

    Modules/_io/bytesio.c#l380

    while ((n = get_line(self, &output)) != 0) {
        line = PyBytes_FromStringAndSize(output, n);
        if (!line)
            goto on_error;
        if (PyList_Append(result, line) == -1) {
            Py_DECREF(line);
            goto on_error;
        }
        Py_DECREF(line);
        size += n;
        if (maxsize > 0 && size >= maxsize)
            break;
    }
    return result;
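
One possible reading of the two listings is that the loops simply use different stopping tests: the iobase.c loop breaks when `length > hint`, while the bytesio.c loop breaks when `size >= maxsize`. With six-character lines and a hint of 6, the strict test cannot fire until a second line has been added. The sketch below models only those two comparisons in Python; `read_until` is a made-up helper, not part of the io module, and it assumes StringIO goes through the iobase.c loop as the link above suggests.

```python
import operator


def read_until(lines, hint, reached):
    """Collect lines until reached(total, hint) is true (hypothetical helper)."""
    out, total = [], 0
    for line in lines:
        out.append(line)
        total += len(line)
        if reached(total, hint):
            break
    return out


data = ["hello\n"] * 10

strict = read_until(data, 6, operator.gt)     # iobase.c style: length > hint
inclusive = read_until(data, 6, operator.ge)  # bytesio.c style: size >= maxsize

print(len(strict), len(inclusive))  # 2 1
```

The strict comparison matches the documented wording ("exceeds hint") and the two-line StringIO result; the inclusive one matches the one-line BytesIO result.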