
Iterable and iterator


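import csv

# newline='' is what the csv module docs recommend for files given to csv.reader
with open("weather_data.csv", 'r', newline='') as data_file:
    data = csv.reader(data_file)
    for x in data:
        print(x)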

My understanding is that csv.reader(data_file) is an iterable: the for loop calls iter() on it and gets back an iterator, call it _i, and then next() is called on _i at the start of each iteration. However, I ran print(help(csv.reader(data_file))) and found this:

 Methods defined here:
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __next__(self, /)
 |      Implement next(self).

My question is: is the __next__(self, /) method shown here exactly what gets called on _i every time? Does _i also carry the data?


Solution

  • The csv.reader object is its own iterator. This is a common practice for iterables which are single-pass (i.e. can only be run through once). We can confirm this by inspection.

    >>> data
    <_csv.reader object at 0x7fe5d4a057b0>
    >>> iter(data)
    <_csv.reader object at 0x7fe5d4a057b0> # Note: Same as above
    >>> id(data)
    140625091516336
    >>> id(iter(data))
    140625091516336 # Note: Same as above
    >>> data is iter(data)
    True
    

    Compare this to something like a list, which is an iterable but is not itself an iterator.

    >>> lst = [1, 2, 3]
    >>> lst
    [1, 2, 3]
    >>> iter(lst)
    <list_iterator object at 0x7fe5d59747f0> # Note: NOT the same as before
    >>> lst is iter(lst)
    False
    

    This allows us to iterate over a list several times by calling iter(lst) multiple times, since each call gives us a fresh iterator. But your csv.reader object is single-pass, so there is only ever one iterator for it: the reader itself.
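
The single-pass behaviour can be checked with a small sketch. It assumes an in-memory io.StringIO in place of weather_data.csv, and the sample rows are made up for illustration.

```python
import csv
import io

# io.StringIO stands in for an open file so no file on disk is needed.
# The rows below are invented sample data, not the asker's real file.
sample = io.StringIO("day,temp\nMon,12\nTue,14\n")
reader = csv.reader(sample)

first_pass = list(reader)    # consumes every row
second_pass = list(reader)   # the reader is already exhausted

print(first_pass)    # [['day', 'temp'], ['Mon', '12'], ['Tue', '14']]
print(second_pass)   # []

# A list hands out a fresh iterator on every iter() call,
# so it can be walked as many times as we like.
lst = [1, 2, 3]
print(list(lst), list(lst))  # [1, 2, 3] [1, 2, 3]
```

To read the rows a second time, you would need to rewind the underlying file (for example with seek(0)) and build a new reader, or store the rows in a list on the first pass.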

    In Python, every iterator is an iterable, but not every iterable is an iterator. From the Python glossary:

    Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted.