Tags: python, python-2.7, generator, yield

yielding islice from reading file


I'm hoping someone can help me understand the following. I'm writing a small program to read a CSV file in chunks of K lines. I've seen the other Stack Overflow questions about this, and that's not what I'm asking here. I'm trying to understand why one program terminates and the other never does.

This code never terminates:

from __future__ import print_function
from itertools import islice
import time
import csv
def gen_csv1(input_file, chunk_size=50):
    try:
        with open(input_file) as in_file:
            csv_reader = csv.reader(in_file)
            while True:
                yield islice(csv_reader, chunk_size)
    except StopIteration:
        pass

gen1 = gen_csv1('./test100.csv')

for chunk in gen1:
    print(list(chunk))
    time.sleep(1)

This version, by contrast, works fine. The only difference is that islice is applied outside the generator instead of inside the yield.

def gen_csv(input_file):
    try: 
        with open(input_file) as in_file:
            csv_reader = csv.reader(in_file)
            while True:
                yield next(csv_reader)
    except StopIteration:
        pass


gen = gen_csv('./test100.csv')
for chunk in gen:
    rows = islice(gen, 50)
    print(list(rows))
    time.sleep(1)

I'm stumped. Any guidance is hugely appreciated. This is more out of curiosity than for work reasons.


Solution

  • Per the docs,

    [islice] works like a slice() on a list but returns an iterator.

    When you slice an empty list, an empty list is returned:

    In [118]: [][:3]
    Out[118]: []
    

    Similarly, when you islice an empty iterator, an empty iterator is returned. In contrast, calling next on an empty iterator raises StopIteration:

    In [98]: from itertools import islice
    In [114]: reader = iter([])
    
    In [115]: list(islice(reader, 3))
    Out[115]: []
    
    In [116]: next(reader)
    StopIteration: 
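
To see how this plays out inside a `while True` loop, here is a small sketch. The name `endless_chunks` is a hypothetical stand-in for `gen_csv1` that reads from any iterable instead of a CSV file, and the outer `islice` is only there to cap the demonstration at five chunks so that it stops.

```python
from itertools import islice

def endless_chunks(iterable, chunk_size):
    # Hypothetical stand-in for gen_csv1: same loop, but over any iterable.
    it = iter(iterable)
    while True:
        # islice on an exhausted iterator is just empty; nothing is raised here.
        yield islice(it, chunk_size)

# Cap the otherwise endless generator at 5 chunks for the demonstration.
first_five = [list(chunk) for chunk in islice(endless_chunks([1, 2, 3, 4], 2), 5)]
print(first_five)  # [[1, 2], [3, 4], [], [], []]
```

After the source runs out, every further chunk is simply empty, and the generator keeps producing them.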
    

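    Since islice never raises a StopIteration exception, the except clause in the first version is never triggered: once csv_reader is exhausted, the while True loop keeps yielding empty islice objects, so the code never terminates. In the second version, next(csv_reader) raises StopIteration at the end of the file, which ends the generator.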
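
One possible way to make the chunked version terminate is to materialize each chunk and stop when it comes back empty. This is a minimal sketch, assuming it is acceptable to hold one chunk in memory as a list; the name `gen_csv_chunks` is hypothetical and not part of the original code.

```python
from itertools import islice
import csv

def gen_csv_chunks(input_file, chunk_size=50):
    # Hypothetical variant of gen_csv1 that yields lists of rows.
    with open(input_file) as in_file:
        csv_reader = csv.reader(in_file)
        while True:
            # Consume the slice immediately so an empty chunk can be detected.
            chunk = list(islice(csv_reader, chunk_size))
            if not chunk:
                return  # reader exhausted: end the generator
            yield chunk
```

It can be used the same way as before, for example `for chunk in gen_csv_chunks('./test100.csv'): print(chunk)`. Each chunk is already a list, and the last one may be shorter than `chunk_size`.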