Hoping that someone can help me understand the following. I'm writing a small program to read a CSV file in chunks of K lines. I've seen the other Stack Overflow questions about this, and that's not what I'm asking here. I'm trying to understand why one program terminates and the other never does.
This code never terminates:
from __future__ import print_function
from itertools import islice
import time
import csv

def gen_csv1(input_file, chunk_size=50):
    try:
        with open(input_file) as in_file:
            csv_reader = csv.reader(in_file)
            while True:
                yield islice(csv_reader, chunk_size)
    except StopIteration:
        pass

gen1 = gen_csv1('./test100.csv')
for chunk in gen1:
    print(list(chunk))
    time.sleep(1)
This version, on the other hand, works fine. The only difference is that islice is applied outside the generator rather than in the yield:
def gen_csv(input_file):
    try:
        with open(input_file) as in_file:
            csv_reader = csv.reader(in_file)
            while True:
                yield next(csv_reader)
    except StopIteration:
        pass

gen = gen_csv('./test100.csv')
for chunk in gen:
    rows = islice(gen, 50)
    print(list(rows))
    time.sleep(1)
I'm stumped. Any guidance is hugely appreciated. This is more out of curiosity than for work reasons.
Per the docs,
[islice] works like a slice() on a list but returns an iterator.
When you slice an empty list, an empty list is returned:
In [118]: [][:3]
Out[118]: []
Similarly, when you islice an empty iterator, an empty iterator is returned.
In contrast, calling next on an empty iterator raises StopIteration:
In [98]: from itertools import islice
In [114]: reader = iter([])
In [115]: list(islice(reader, 3))
Out[115]: []
In [116]: next(reader)
StopIteration:
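Calling islice on an exhausted reader never raises a StopIteration exception; it simply returns another empty iterator. So in the first version of the code the except clause is never reached, the while True loop keeps yielding empty chunks forever, and the program never terminates.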
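One way to make the chunked generator stop, sketched here under a few assumptions: it targets Python 3, the name gen_csv_chunks is made up for this example, and it takes an already open file object instead of a path so that it can be tried on in-memory data. The idea is to turn each islice into a list inside the generator and return as soon as that list comes back empty.

```python
import csv
import io
from itertools import islice


def gen_csv_chunks(in_file, chunk_size=50):
    """Yield lists of at most chunk_size rows from an open CSV file object."""
    csv_reader = csv.reader(in_file)
    while True:
        # Materialize the slice so we can tell whether any rows were left.
        chunk = list(islice(csv_reader, chunk_size))
        if not chunk:
            # An empty list means the reader is exhausted, so stop here.
            return
        yield chunk


# Hypothetical in-memory data standing in for ./test100.csv.
sample = io.StringIO("".join("{},{}\n".format(i, i * i) for i in range(7)))
chunks = list(gen_csv_chunks(sample, chunk_size=3))
print([len(chunk) for chunk in chunks])  # [3, 3, 1]
```

The trade-off is that each chunk is held in memory as a list rather than handed out as a lazy iterator, which is usually acceptable when chunk_size is small.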