I have a generator function that yields specific columns from a CSV file as a list, and I append those lists to another list until a limit of n is reached. The problem is...

    LIMIT = 10

    def read_csv(filename):
        with open(filename, 'r') as infile:
            header = next(infile)
            for line in infile:
                # get column by header and append to mylist
                yield mylist

    new_list = []
    for dataset in read_csv('some.csv'):
        new_list.append(dataset)
        if len(new_list) == LIMIT:
            # call a func to create xml file with dataset
            # grab the remaining data
    else:
        new_list.append(dataset)
        # call a func to create xml file with dataset
        new_list = []

...this (ugly) for/else workaround. I've read about itertools.islice and itertools.takewhile. How would you write this task without using a for/else?

    for dataset in itertools.islice(read_csv('some.csv'), LIMIT):
        new_list.append(dataset)

I'm stuck here, because I have to find a way to capture islice's StopIteration and repeat it until read_csv() is done.
Any ideas?
A for-loop over islice won't raise StopIteration (the for statement handles it internally), so there is no need to worry about that. islice takes care of EOF as well: if read_csv runs out before LIMIT items are reached, it simply stops early. So, at the end of the loop you can call a func to create the xml file with the data. And instead of looping over islice, I'd suggest simply calling list() on it to get its data in a list.

    from itertools import islice

    data = read_csv('some.csv')
    new_list = list(islice(data, LIMIT))  # at most LIMIT items; fewer if the file ends first
    # call a func to create xml file with data
    # do something with remaining `data`

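To repeat that step until read_csv is exhausted, one option is to keep calling list(islice(...)) on the same iterator and stop as soon as it returns an empty list. The following is only a minimal sketch: the helper name `chunked` is hypothetical, and a plain `range` stands in for the rows coming from the CSV file.

```python
from itertools import islice

def chunked(iterable, size):
    """Yield lists of up to `size` items; the last list may be shorter."""
    iterator = iter(iterable)  # one shared iterator, so each islice resumes where the previous one stopped
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # islice produced nothing: the source is exhausted
            return
        yield chunk

# Stand-in data instead of read_csv('some.csv')
chunks = list(chunked(range(7), 3))
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```

With this shape, each chunk could be handed to the xml-writing function inside a plain for-loop, and no for/else is needed because the final, shorter chunk is yielded like any other.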
Or, if you want to break the data from read_csv into chunks of size LIMIT, you can use the grouper recipe from itertools:

    from itertools import zip_longest  # named izip_longest on Python 2

    def grouper(iterable, n, fillvalue=''):
        # n references to the same iterator, so each tuple takes n consecutive items
        args = [iter(iterable)] * n
        return zip_longest(*args, fillvalue=fillvalue)

    for dataset in grouper(read_csv('some.csv'), LIMIT):
        # call a func to create xml file with dataset
        pass

Note that if the number of items returned by read_csv is not an exact multiple of LIMIT, then the last dataset will be padded with the '' fill value.
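If the padding is unwanted, one possible variation is to pad with a unique sentinel object and filter it out of each group. This is a sketch rather than part of the itertools recipe: the names `grouper_no_fill` and `_FILL` are hypothetical, it assumes Python 3 (`zip_longest`), and a short string stands in for the CSV rows.

```python
from itertools import zip_longest  # named izip_longest on Python 2

_FILL = object()  # unique sentinel that cannot collide with real data such as ''

def grouper_no_fill(iterable, n):
    """Like grouper, but drop the padding from the final group."""
    args = [iter(iterable)] * n  # n references to the same iterator
    for group in zip_longest(*args, fillvalue=_FILL):
        # Identity comparison removes only the sentinel, never real items
        yield [item for item in group if item is not _FILL]

# Stand-in data instead of read_csv('some.csv')
groups = list(grouper_no_fill('abcdefg', 3))
print(groups)  # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g']]
```

Using a sentinel instead of '' keeps rows that legitimately contain empty strings intact.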