I am optimizing my script and ran into this problem:
I have a CSV file where the first column is just an index and the second column contains a string (a sentence of arbitrary length). I want to create two variables, "index" and "string", that contain all the indexes and all the strings respectively. This is my code:
with open(file_name, 'r', encoding="utf8") as csvfile:
    train_set_x = csv.reader(csvfile, delimiter=',', quotechar='|')
    index = [[c[0], c[1]] for c in train_set_x]
    text = [a[1] for a in index]
This does the job, but it takes two iterations. Is there a cleaner way to do it? Thank you.
There definitely is. Use zip with iterable unpacking.
index, text = zip(*((c[0], c[1]) for c in train_set_x))
MCVE:
In [152]: x, y = zip(*[(1, 2), (3, 4), (5, 6)])
In [153]: x
Out[153]: (1, 3, 5)
In [154]: y
Out[154]: (2, 4, 6)
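For completeness, here is a minimal end-to-end sketch of the same idea applied to csv.reader. The sample data, the in-memory io.StringIO "file", and the helper name split_columns are assumptions made for illustration and are not part of the original question. The helper also guards against an empty input, since unpacking zip(*...) into two names raises ValueError when there are no rows.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
# The second row uses the '|' quote character to protect an embedded comma.
sample = io.StringIO(
    "0,first sentence\n"
    "1,|second, with a comma|\n"
    "2,third sentence\n"
)


def split_columns(rows):
    """Split an iterable of rows into (first_column, second_column) tuples."""
    pairs = [(row[0], row[1]) for row in rows]
    if not pairs:
        # zip(*[]) yields nothing, so two-name unpacking would fail.
        return (), ()
    first, second = zip(*pairs)
    return first, second


reader = csv.reader(sample, delimiter=',', quotechar='|')
index, text = split_columns(reader)

print(index)
print(text)
```

Note that zip returns tuples and that csv.reader yields every field as a string, so wrap the results in list(...) or convert the index values with int(...) if that is what the rest of the script expects.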