Tags: python, csv, zip, iterable-unpacking

Efficiently unpack CSV columns into separate lists


While optimizing my script, I ran into this problem:

I have a CSV file whose first column is just an index and whose second column contains a string (a sentence of arbitrary length). I want to create two variables, "index" and "string", that hold all the indices and all the strings respectively. This is my code:

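import csv

with open(file_name, 'r', encoding="utf8") as csvfile:
    train_set_x = csv.reader(csvfile, delimiter=',', quotechar='|')
    # First pass: collect [index, string] pairs from every row.
    index = [[c[0], c[1]] for c in train_set_x]
    # Second pass: pull the strings back out of those pairs.
    text = [a[1] for a in index]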

This does the job, but it takes two iterations. Is there a cleaner way to do it? Thank you.


Solution

  • There definitely is. Use zip with iterable unpacking. Note that index and text come back as tuples, not lists.

    index, text = zip(*((c[0], c[1]) for c in train_set_x))
    

    MCVE:

    In [152]: x, y = zip(*[(1, 2), (3, 4), (5, 6)])
    
    In [153]: x
    Out[153]: (1, 3, 5)
    
    In [154]: y
    Out[154]: (2, 4, 6)
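
    For completeness, here is a minimal end-to-end sketch of the same idea applied to the CSV case. It is only an illustration under a few assumptions: the data is read from an in-memory string rather than the asker's file, and the names SAMPLE and read_columns are hypothetical. It also guards against an empty file, since zip(*[]) yields nothing and unpacking it into two names would raise ValueError, and it converts the tuples to lists in case lists are really needed.

```python
import csv
import io

# Hypothetical in-memory sample standing in for the asker's file.
SAMPLE = "0,first sentence\n1,second sentence\n2,third sentence\n"


def read_columns(csvfile):
    """Return (index, text) as two lists from a two-column CSV stream."""
    # Same reader settings as in the question.
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    rows = [(row[0], row[1]) for row in reader]
    if not rows:
        # Nothing to unpack: zip(*[]) is empty, so return two empty lists.
        return [], []
    # zip(*rows) transposes the rows into one tuple per column.
    index, text = zip(*rows)
    return list(index), list(text)


index, text = read_columns(io.StringIO(SAMPLE))
print(index)
print(text)
```

    With a real file, the same function could be called inside the with open(...) block from the question. The values in index stay strings, because csv.reader does not convert types.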