Tags: python, list, csv, coordinate

Read CSV and Separate by column


Brand new to Python (and programming in general), if this is simple and/or answered somewhere I didn't find, feel free to harass me in typical forum fashion.

I've got a bunch of CSVs, each containing 10 XY coordinates like this:

10,5
2,4
5,6 
7,8
9,12
3,45
2,4
6,5
0,3 
5,6 

I'm looking to separate the X coordinates and Y coordinates into two separate lists, so that I can subtract a value from each value in a given list. For example, subtracting 5 from every value in the X coordinate list and 3 from every value in the Y coordinate list. I'm then going to take the abs() of each value and find the minimum. Once those minimums are found, I want to add the lists together so that each value is added to its counterpart.

For example, if the absolute values of X were something like

4
5
....

and Y something like

6
7
....

I'd want to add 4 and 6, then 5 and 7, etc.

To separate them, I tried

import csv
filein = open("/path/here")
reader = csv.reader(filein, skipinitialspace = True)
listofxys = []
for row in reader:
    listofxys.append(row)

Xs = listofxys.pop(0) # to pop all the X's

Ys = listofxys.pop() # to pop all the Y's

But instead of all the leading values, it provides the first XY pair. What am I doing wrong here?

The eventual goal is to find the closest point to an XY coordinate, so if this is a bad way to go about it, feel free to steer me in another direction.

Thanks in advance!


Solution

  • It's worth noting that you should try to use the with statement when opening files in Python. This is both more readable and removes the possibility of a file being left unclosed (even when exceptions occur).

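    Your actual problem is that listofxys is a list of rows, not columns: pop(0) removes and returns the first row (the first XY pair), and pop() removes and returns the last row. Consider the loop that builds it: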

    reader = csv.reader(filein, skipinitialspace = True)
    listofxys = []
    for row in reader:
        listofxys.append(row)
    

    All this does is the equivalent of listofxys = list(csv.reader(filein, skipinitialspace = True)), written in a more roundabout way.
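
To see why pop(0) hands back a pair rather than a column, here is a small sketch with a hard-coded list standing in for the file. The values are the first three rows from the question, and the name rows is only for illustration. Each element of the list is one row, so popping from either end removes a whole row.

```python
# Each element is one CSV row, i.e. one [x, y] pair (values are still strings).
rows = [["10", "5"], ["2", "4"], ["5", "6"]]

first_row = rows.pop(0)   # removes and returns the first ROW
last_row = rows.pop()     # removes and returns the last ROW

print(first_row)  # ['10', '5']
print(last_row)   # ['5', '6']
print(rows)       # [['2', '4']]
```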

    What you want to do is use the zip() builtin to take a list of pairs and turn it into two sequences, one of X values and one of Y values. You do this by unpacking the rows with the star operator:

    import csv
    
    with open("test") as filein:
        reader = csv.reader(filein, skipinitialspace = True)
        xs, ys = zip(*reader)
    
    print(xs)
    print(ys)
    

    Which gives:

    ('10', '2', '5', '7', '9', '3', '2', '6', '0', '5')
    ('5', '4', '6', '8', '12', '45', '4', '5', '3', '6')
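
If the star operator is unfamiliar: zip(*rows) passes each row to zip() as a separate argument, and zip() then groups the first items together, the second items together, and so on. A minimal sketch with three of the rows hard-coded (the name rows is only for illustration):

```python
rows = [["10", "5"], ["2", "4"], ["5", "6"]]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2]).
xs, ys = zip(*rows)
same_xs, same_ys = zip(rows[0], rows[1], rows[2])

print(xs)  # ('10', '2', '5')
print(ys)  # ('5', '4', '6')
```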
    

    Note that the values in xs and ys are strings. If you want numbers, you can use csv.QUOTE_NONNUMERIC, which tells the reader to convert every unquoted field to a float, e.g.: reader = csv.reader(filein, quoting=csv.QUOTE_NONNUMERIC, skipinitialspace = True)

    Which gives:

    (10.0, 2.0, 5.0, 7.0, 9.0, 3.0, 2.0, 6.0, 0.0, 5.0)
    (5.0, 4.0, 6.0, 8.0, 12.0, 45.0, 4.0, 5.0, 3.0, 6.0)
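
For the eventual goal in the question (finding the point closest to a given XY coordinate), the steps described there could be sketched roughly as follows. This assumes a target of (5, 3), taken from the question's subtraction example, uses io.StringIO with the sample data in place of a real file, and converts with float() explicitly instead of csv.QUOTE_NONNUMERIC. The helper name closest_point is made up for illustration. Adding the two absolute differences gives the Manhattan distance; if straight-line distance is what you are after, math.hypot(x - tx, y - ty) could be used as the key instead.

```python
import csv
import io

# Stand-in for the real file: the ten sample rows from the question.
SAMPLE = "10,5\n2,4\n5,6 \n7,8\n9,12\n3,45\n2,4\n6,5\n0,3 \n5,6 \n"


def closest_point(lines, tx, ty):
    """Return the (x, y) row minimising |x - tx| + |y - ty| (hypothetical helper)."""
    reader = csv.reader(lines, skipinitialspace=True)
    # float() tolerates the trailing spaces present in some rows.
    points = [(float(x), float(y)) for x, y in reader]
    return min(points, key=lambda p: abs(p[0] - tx) + abs(p[1] - ty))


print(closest_point(io.StringIO(SAMPLE), 5, 3))  # (5.0, 6.0)
```

With this sample, (5, 6) and (6, 5) are both at distance 3 from the target; min() returns the first one it encounters.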