python, python-2.7, csv, gpx

How to clean up .gpx data before writing to a .csv file in Python


I am trying to extract specific data from a .gpx file. The data I need is in the 'trkpt' and 'ele' elements, which hold the location (latitude and longitude) and the elevation. My code below finds the right lines, but the output is messy: it writes the whole XML tag, and I only need the numerical values.

gpx_list = []
gpx = open('G:\\14022705.gpx', 'r')  
for line in gpx:
    info = line.split(',')
    if 'trkpt ' in line:
        gpx_list.append(info)
        print line
    if 'ele' in line:
        gpx_list.append(info)
        print line

gpx_list_out = open('G:\\Position_Data2.csv', 'w')  
for line in gpx_list:
    gpx_list_out.write(line[0])

gpx_list_out.close()

Output Example:

['<trkpt lat="-42.6150634" lon="+147.4397831">']
['<ele>1.431</ele>']

Instead, I would like each track point written on a single line, like this: -42.6150634, +147.4397831, 1.431

Any tips on achieving this would be appreciated. I have spent hours trying different bits of code but have failed to get the desired outcome!


Solution

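  • Try incorporating this into your code. The regular expression extracts every number (optionally signed, with or without a decimal point) from each matching line, so the surrounding XML tags and attribute names are dropped.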

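    import re

    # Matches signed decimals (e.g. -42.6150634) or signed integers.
    number = re.compile(r"[-+]?\d*\.\d+|[-+]?\d+")

    gpx = open('G:\\14022705.gpx', 'r')
    gpx_list_out = open('G:\\Position_Data2.csv', 'w')

    row = []
    for line in gpx:
        if 'trkpt ' in line:
            # lat and lon start a new row
            row = number.findall(line)
        elif '<ele>' in line:
            # the elevation completes the row: lat,lon,ele
            row.extend(number.findall(line))
            gpx_list_out.write(",".join(row) + "\n")

    gpx.close()
    gpx_list_out.close()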
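
    As an alternative to matching numbers line by line, a .gpx file is XML, so it can also be read with the standard library's xml.etree.ElementTree. The following is only a rough sketch, not a drop-in replacement: the helper names local_name and gpx_to_rows are made up for illustration, the sample string stands in for the real file, and its second track point is invented sample data. It assumes each 'ele' element is a direct child of its 'trkpt' element.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    # Strip an XML namespace such as '{http://...}trkpt' down to 'trkpt'.
    return tag.rsplit('}', 1)[-1]


def gpx_to_rows(gpx_text):
    """Return one [lat, lon, ele] row of strings per track point."""
    rows = []
    root = ET.fromstring(gpx_text)
    for elem in root.iter():
        if local_name(elem.tag) != 'trkpt':
            continue
        ele = ''  # stays empty if the track point has no <ele> child
        for child in elem:
            if local_name(child.tag) == 'ele':
                ele = (child.text or '').strip()
        rows.append([elem.get('lat'), elem.get('lon'), ele])
    return rows


# Stand-in for the file contents; the second point is invented sample data.
sample = """<gpx><trk><trkseg>
<trkpt lat="-42.6150634" lon="+147.4397831"><ele>1.431</ele></trkpt>
<trkpt lat="-42.6150700" lon="+147.4397900"><ele>1.502</ele></trkpt>
</trkseg></trk></gpx>"""

rows = gpx_to_rows(sample)
for row in rows:
    print(",".join(row))
```

    With a real file, the text could come from open(path).read(), and each joined row could be written to the output file followed by "\n" instead of being printed. Unlike the line-based approach, this does not depend on 'trkpt' and 'ele' sitting on separate lines.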