Search code examples
pythonalgorithmpandasgpx

How do I calculate speed from a gpx file if the speed tag itself is not given?


For example, I might have some information like this:

<trkpt lat="-33.8161780" lon="150.8710320">
 <ele>73.0</ele>
 <time>2017-07-08T22:05:45Z</time>
 <extensions>
  <power>0</power>
  <gpxtpx:TrackPointExtension>
    <gpxtpx:atemp>7</gpxtpx:atemp>
    <gpxtpx:hr>115</gpxtpx:hr>
    <gpxtpx:cad>27</gpxtpx:cad>
  </gpxtpx:TrackPointExtension>
 </extensions>
</trkpt>

How would I calculate speed from this info? I've used the python etree module to parse the file, and have all the info in a pandas database.

It was mentioned that I should probably instead show the pandas dataframe. It looks something like this:

                     longitude   latitude   ele   temp
time                
2017-07-08 22:05:45 150.8710320 -33.8161780 73.0    7
2017-07-08 22:05:46 150.8710350 -33.8161500 73.0    7
2017-07-08 22:05:47 150.8710440 -33.8161170 73.0    7
2017-07-08 22:05:48 150.8710540 -33.8160820 73.0    7
2017-07-08 22:05:49 150.8710690 -33.8160430 73.0    7

and so on.


Solution

  • speed equals distance / time. The longitude and latitude presumably represent locations on the surface of the Earth. If we accept a sphere of radius 6371 km as an approximation of the Earth, then we can easily translate the longitude and latitude into xyz-coordinates:

    r = 6371000 # meters
    df['theta'] = np.deg2rad(df['longitude'])
    df['phi'] = np.deg2rad(df['latitude'])
    df['x'] = r*np.cos(df['theta'])*np.sin(df['phi'])
    df['y'] = r*np.sin(df['theta'])*np.sin(df['phi'])
    df['z'] = r*np.cos(df['phi'])
    

    Now it is not hard to compute distances between consecutive points:

    df['x2'] = df['x'].shift()
    df['y2'] = df['y'].shift()
    df['z2'] = df['z'].shift()
    df['distance'] = np.sqrt((df['x2']-df['x'])**2 + (df['y2']-df['y'])**2 + (df['z2']-df['z'])**2)
    

    However, this is a chord length -- the straight-line distance between two points on the surface of the sphere. If the points are far apart, the chord would be burrowing through the surface of the Earth. Presumably the motion is on the surface of the Earth. So a more accurate calculation for distance would use the arclength:

    df['central angle'] = np.arccos((df['x']*df['x2'] + df['y']*df['y2'] + df['z']*df['z2'])/r**2)
    df['arclength'] = df['central angle']*r
    

    The central angle is using the dot product formula.

    Having computed the arclength (distance), we must now also compute the time interval between consecutive observations (i.e. rows of the DataFrame):

    df['time'] = (df.index.to_series().diff() / pd.Timedelta(seconds=1))
    

    So using speed = distance / time:

    df['speed'] = df['arclength'] / df['time']  # in meters/second
    

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame({'ele': [73.0, 73.0, 73.0, 73.0, 73.0], 'latitude': [-33.816178, -33.81615, -33.816117, -33.816082, -33.816043], 'longitude': [150.871032, 150.871035, 150.87104399999998, 150.87105400000002, 150.871069], 'temp': [7, 7, 7, 7, 7], 'time': ['2017-07-08 22:05:45', '2017-07-08 22:05:46', '2017-07-08 22:05:47', '2017-07-08 22:05:48', '2017-07-08 22:05:49']})
    df['time'] = pd.to_datetime(df['time'])
    df = df.set_index('time')
    columns = df.columns.tolist()
    
    r = 6371000 # radius of the Earth in meters
    df['theta'] = np.deg2rad(df['longitude'])
    df['phi'] = np.deg2rad(df['latitude'])
    df['x'] = r*np.cos(df['theta'])*np.sin(df['phi'])
    df['y'] = r*np.sin(df['theta'])*np.sin(df['phi'])
    df['z'] = r*np.cos(df['phi'])
    df['x2'] = df['x'].shift()
    df['y2'] = df['y'].shift()
    df['z2'] = df['z'].shift()
    df['distance'] = np.sqrt((df['x2']-df['x'])**2 + (df['y2']-df['y'])**2 + (df['z2']-df['z'])**2)
    
    df['central angle'] = np.arccos((df['x']*df['x2'] + df['y']*df['y2'] + df['z']*df['z2'])/r**2)
    df['arclength'] = df['central angle']*r
    
    df['time'] = (df.index.to_series().diff() / pd.Timedelta(seconds=1))
    df['speed'] = df['arclength'] / df['time']  # in meters/second
    df = df[columns + ['speed']]
    print(df)
    

    yields

                          ele   latitude   longitude  temp     speed
    time                                                            
    2017-07-08 22:05:45  73.0 -33.816178  150.871032     7       NaN
    2017-07-08 22:05:46  73.0 -33.816150  150.871035     7  3.119892
    2017-07-08 22:05:47  73.0 -33.816117  150.871044     7  3.712201
    2017-07-08 22:05:48  73.0 -33.816082  150.871054     7  3.940673
    2017-07-08 22:05:49  73.0 -33.816043  150.871069     7  4.433590
    

    If you comment out

    df = df[columns + ['speed']]
    

    and re-run the script, you'll see all the intermediate calculations. You'll notice that df['distance'] is quite close to df['arclength']. Since the points are not very far apart on the surface of the Earth, the chord length is a good approximation of the arclength. So for the data you posted

    df['speed'] = df['distance'] / df['time'] 
    

    would have worked just as well. However, the arclength calculation is a bit more robust since it will give a more accurate value if the points are far apart.