
Calculating distance of latitude/longitude on pandas dataframe using GeoPy


I'm trying to calculate the distance between consecutive latitude/longitude points in a pandas DataFrame using GeoPy.

Here is my DataFrame:

    latitude    longitude   altitude
    -15.836310  -48.020298  1137.199951
    -15.836360  -48.020512  1136.400024
    -15.836415  -48.020582  1136.400024
    -15.836439  -48.020610  1136.400024
    -15.836488  -48.020628  1136.599976

I tried two different ways:

from geopy import distance

for i in range(1, len(df)):
   before = (df.loc[i-1, 'latitude'], df.loc[i-1, 'longitude'])
   actual = (df.loc[i, 'latitude'], df.loc[i, 'longitude'])
   df.loc[i, 'geodesic'] = distance.distance(before, actual).miles

Error:

 KeyError: 0

Apparently, df.loc[i, 'column_name'] does not work with this DataFrame: the label 0 is not found in its index.

And the second attempt:

from geopy import distance

df['geodesic'] = distance.distance((df.latitude.shift(1), df.longitude.shift(1)), (df.latitude, df.longitude)).miles

Error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

For reference, the example from the official GeoPy documentation:

from geopy import distance
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
print(distance.distance(newport_ri, cleveland_oh).miles)

Solution

  • I found the cause of the error.

    1 - I had to check whether latitude or longitude is NaN.

    2 - I could not use time as the index. (I don't know why; it took a long time to discover.)

    Once I had fixed both, the error was gone.