Calculating distance in pandas data frame giving error


I have a data set that contains the lat/long of two points in four columns, and I am trying to calculate the distance between them in a newly added column using geopy.distance.

It works fine when I calculate the distance for a single pair of values, but it does not work for the whole column.

import pandas as pd
from geopy import distance

sub_set = main[['Site_1','Site_Longitude_1','Site_Latitude_1','Site_2','Site_Longitude_2','Site_Latitude_2']]

lat1 = sub_set['Site_Latitude_1']
lat2 = sub_set['Site_Latitude_2']
long1 = sub_set['Site_Longitude_1']
long2 = sub_set['Site_Longitude_2']

The data frame sub_set looks like this:

  Site_1 Site_Longitude_1 Site_Latitude_1 Site_2 Site_Longitude_2 Site_Latitude_2
0      A      -118.645167       34.237917     A2     -118.6499422     34.24973484
1      A      -118.645167       34.237917     A2     -118.6499422     34.24973484
2      B      -118.626659       34.224762     A2     -118.6499422     34.24973484
3      B      -118.626659       34.224762     A2     -118.6499422     34.24973484
4      B      -118.626659       34.224762     A2     -118.6499422     34.24973484

On executing,

sub_set['Distance'] = distance.distance((lat1,long1),(lat2,long2)).miles

the following error message is thrown,

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
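
This message is what pandas raises whenever a whole Series ends up where Python needs a single True or False. A minimal sketch of how that can happen, assuming the distance code validates a coordinate with an ordinary scalar comparison (check_latitude below is a hypothetical stand-in written for illustration, not geopy's actual code):

```python
import pandas as pd


def check_latitude(lat):
    # Hypothetical stand-in for a scalar validity check inside a distance library.
    if abs(lat) > 90:
        raise ValueError("Latitude must be in the [-90, 90] range.")
    return lat


lat1 = pd.Series([34.237917, 34.224762])

# A single float: the comparison yields one bool, so the `if` works.
single = check_latitude(lat1.iloc[0])

# A whole Series: abs(lat) > 90 is a Series of bools, and `if` cannot
# reduce it to one True/False, so pandas raises ValueError.
message = None
try:
    check_latitude(lat1)
except ValueError as err:
    message = str(err)
    print(message)
```

So the function has to be called once per row with scalar coordinates, rather than once with entire columns.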

Solution

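    • The following gives you the row-wise calculation you want: apply with axis=1 calls the function once per row, so distance.distance receives scalar coordinates.
    • Creating sub_set is not required; the calculation can run directly on the full data frame (df below).
    • This is a long line, but it selects the required columns by name, so it does not depend on the column order in df. Inside the lambda, x.iloc[] indexes the selected columns by position.
    df['Distance'] = df[['Site_Latitude_1', 'Site_Longitude_1', 'Site_Latitude_2', 'Site_Longitude_2']].apply(lambda x: distance.distance((x.iloc[0], x.iloc[1]), (x.iloc[2], x.iloc[3])).miles, axis=1)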
    

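    Shorter line of code

    • Just make certain each x.iloc[] position points at the correct column in df. geopy expects (latitude, longitude), so the positions below assume the column order shown in the question.
    df['Distance'] = df.apply(lambda x: distance.distance((x.iloc[2], x.iloc[1]), (x.iloc[5], x.iloc[4])).miles, axis=1)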
    

    Output:

      Site_1 Site_Longitude_1 Site_Latitude_1 Site_2 Site_Longitude_2 Site_Latitude_2  Distance
    0      A      -118.645167       34.237917     A2     -118.6499422     34.24973484  0.859202
    1      A      -118.645167       34.237917     A2     -118.6499422     34.24973484  0.859202
    2      B      -118.626659       34.224762     A2     -118.6499422     34.24973484  2.177003
    3      B      -118.626659       34.224762     A2     -118.6499422     34.24973484  2.177003
    4      B      -118.626659       34.224762     A2     -118.6499422     34.24973484  2.177003
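
For a large data frame, a row-wise apply can be slow. One alternative, sketched here under the assumption that a spherical-Earth approximation is acceptable, is to evaluate the haversine formula on whole columns with NumPy. This is a different technique from geopy's geodesic distance, so the results will differ slightly from the Distance column above. The names haversine_miles and EARTH_RADIUS_MILES are made up for this illustration.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles; accepts scalars or whole columns."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(a))


# Sample rows copied from the question's data frame.
df = pd.DataFrame({
    'Site_1': ['A', 'A', 'B', 'B', 'B'],
    'Site_Longitude_1': [-118.645167, -118.645167, -118.626659, -118.626659, -118.626659],
    'Site_Latitude_1': [34.237917, 34.237917, 34.224762, 34.224762, 34.224762],
    'Site_2': ['A2'] * 5,
    'Site_Longitude_2': [-118.6499422] * 5,
    'Site_Latitude_2': [34.24973484] * 5,
})

# One call on entire columns; no per-row Python function call is needed.
df['Distance_haversine'] = haversine_miles(
    df['Site_Latitude_1'], df['Site_Longitude_1'],
    df['Site_Latitude_2'], df['Site_Longitude_2'],
)
print(df[['Site_1', 'Site_2', 'Distance_haversine']])
```

If the geodesic (ellipsoidal) result is required, the apply-based versions with geopy remain the ones to use.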