python pandas geopy

Changing the geodesic datatype to a float


With the code below I build a distance matrix, and that part works. I use the geopy package's geodesic distance method to calculate the distances between coordinates that are stored in a pandas DataFrame.

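import numpy as np
import pandas as pd
from geopy.distance import geodesic

# RD1 is the DataFrame whose 'Eindlocatie_Coord' column holds the coordinates.
def get_distance(col):
    # Distance from every coordinate to the coordinate of the row named col.name
    end = RD1.loc[col.name, 'Eindlocatie_Coord']
    return RD1['Eindlocatie_Coord'].apply(geodesic, args=(end,), ellipsoid='WGS-84')

def get_totaldistance(matrix):
    # Square frame of zeros; only its index and columns are used to drive apply()
    square = pd.DataFrame(np.zeros((len(RD1), len(RD1))), index=RD1.index, columns=RD1.index)
    distances = square.apply(get_distance, axis=1).T
    # Sum the first superdiagonal, i.e. the distance from each row to the next one
    totaldist = np.diag(distances, k=1).sum()
    return totaldist

distances = get_totaldistance(RD1)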

However, the resulting distances are geodesic objects rather than numbers, and I want them as floats because that would make my further calculations easier.

I know that print(geodesic(newport_ri, cleveland_oh).miles) (an example from the geopy documentation) would return floats, but I'm not sure how to apply this to an entire pandas dataframe column.

So, how can I change my code such that floats are returned?


Solution

  • I added a nested helper function inside my function to convert the output, which was exactly what I was looking for. Here is the solution:

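    def get_distance(col):
        # Distance from every coordinate to the coordinate of the row named col.name
        end = RD1.loc[col.name, 'Eindlocatie_Coord']
        return RD1['Eindlocatie_Coord'].apply(geodesic, args=(end,), ellipsoid='WGS-84')

    def get_totaldistance(matrix):
        square = pd.DataFrame(np.zeros((len(RD1), len(RD1))), index=RD1.index, columns=RD1.index)
        distances = square.apply(get_distance, axis=1).T

        def units(input_instance):
            # Each cell is a geodesic object; .km is its length in kilometres as a float
            return input_instance.km

        # Convert every cell to a float.
        # Note: pandas 2.1+ deprecates applymap in favour of DataFrame.map.
        distances_km = distances.applymap(units)

        # Sum the first superdiagonal, i.e. the distance from each row to the next one
        totaldist = np.diag(distances_km, k=1).sum()
        return totaldist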
    

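    The nested function units(input_instance) is the solution to my problem: it reads the .km attribute of each geodesic object, so the matrix holds floats before the diagonal is summed.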