python performance vectorization

Haversine vectorization for lists of coordinates


I have a code snippet that computes a distance matrix between two lists of coordinates using the haversine function. The current implementation works, but it combines a for loop with a nested list comprehension, so it calls haversine once per pair of points and becomes slow for large datasets. I am looking for a more efficient alternative that avoids the Python-level loop.

import numpy as np
from haversine import haversine

string_list_1 = [(20.00, -100.1), ...]  # list of (lat, lon) pairs in degrees

string_list_2 = [(21.00, -101.1), ...]  # another list of (lat, lon) pairs

dist_mat = np.zeros((len(string_list_1), len(string_list_2)))

# fill one row per point of the first list (haversine returns km by default)
for i, coord1 in enumerate(string_list_1):
    dist_mat[i, :] = np.array([haversine(coord1, coord2) for coord2 in string_list_2])

I would appreciate suggestions or code examples for a faster, vectorized implementation.


Solution

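  • use haversine_distances from sklearn.metrics.pairwise. It expects [latitude, longitude] pairs in radians and returns distances on the unit sphere, so convert from degrees and multiply by the Earth's radius (about 6371 km) to match the output of haversine:

    import numpy as np
    from sklearn.metrics.pairwise import haversine_distances
    # degrees -> radians, then scale unit-sphere distances to kilometres
    dist_mat = 6371.0088 * haversine_distances(np.radians(string_list_1), np.radians(string_list_2))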
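
If adding scikit-learn as a dependency is not desirable, the same distance matrix can be sketched with plain NumPy broadcasting. This is a minimal sketch, not the haversine package's own code: the function name haversine_matrix and the constant EARTH_RADIUS_KM are hypothetical names introduced here, the inputs are assumed to be (lat, lon) pairs in degrees, and the result is in kilometres for the assumed mean Earth radius.

```python
import numpy as np

# Assumed mean Earth radius in km; change it to get other units.
EARTH_RADIUS_KM = 6371.0088


def haversine_matrix(coords_1, coords_2, radius=EARTH_RADIUS_KM):
    """Pairwise great-circle distances between two lists of (lat, lon) in degrees.

    Returns an array of shape (len(coords_1), len(coords_2)).
    """
    a = np.radians(np.asarray(coords_1, dtype=float))  # shape (n, 2)
    b = np.radians(np.asarray(coords_2, dtype=float))  # shape (m, 2)

    # Column vectors for the first list, row vectors for the second,
    # so every arithmetic operation below broadcasts to shape (n, m).
    lat1, lon1 = a[:, 0][:, None], a[:, 1][:, None]
    lat2, lon2 = b[:, 0][None, :], b[:, 1][None, :]

    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)

    # Clip guards against rounding pushing h slightly outside [0, 1].
    return 2 * radius * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


# Usage with small placeholder lists in the question's (lat, lon) format.
points_1 = [(20.00, -100.1), (20.50, -100.6)]
points_2 = [(21.00, -101.1), (21.50, -101.6), (22.00, -102.1)]
dist_mat = haversine_matrix(points_1, points_2)  # shape (2, 3)
```

Broadcasting builds the full (n, m) matrix in memory at once, so for very large lists it may be necessary to process the first list in chunks.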