List with the names of points in a given radius for all rows of the dataframe

I have a dataframe like:

projectName latitude    longitude
a          56.864229    60.609576
b          55.810413    37.701168
c          55.924912    37.966033
d          56.804987    60.590667
e          55.806000    37.569863

For each point, I want the list of the other points within a given radius. For example, for 30 km the result should look like this:

projectName latitude    longitude   30km
a          56.864229    60.609576  [d]
b          55.810413    37.701168  [c, e]
c          55.924912    37.966033  [b, e]
d          56.804987    60.590667  [a]
e          55.806000    37.569863  [b, c]

What is the fastest way to compute this?

Solution

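You can compute the pairwise distances with scikit-learn's haversine_distances, which expects [latitude, longitude] in radians and returns distances on the unit sphere, then keep the pairs within the radius: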

import numpy as np
from sklearn.metrics.pairwise import haversine_distances

DIST = 10 # distance in km

tmp = np.radians(df.set_index('projectName')[['latitude', 'longitude']])

# pairwise distances in km (6371 km = mean Earth radius), thresholded to a boolean matrix
keep = haversine_distances(tmp)*6371 <= DIST

# remove self (e.g. a/a)
np.fill_diagonal(keep, False)

# join the names of the points in range into a comma-separated string
df[f'{DIST}km'] = (keep @ (tmp.index+',')).str[:-1]

Output (after running with DIST = 10 and with DIST = 30):

  projectName   latitude  longitude 10km 30km
0           a  56.864229  60.609576    d    d
1           b  55.810413  37.701168    e  c,e
2           c  55.924912  37.966033       b,e
3           d  56.804987  60.590667    a    a
4           e  55.806000  37.569863    b  b,c
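
As a sanity check on the thresholding above, here is a minimal plain-Python sketch of the haversine formula applied to the sample points from the question. The names haversine_km, neighbours and EARTH_RADIUS_KM are made up for this illustration, and the code is not how scikit-learn implements haversine_distances; it only reproduces the same great-circle distance on a small scale.

```python
import math

EARTH_RADIUS_KM = 6371  # mean Earth radius, same constant as in the answer (assumed name)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# sample data from the question: name -> (latitude, longitude)
points = {
    'a': (56.864229, 60.609576),
    'b': (55.810413, 37.701168),
    'c': (55.924912, 37.966033),
    'd': (56.804987, 60.590667),
    'e': (55.806000, 37.569863),
}

def neighbours(points, dist_km):
    """For each point, list the other points within dist_km (O(n^2) loop)."""
    return {
        name: [other for other, q in points.items()
               if other != name and haversine_km(*p, *q) <= dist_km]
        for name, p in points.items()
    }

print(neighbours(points, 30))
# {'a': ['d'], 'b': ['c', 'e'], 'c': ['b', 'e'], 'd': ['a'], 'e': ['b', 'c']}
```

This pure-Python loop is only meant for checking a few values by hand; for real dataframes the vectorized version above is much faster.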

If you want a list:

df[f'{DIST}km'] = [tmp.index[x].tolist() for x in keep]

Output:

  projectName   latitude  longitude 10km    30km
0           a  56.864229  60.609576  [d]     [d]
1           b  55.810413  37.701168  [e]  [c, e]
2           c  55.924912  37.966033   []  [b, e]
3           d  56.804987  60.590667  [a]     [a]
4           e  55.806000  37.569863  [b]  [b, c]
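
The approach above builds the full n × n distance matrix, which may become too large in memory for many rows. One possible alternative, which is not part of the answer above, is scikit-learn's BallTree with metric='haversine' and a radius query. The sketch below assumes the sample dataframe from the question; EARTH_RADIUS_KM and the variable names are chosen for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import BallTree

# sample data from the question
df = pd.DataFrame({
    'projectName': ['a', 'b', 'c', 'd', 'e'],
    'latitude': [56.864229, 55.810413, 55.924912, 56.804987, 55.806000],
    'longitude': [60.609576, 37.701168, 37.966033, 60.590667, 37.569863],
})

EARTH_RADIUS_KM = 6371  # mean Earth radius (assumed constant name)
DIST = 30               # radius in km

# the haversine metric expects [latitude, longitude] in radians
coords = np.radians(df[['latitude', 'longitude']].to_numpy())
tree = BallTree(coords, metric='haversine')

# the radius must be given on the unit sphere, hence the division
idx = tree.query_radius(coords, r=DIST / EARTH_RADIUS_KM)

names = df['projectName'].to_numpy()
# drop each point itself and sort, since query_radius does not order its results
df[f'{DIST}km'] = [
    sorted(names[j] for j in nb if j != i)
    for i, nb in enumerate(idx)
]
print(df)
```

Whether this is actually faster depends on the number of rows and on how many neighbours fall inside the radius, so it is worth timing both versions on your own data.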