I am trying to use the Minkowski distance with per-feature weights, but the sklearn metrics do not seem to allow this. I tried pdist and cdist from scipy, but these calculate the distances beforehand!
import pandas as pd
from scipy.spatial.distance import minkowski
from sklearn.neighbors import NearestNeighbors

X = pd.read_csv('.file.csv')
weights = [1] * X.shape[1]  # filled with 1's for now

nbrs = NearestNeighbors(
    algorithm='brute',
    metric=minkowski(u, v, p=1, w=weights),
    n_jobs=-1,
).fit(X)
distances, indices = nbrs.kneighbors(X=X, n_neighbors=50, return_distance=True)
This raises:
"NameError: name 'u' is not defined"
Yet callable(minkowski) returns True!
I know I'm not defining u and v, so the error is not surprising. The documentation is a bit thin on using metrics other than those built into sklearn. How can I use a weighted metric from scipy, for example?
The problem is the way you are trying to pass the weights. Writing minkowski(u, v, p=1, w=weights) calls the function immediately, and u and v do not exist in your code. sklearn passes those two vectors to the metric callable internally, so you should not include them yourself. Instead, use functools.partial to create a partial function from minkowski with the values of p and w predefined.
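from functools import partial
from scipy.spatial.distance import minkowski

# Freeze p and w; sklearn supplies u and v when it calls the metric
w_minkowski = partial(minkowski, p=1, w=weights)
nbrs = NearestNeighbors(algorithm='brute', metric=w_minkowski, n_jobs=-1)
nbrs.fit(X)
...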
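To check that the partial really behaves as a weighted metric, here is a minimal, self-contained sketch. It is not the asker's data: the random array X and the weights [1.0, 2.0, 0.5] are made-up stand-ins for the CSV, and it assumes every column is numeric. It runs the search and then compares one returned distance with a direct call to scipy's minkowski.

```python
from functools import partial

import numpy as np
from scipy.spatial.distance import minkowski
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in for the CSV data (assumption: all columns are numeric).
rng = np.random.default_rng(0)
X = rng.random((20, 3))

# Hypothetical weights, one per column.
weights = np.array([1.0, 2.0, 0.5])

# Freeze p and w; the resulting callable still takes the two vectors u and v.
w_minkowski = partial(minkowski, p=1, w=weights)

nbrs = NearestNeighbors(algorithm='brute', metric=w_minkowski)
nbrs.fit(X)
distances, indices = nbrs.kneighbors(X, n_neighbors=5, return_distance=True)

# Cross-check: distance from row 0 to its nearest other row, computed directly.
i = 0
j = indices[i, 1]
direct = minkowski(X[i], X[j], p=1, w=weights)
manual = np.sum(weights * np.abs(X[i] - X[j]))  # weighted L1 written out by hand
print(distances[i, 1], direct, manual)
```

The three printed values should agree up to floating-point error. Keep in mind that a Python callable is invoked once per pair of rows, so this is likely to be much slower than sklearn's built-in metrics on large data.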