I'm trying to use cosine_similarity as the metric for a KNN classifier, without success.
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=10, metric=cosine_similarity).fit(x, y)
Shape of x (150 samples with 4 features):
(150, 4)
Shape of y:
(150,)
I'm getting this error:
ValueError: Expected 2D array, got 1D array instead
I have tried to reshape x with reshape(-1, 1) or reshape(1, -1), with no success.
How can I run a KNN classifier on this dataset (x has 4 features) with cosine_similarity?
The problem is that the cosine metric is only supported by the brute-force variant of the nearest neighbor algorithm; the tree-based variants (ball tree and KD tree) cannot use it. You have two options here to make this work:
Option 1: Explicitly request the brute-force algorithm with algorithm='brute':
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import KNeighborsClassifier
X, y = make_classification(n_samples=150, n_features=4, random_state=42)
knn = KNeighborsClassifier(n_neighbors=10, algorithm='brute', metric=cosine_similarity)
knn.fit(X, y)
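One caveat with passing the function object itself: as far as I can tell, scikit-learn calls a callable metric on pairs of 1D sample vectors and treats the result as a distance, whereas cosine_similarity expects 2D arrays and returns a similarity. So fit may succeed with algorithm='brute' while predict still raises the same ValueError. A minimal sketch of a workaround, assuming a small wrapper (cosine_distance_1d is a hypothetical name, not part of scikit-learn) that reshapes its inputs and returns 1 - similarity:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import KNeighborsClassifier


def cosine_distance_1d(a, b):
    # Hypothetical helper: a and b arrive as 1D sample vectors,
    # so reshape each one to a single-row 2D array first.
    sim = cosine_similarity(a.reshape(1, -1), b.reshape(1, -1))[0, 0]
    # Convert similarity to a distance so that "closer" means "more similar".
    return 1.0 - sim


X, y = make_classification(n_samples=150, n_features=4, random_state=42)
knn = KNeighborsClassifier(n_neighbors=10, algorithm='brute', metric=cosine_distance_1d)
knn.fit(X, y)

# Prediction is where the callable metric is actually evaluated.
pred = knn.predict(X[:5])
```

A Python callable is evaluated pair by pair, so this is likely to be much slower than the built-in string metric on larger datasets.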
Option 2: Specify metric='cosine', which will automatically pick the brute-force algorithm:
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
X, y = make_classification(n_samples=150, n_features=4, random_state=42)
knn = KNeighborsClassifier(n_neighbors=10, metric='cosine')
knn.fit(X, y)
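To see what metric='cosine' actually measures, here is a small check, assuming the same toy data as above: the distances returned by kneighbors should equal 1 minus the cosine similarity between the query point and each of its neighbors.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=150, n_features=4, random_state=42)
knn = KNeighborsClassifier(n_neighbors=10, metric='cosine').fit(X, y)

# Look up the 10 nearest neighbors of the first sample.
query = X[:1]
dist, idx = knn.kneighbors(query)

# Cosine distance is defined as 1 - cosine similarity.
expected = 1.0 - cosine_similarity(query, X[idx[0]])
matches = np.allclose(dist, expected)
print(matches)
```

Since the string metric is a distance rather than a similarity, the most similar points come first in the neighbor list.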
If you want to read more about the different nearest neighbor algorithms, you can refer to the scikit-learn user guide.