KNN Mahalanobis error - size of V does not match - Python


I am trying to implement a KNN model using Mahalanobis as the distance metric, but when I execute the code I get an error:

ValueError: Mahalanobis dist: size of V does not match

where V is the covariance matrix of features.

Relevant parts of my code below:

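import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10, stratify=y)

knn2 = KNeighborsClassifier(n_neighbors=20, metric='mahalanobis', metric_params={'V': np.cov(X_train)})

knn2.fit(X_train, y_train)  # this is the line that causes the error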

I have looked at the distance metric code in sklearn's GitHub repo (the Mahalanobis implementation starts at line 628) and can see that the error is raised here:

    cdef inline DTYPE_t rdist(self, DTYPE_t* x1, DTYPE_t* x2,
                              ITYPE_t size) nogil except -1:
        if size != self.size:
            with gil:
                raise ValueError('Mahalanobis dist: size of V does not match')

I've worked out what self.size is in my case, but can't work out what size is.

Could anyone help with this error?

Thanks


Solution

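  • Pass the argument rowvar=False to np.cov and it should work. By default np.cov treats each row as a variable, so np.cov(X_train) returns an n_samples x n_samples matrix, whereas the Mahalanobis metric expects V to be n_features x n_features. Your knn constructor should look like this: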

    knn2 = KNeighborsClassifier(n_neighbors=20, metric='mahalanobis', metric_params={'V': np.cov(X_train, rowvar=False)})
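
To see why rowvar matters, here is a minimal sketch that compares the two covariance shapes. The data is synthetic: a random 70 x 4 array stands in for X_train, and the names V_wrong and V_right are made up for illustration. As far as I can tell from the quoted check, size is the number of features in the vectors being compared and self.size is the dimension of V, so the two only agree when V is n_features x n_features.

```python
import numpy as np

# Synthetic stand-in for X_train: 70 samples (rows), 4 features (columns).
rng = np.random.default_rng(10)
X_train = rng.normal(size=(70, 4))

# Default rowvar=True treats each ROW as a variable, so the result is
# n_samples x n_samples.
V_wrong = np.cov(X_train)

# rowvar=False treats each COLUMN as a variable, which gives the
# n_features x n_features matrix a Mahalanobis metric needs.
V_right = np.cov(X_train, rowvar=False)

n_features = X_train.shape[1]
print(V_wrong.shape, V_wrong.shape[0] == n_features)  # (70, 70) False
print(V_right.shape, V_right.shape[0] == n_features)  # (4, 4) True
```

Transposing the input, np.cov(X_train.T), gives the same matrix as rowvar=False, so either form should work in metric_params.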