KNN Classifier ValueError: Unknown label type: 'continuous'


We are going to introduce an extra 20-dimensional predictor 𝑧, which does NOT actually play a role in generating 𝑦. In estimation, however, we do not know this and will use both 𝑥 and 𝑧 as predictors in the KNN algorithm.

We need to generate 𝑧, the 20-dimensional predictors, with the same number of rows as the training and test data. Each 𝑧 is a 20-dimensional multivariate normal random variable with mean (0,0,…,0) and identity covariance matrix (so that its 20 elements are independent standard normal random variables). The resulting 𝑧 is a 100 × 20 matrix, with each row being a data point with 20 dimensions.

For a fixed 𝑘=15, fit a KNN model to predict 𝑦 with (𝑥,𝑧), and measure the training and test MSE. (1 mark)

What's wrong in the code below?

import numpy as np
from sklearn import preprocessing
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import KNeighborsClassifier

# beta0, beta1, beta2, beta3 and sigma are assumed to be defined elsewhere

# training data
x = np.arange(0, 5, 0.05)
f_x = beta0 + beta1 * x + beta2 * x**2 + beta3 * x**3
epsilon = np.random.normal(loc=0, scale=sigma, size=100)
y = f_x + epsilon

# test data
x_test = np.arange(0, 6, 0.1)
f_x_test = beta0 + beta1 * x_test + beta2 * x_test**2 + beta3 * x_test**3
epsilon_test = np.random.normal(loc=0, scale=sigma, size=len(x_test))
y_test = f_x_test + epsilon_test

# irrelevant 20-dimensional predictors, one row per data point
z = np.random.multivariate_normal(size=100, mean=[0]*20, cov=np.identity(20))
z_test = np.random.multivariate_normal(size=60, mean=[0]*20, cov=np.identity(20))

# stack x as the first column next to the 20 columns of z
train_x = np.concatenate((np.expand_dims(x, axis=1), z), axis=1)
test_x = np.concatenate((np.expand_dims(x_test, axis=1), z_test), axis=1)

knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(train_x, y)  # raises ValueError: Unknown label type: 'continuous'
y_pred_train = knn.predict(train_x)
y_pred_test = knn.predict(test_x)
mse_train = mean_squared_error(y, y_pred_train)
mse_test = mean_squared_error(y_test, y_pred_test)

Solution

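  • Use KNeighborsRegressor instead of KNeighborsClassifier. The target y is continuous, which a classifier rejects with the "Unknown label type" error; predicting a continuous value is a regression task.

    from sklearn.neighbors import KNeighborsRegressor
    knn_reg_model = KNeighborsRegressor(n_neighbors=15, algorithm='auto').fit(train_x, y)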
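
For reference, here is a minimal end-to-end sketch of the question's pipeline with the regressor swapped in. The question does not give the coefficients or the noise level, so the values of beta0 to beta3 and sigma below are made-up placeholders, as are the seed and the helper name make_data; only the structure (x plus 20 noise columns, k = 15, training and test MSE) follows the question.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical values: the question does not state the coefficients or sigma.
beta0, beta1, beta2, beta3 = 1.0, -2.0, 0.5, 0.1
sigma = 1.0


def make_data(x):
    """Build the (x, z) feature matrix and the noisy cubic target y for a grid x."""
    f_x = beta0 + beta1 * x + beta2 * x**2 + beta3 * x**3
    y = f_x + rng.normal(loc=0.0, scale=sigma, size=len(x))
    # z: 20 independent standard normal columns that play no role in y
    z = rng.multivariate_normal(mean=np.zeros(20), cov=np.identity(20), size=len(x))
    # first column is x, the remaining 20 columns are z
    features = np.column_stack((x, z))
    return features, y


train_x, y = make_data(np.arange(0, 5, 0.05))
test_x, y_test = make_data(np.arange(0, 6, 0.1))

# A regressor accepts a continuous target, unlike KNeighborsClassifier.
knn = KNeighborsRegressor(n_neighbors=15).fit(train_x, y)
y_pred_train = knn.predict(train_x)
y_pred_test = knn.predict(test_x)

mse_train = mean_squared_error(y, y_pred_train)
mse_test = mean_squared_error(y_test, y_pred_test)
print(f"train MSE: {mse_train:.3f}, test MSE: {mse_test:.3f}")
```

Passing y as a 1-D array keeps the predictions 1-D as well; the reshape(-1, 1) in the snippet above also works but returns predictions of shape (n, 1).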