I am trying to fit a KNN regression model to the faithful data in R. My code is:
smp_size <- floor(0.5 * nrow(faithful))
set.seed(123)
train_ind <- sample(seq_len(nrow(faithful)), size = smp_size)
train_data = faithful[train_ind, ]
test_data = faithful[-train_ind, ]
pred = FNN::knn.reg(train = train_data[,1],
                    test = test_data[,1],
                    y = train_data[,2], k = 5)$pred
The faithful data has only 2 columns. I get this error: "Error in get.knnx(train, test, k, algorithm) : Number of columns must be same!."
I don't understand why this error occurs, because the train and test data have the same columns.
Thanks in advance for any answers!
?knn.reg says that train and test have to be a data frame or a matrix. In your case you have just one independent variable, so train_data[,1] is no longer a data frame: str(train_data[,1]) shows a plain numeric vector. The solution is to wrap the train and test arguments of knn.reg in as.data.frame.
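To make the point above concrete, here is a minimal sketch of the dimension drop and one way around it. It uses drop = FALSE, which is standard R indexing but is my substitution rather than the as.data.frame approach named above. The names x_vec and x_df are hypothetical, and the knn.reg call only runs if FNN is installed.

```r
# Selecting a single column with [, 1] silently drops the data frame to a vector.
x_vec <- faithful[, 1]
is.data.frame(x_vec)  # FALSE
is.null(dim(x_vec))   # TRUE: a plain vector has no rows or columns

# drop = FALSE keeps the result as a one-column data frame.
x_df <- faithful[, 1, drop = FALSE]
is.data.frame(x_df)   # TRUE

# Same 50/50 split as in the question.
smp_size <- floor(0.5 * nrow(faithful))
set.seed(123)
train_ind <- sample(seq_len(nrow(faithful)), size = smp_size)
train_data <- faithful[train_ind, ]
test_data <- faithful[-train_ind, ]

# Assumption: FNN is installed; the call is skipped otherwise.
if (requireNamespace("FNN", quietly = TRUE)) {
  pred <- FNN::knn.reg(train = train_data[, 1, drop = FALSE],
                       test = test_data[, 1, drop = FALSE],
                       y = train_data[, 2], k = 5)$pred
  print(head(pred))
}
```

Wrapping each argument in as.data.frame should behave the same way here; drop = FALSE just avoids the conversion step.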
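Another important point: you should normalize your data before running KNN, because the method is distance-based and features on larger scales would otherwise dominate. (With a single predictor, as here, scaling does not change which neighbors are found, but it is a good habit once you have several features.) You could try the snippet below as a minor improvement to your code: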
library(FNN)
set.seed(123)
# normalize the predictors (every column except the last)
X <- scale(faithful[, -ncol(faithful)])
# the response is the last column
y <- faithful[, ncol(faithful)]
# split the row indices into train (70%) and test (30%)
train_ind <- sample(seq_len(nrow(faithful)), floor(0.7 * nrow(faithful)))
test_ind <- setdiff(seq_len(nrow(faithful)), train_ind)
# run the KNN model; as.data.frame() keeps train and test two-dimensional
knn_model <- knn.reg(train = as.data.frame(X[train_ind, ]),
                     test = as.data.frame(X[test_ind, ]),
                     y = y[train_ind],
                     k = 5)
pred <- knn_model$pred
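One optional refinement, offered as a sketch rather than a required change: the snippet above scales the whole data set before splitting, so the test rows influence the centering and scaling. A common alternative is to compute the mean and standard deviation on the training rows only and reuse them for the test rows. The variable names below (x_train, x_test, ctr, scl) are my own, and the knn.reg call assumes FNN is installed.

```r
set.seed(123)
n <- nrow(faithful)
train_ind <- sample(seq_len(n), floor(0.7 * n))

# drop = FALSE keeps the single predictor as a one-column data frame.
x_train <- faithful[train_ind, "eruptions", drop = FALSE]
x_test <- faithful[-train_ind, "eruptions", drop = FALSE]
y_train <- faithful[train_ind, "waiting"]

# Scale the training predictor and remember the statistics scale() used.
x_train_sc <- scale(x_train)
ctr <- attr(x_train_sc, "scaled:center")
scl <- attr(x_train_sc, "scaled:scale")

# Apply the training mean and standard deviation to the test predictor.
x_test_sc <- scale(x_test, center = ctr, scale = scl)

# Assumption: FNN is installed; the call is skipped otherwise.
if (requireNamespace("FNN", quietly = TRUE)) {
  pred <- FNN::knn.reg(train = x_train_sc, test = x_test_sc,
                       y = y_train, k = 5)$pred
  print(head(pred))
}
```

Because scale() returns a matrix, the scaled predictors can be passed to knn.reg directly without as.data.frame.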
Hope this helps!