I want to run a kNN classification with k = 3. I would like to predict the dependent variable “diabetes” in the validation set using the training set, and then calculate the accuracy.
However, I get this error message:
Error in knn(train = TrainXNormDF, test = ValidXNormDF, cl = MLdata2[, : 'train' and 'class' have different lengths
I tried to work around it with the following approach, but it did not solve the problem:
for(i in ((length(MLValidY) + 1):length(TrainXNormDF)))+(MLValidY = c(MLValidY, 0))
What can I do about it? Please help.
My code is below:
install.packages("mlbench")
install.packages("gbm")
library(mlbench)
library(gbm)
data("PimaIndiansDiabetes2")
head(PimaIndiansDiabetes2)
MLdata <- as.data.frame(PimaIndiansDiabetes2)
head(MLdata)
str(MLdata)
View(MLdata)
any(is.na(MLdata))
sum(is.na(MLdata))
MLdata2 <- na.omit(MLdata)
any(is.na(MLdata2))
sum(is.na(MLdata2))
View(MLdata2)
MLIdx <- sample(1:3, size = nrow(MLdata2), prob = c(0.6, 0.2, 0.2), replace = TRUE)
MLTrain <- MLdata2[MLIdx == 1,]
MLValid <- MLdata2[MLIdx == 2,]
MLTest <- MLdata2[MLIdx == 3,]
head(MLTrain)
head(MLValid)
head(MLTest)
str(MLTrain)
str(MLValid)
str(MLTest)
View(MLTest)
MLTrainX <- MLTrain[ , -9]
MLValidX <- MLValid[ , -9]
MLTestX <- MLTest[ , -9]
MLTrainY <- as.data.frame(MLTrain[ , 9])
MLValidY <- as.data.frame(MLValid[ , 9])
MLTestY <- as.data.frame(MLTest[ , 9])
View(MLTrainX)
View(MLTrainY)
library(caret)
NormValues <- preProcess(MLTrainX, method = c("center", "scale"))
TrainXNormDF <- predict(NormValues, MLTrainX)
ValidXNormDF <- predict(NormValues, MLValidX)
TestXNormDF <- predict(NormValues, MLTestX)
head(TrainXNormDF)
head(ValidXNormDF)
head(TestXNormDF)
install.packages('FNN')
library(FNN)
library(class)
NN <- knn(train = TrainXNormDF,
test = ValidXNormDF,
cl = MLValidY,
k = 3)
Thank you
Your cl variable is not the same length as your train variable. MLValidY only has 74 observations, while TrainXNormDF has 224. cl should provide the true classification for every row in your training set.
Furthermore, cl is a data.frame instead of a vector.
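A minimal sketch of both points, using a toy data frame as a stand-in (the names train_x, train_y_df and train_y are hypothetical, not objects from the question). It also shows why the for loop in the question does not help: for a data frame, length() counts columns, not rows, and padding the labels with 0 would not supply true classes anyway.

```r
# Toy stand-in for the training predictors (hypothetical values)
train_x <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1))

# Wrapping the labels in as.data.frame() gives a one-column data frame
train_y_df <- as.data.frame(factor(c("neg", "pos", "neg", "pos")))

# Pull the single column back out to get a plain factor vector
train_y <- train_y_df[[1]]

nrow(train_x)       # number of rows (observations): 4
length(train_x)     # number of COLUMNS for a data frame: 2
is.factor(train_y)  # TRUE
length(train_y) == nrow(train_x)  # TRUE: one label per training row
```

The last comparison is the condition knn() needs for its train and cl arguments.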
Try the following:
NN <- knn(train = TrainXNormDF,
          test = ValidXNormDF,
          cl = MLTrainY$`MLTrain[, 9]`,
          k = 3)
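Since the question also asks for the accuracy, here is a rough, self-contained sketch of the whole flow. It makes several assumptions: the built-in iris data stands in for MLdata2 so that it runs without extra packages, base R scale() stands in for caret's preProcess(), and all variable names are illustrative only.

```r
library(class)  # provides knn(); part of a standard R installation

set.seed(42)  # make the random split reproducible

# Stand-in data: iris replaces MLdata2 (assumption for this sketch)
dat <- iris
idx <- sample(1:3, size = nrow(dat), prob = c(0.6, 0.2, 0.2), replace = TRUE)

train_x <- dat[idx == 1, -5]
valid_x <- dat[idx == 2, -5]
train_y <- dat[idx == 1, 5]  # factor vector, one label per training row
valid_y <- dat[idx == 2, 5]  # true labels, used only for scoring

# Normalise both sets with the training set's means and standard deviations
mu <- colMeans(train_x)
sdv <- apply(train_x, 2, sd)
train_norm <- scale(train_x, center = mu, scale = sdv)
valid_norm <- scale(valid_x, center = mu, scale = sdv)

# cl holds the TRAINING labels; the validation labels are not passed to knn()
pred <- knn(train = train_norm, test = valid_norm, cl = train_y, k = 3)

# Confusion table and accuracy (share of validation rows predicted correctly)
print(table(predicted = pred, actual = valid_y))
accuracy <- mean(pred == valid_y)
print(accuracy)
```

With the objects from the question, the equivalent last step would be mean(NN == MLValid[, 9]) after running the corrected knn() call.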