
5-fold cross-validation


I used the following code to perform 5-fold cross-validation on the Davis dataset from the carData package.

install.packages("caret")
library(caret)
library(carData)  # provides the Davis dataset
trainControl<-trainControl(method="cv",number=5)
lm<-train(weight~height+repht+repwt,Davis,method="lm",trControl=trainControl)
lm

When I run this, I get an error reporting missing values. This is the error message:

Error in na.fail.default(list(weight = c(77L, 58L, 53L, 68L, 59L, 76L, : missing values in object

I would be very grateful for any suggestions on how to solve this problem. Thanks in advance!


Solution

  • You have missing values in the variables used in the model. The same error can be reproduced with mtcars:

    library(caret)
    data = mtcars
    data$mpg[c(3,6,9)]<-NA
    trainControl<-trainControl(method="cv",number=5)
    fit<-train(mpg~cyl+hp,data,method="lm",trControl=trainControl)
    
    Error in na.fail.default(list(mpg = c(21, 21, NA, 21.4, 18.7, NA, 14.3,  : 
      missing values in object
    

    Use complete.cases() to keep only the rows with complete observations:

    complete.obs = complete.cases(data[,c("mpg","cyl","hp")])
    data = data[complete.obs,]
    fit<-train(mpg~cyl+hp,data,method="lm",trControl=trainControl)
    

    In your case, it should be:

    complete.obs = complete.cases(Davis[,c("weight","height","repht","repwt")])
    Davis = Davis[complete.obs,]
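
Putting the steps above together for Davis, here is a minimal sketch. It assumes the carData and caret packages are installed. The names `model_vars`, `na_counts`, `davis_complete` and `cv_control` are made up for illustration, and the seed is arbitrary. It first counts the NAs per column so you can see which variables cause the error, then fits on the filtered copy and leaves the original Davis untouched.

```r
# Assumes the carData and caret packages are already installed.
library(carData)  # provides the Davis dataset
library(caret)

# Variables used in the model formula (illustrative helper name)
model_vars <- c("weight", "height", "repht", "repwt")

# Count missing values per column to see which variables trigger na.fail
na_counts <- colSums(is.na(Davis[, model_vars]))
print(na_counts)

# Keep only the rows that are complete for the model variables
davis_complete <- Davis[complete.cases(Davis[, model_vars]), ]

# Arbitrary seed so the fold assignment is reproducible
set.seed(1)

# Named cv_control rather than trainControl to avoid masking the function
cv_control <- trainControl(method = "cv", number = 5)

fit <- train(weight ~ height + repht + repwt,
             data = davis_complete,
             method = "lm",
             trControl = cv_control)
print(fit)
```

As far as I know, the formula interface of train() also accepts an na.action argument whose default is na.fail, which is where the error comes from. Passing na.action = na.omit should drop the incomplete rows as well, but filtering explicitly makes it easier to see how many rows were removed.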