Is there a simple command to do leave-one-out cross validation with the lm() function?


Is there a simple command to do leave-one-out cross validation with the lm() function in R?

Specifically, is there a simple command that does what the code below does?

x <- rnorm(1000,3,2)
y <- 2*x + rnorm(1000)

pred_error_sq <- c(0)
for(i in 1:1000) {
  x_i <- x[-i]
  y_i <- y[-i]
  mdl <- lm(y_i ~ x_i) # leave i'th observation out
  y_pred <- predict(mdl, data.frame(x_i = x[i])) # predict i'th observation
  pred_error_sq <- pred_error_sq + (y[i] - y_pred)^2 # cumulate squared prediction errors
}

y_squared <- sum((y - mean(y))^2) # total sum of squares, on the same scale as pred_error_sq

R_squared <- 1 - (pred_error_sq/y_squared) # Measure for goodness of fit

Solution

  • Another solution is to use the caret package:

    library(caret)
    
    x <- rnorm(1000, 3, 2) # create x first: y = 2*x inside data.frame() cannot see the x column being built
    data <- data.frame(x = x, y = 2*x + rnorm(1000))
    
    train(y ~ x, method = "lm", data = data, trControl = trainControl(method = "LOOCV"))
    

    Linear Regression

    1000 samples
       1 predictor

    No pre-processing
    Resampling: Leave-One-Out Cross-Validation
    Summary of sample sizes: 999, 999, 999, 999, 999, 999, ...
    Resampling results:

      RMSE      Rsquared  MAE
      1.050268  0.940619  0.836808

    Tuning parameter 'intercept' was held constant at a value of TRUE
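
This is not part of the answer above, but for a plain lm() fit the loop can probably be avoided in base R. For ordinary least squares, the leave-one-out prediction error of observation i equals its ordinary residual divided by (1 - h_ii), where h_ii is its hat value. A minimal sketch, assuming the simulated data from the question; the names loo_resid, press, sst and loo_explicit are illustrative only:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(1000, 3, 2)
y <- 2 * x + rnorm(1000)

fit <- lm(y ~ x)

# Leave-one-out residuals without refitting: e_i / (1 - h_ii),
# where h_ii are the diagonal entries of the hat matrix.
loo_resid <- residuals(fit) / (1 - hatvalues(fit))

press <- sum(loo_resid^2)          # sum of squared leave-one-out errors
sst   <- sum((y - mean(y))^2)      # total sum of squares
loo_r_squared <- 1 - press / sst   # the quantity the loop in the question aims at
loo_rmse <- sqrt(mean(loo_resid^2))

# Cross-check against explicit refits for the first five observations
loo_explicit <- sapply(1:5, function(i) {
  d <- data.frame(x = x[-i], y = y[-i])
  y[i] - predict(lm(y ~ x, data = d), data.frame(x = x[i]))
})
all.equal(unname(loo_explicit), unname(loo_resid[1:5]))
```

The shortcut is exact only for least-squares fits such as lm(). Note also that caret, as far as I know, reports Rsquared as the squared correlation between held-out predictions and observed values, so it need not match 1 - press/sst exactly.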