Creating function to run k-fold cross validation on glmer object (Leave One Out Cross-Validation)

I am trying to write a function that runs k-fold cross-validation on a glmer object. The data below is an example set I found online (my own dataset is quite large), so the model isn't the best, but if I can get this working here I should be able to switch to my own dataset quite easily.

I want to do LOOCV (Leave-One-Out Cross-Validation):

"LOOCV(Leave One Out Cross-Validation) is a type of cross-validation approach in which each observation is considered as the validation set and the rest (N-1) observations are considered as the training set."

The outline I used comes from Caroline's answer on this ResearchGate thread.

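#load libraries
library(lme4)    # glmer, glmerControl
library(dplyr)   # select, mutate, row_number, %>%
library(optimx)  # needed for optimizer = 'optimx'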

#add example data 
Data <- read.csv("")
Data <- select(Data, remission, IL6, CRP, DID)
Data$remission<- as.factor(Data$remission)
Data$DID<- as.factor(Data$DID)

#add ROW column  
Data <- Data %>% mutate(ROW = row_number())
for (i in 1:8825) { # i in total number of observations in dataset 
  ##Data that will be predicted
  ###To train the model
  M1 <- glmer(remission ~ 1 + IL6 + CRP + (1 | DID), data = DataCV, family = binomial,
              control = glmerControl(optimizer = 'optimx', optCtrl = list(method = 'L-BFGS-B')))
  P1 <- predict(M1, DataC1)
  PTOT <- c(PTOT, P1)
}


This is the error I get: "Error: Invalid grouping factor specification, DID"


  • The immediate problem is that DataCV is empty, so glmer has no rows in which to find the grouping factor DID.

    For example:

    i <- 1  ## first time through the loop

    I think the subsetting condition should have used DataC$ROW, not DataC1$ROW.
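To make the mechanism concrete, here is a minimal sketch on a toy data frame. The two subsetting lines are a hypothetical reconstruction of the kind of split that goes wrong, not your exact code, and the names DataCV_bad and DataCV_ok are made up for the illustration.

```r
# Toy stand-in for the real data; ROW numbers the observations.
DataC <- data.frame(ROW = 1:5, y = c(0, 1, 0, 1, 1))

i <- 1  # first time through the loop

# Held-out observation: a one-row data frame.
DataC1 <- DataC[DataC$ROW == i, ]

# Hypothetical buggy split: the condition is evaluated on DataC1, so it is a
# single FALSE, which R recycles across every row of DataC and selects nothing.
DataCV_bad <- DataC[DataC1$ROW != i, ]

# Intended split: evaluate the condition on the full data set.
DataCV_ok <- DataC[DataC$ROW != i, ]

nrow(DataCV_bad)  # 0
nrow(DataCV_ok)   # 4
```

Checking nrow() of the training set inside the loop is a cheap way to catch this kind of problem before the model is fitted.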

    A few other comments: a more compact version of your code would look something like this:

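    ## fit the full model
    M1 <- glmer(remission ~ 1 + IL6 + CRP + (1 | DID), data = DataC,
       family = binomial,
       control = glmerControl(optimizer = 'optimx', optCtrl = list(method = 'L-BFGS-B')))
    ## observed outcome as 0/1 (remission is a factor; assumes its levels are "0" and "1")
    obs <- as.numeric(as.character(DataC$remission))
    res <- numeric(nrow(DataC))
    for (i in seq_len(nrow(DataC))) {
       ## refit without row i, then predict the held-out row on the probability scale
       new_fit <- update(M1, data = DataC[-i, ])
       res[i] <- (predict(new_fit, newdata = DataC[i, ], type = "response") - obs[i])^2
    }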

    For a well-specified model LOOCV is asymptotically equivalent to AIC, so you might be doing a lot of work to get something that's not very different from the AIC (which you can get directly from a single model fit) ...
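As a rough, self-contained illustration of that trade-off, the sketch below runs the same refit-and-predict loop on the cbpp data set that ships with lme4 (chosen here only because it is small and needs no download; it is not the data from the question) and then pulls the AIC from the single full fit. The variable names are made up for the example.

```r
library(lme4)

# cbpp ships with lme4: disease incidence out of `size` animals per herd and period.
full_fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                  data = cbpp, family = binomial)

obs <- cbpp$incidence / cbpp$size   # observed proportion in each row
sq_err <- numeric(nrow(cbpp))

for (i in seq_len(nrow(cbpp))) {
  # Refit without row i, then predict the held-out row on the probability scale.
  fit_i <- update(full_fit, data = cbpp[-i, ])
  p_i <- predict(fit_i, newdata = cbpp[i, ], type = "response",
                 allow.new.levels = TRUE)  # in case a herd drops out of the training rows
  sq_err[i] <- (p_i - obs[i])^2
}

loocv_mse <- mean(sq_err)   # LOOCV estimate of squared prediction error (one refit per row)
full_aic  <- AIC(full_fit)  # AIC from the single full fit, no refitting needed
```

The two numbers are on different scales (mean squared error of a probability versus a penalized deviance), so they are not directly comparable. The point is only that the loop costs one model fit per observation, while AIC() needs just the one fit you already have.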