Calculating lmer() model within a simulation study: Error in coefficients(): subscript out of bounds

I tried to simulate multilevel data for a simulation study where I systematically vary the ICC. In the following example dgp is the function to simulate data, onesimulation is just one analysis, and manysimulations is where I go through all the simulations-conditions.

The functions seem to work, yet every now and then I get the following Error and the function manysimulations is aborted.

fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
boundary (singular) fit: see ?isSingular
Error in coefficients(summary(model))["zj", c("Estimate", "Std. Error",  : 
  subscript out of bounds
Called from: t(coefficients(summary(model))["zj", c("Estimate", 
    "Std. Error", "Pr(>|t|)")])
Browse[1]>

Now coefficients(summary(model))["zj", c("Estimate","Std. Error", "Pr(>|t|)")] is just where I access the coefficients of the model. Yet it seems sometimes something goes wrong.

Following the code of the simulation:

niter <- 50
#numbers of iterations
ni <- 20
#numbers of student per class
nj <- 10
#numbers of clusters/classes

ICC <- seq(0,.9,.1)
#starting with 0, then going up .1 stepwise till .9, thus generating 10 values

# what is the random intercept standard deviation given icc? 
RI_sd <- sqrt(1 / (1-ICC) - 1) 
# RI_sd


gamma00 <- 0
#is the mean value of the L1 dependent variable, controlling for the L2 predijtor Wj
gamma01 <- 0
#is the effect (slope) of the L2 predictor
gamma10 <- 0
#is the mean value of Level1 slope, controlling for the L2 predictor Wj


sigma2 = 1
#level 1 residual with sigma2 = 1

# design matrix
design_grid <- expand.grid(ni = ni,
                      nj = nj, 
                      RI_sd = RI_sd, 
                      replication = 1:niter)



dgp <- function(ni,nj, RI_sd){
  
  dgp_grid <- expand.grid(
    ni = 1:ni,
    nj = 1:nj,
    xij = NA,
    zj = NA,
    yij = NA,
    Rij = NA,
    U0j = NA
  )
  
  dgp_grid$zj <- rep(rbinom(nj,1,0.5), each = length(1:ni))
  #create a random factorial level 2 predictor, same value for the whole cluster
  
  dgp_grid$U0j <- rep(rnorm(nj, mean = 0, sd = RI_sd), each = length(ni))
  #create level 2 residual
  
  dgp_grid$Rij <- rnorm(nj*nj, mean = 0, sd = sqrt(sigma2))
  # create level 1 residual with sigma2 = 1
  
  dgp_grid$xij <- rbinom(nj*nj,1,0.5)
  # create level 1 explanatory/predictor variable (draw from standard normal)
  
  
  dgp_grid$yij <-  gamma00 + gamma01 * dgp_grid$xij + gamma10 * dgp_grid$zj + dgp_grid$U0j + dgp_grid$Rij
  #creating Yij
  
  return(dgp_grid)
}

test_dgp<-dgp(ni = 10, nj = 5, RI_sd = 5)
#test if function works



onesimulation <- function(ni,nj,RI_sd){
  
  dgp_grid<-dgp(ni,nj,RI_sd)
  #creating the data
  
  model <- lmer(yij ~ xij + zj + (1|nj), data = dgp_grid)
  
  coefs_x <- t(coefficients(summary(model))["xij", c("Estimate", "Std. Error", "Pr(>|t|)")])
  coefs_z <- t(coefficients(summary(model))["zj", c("Estimate", "Std. Error", "Pr(>|t|)")])
  res <- cbind(data.frame(var = "x", coefs_x),
               data.frame(var = "z", coefs_z))
  
  
  colnames(res)[2] ="estimate_x"
  colnames(res)[3] ="std.err_x"
  colnames(res)[4] ="p_x"
  colnames(res)[6] ="estimate_z"
  colnames(res)[7] ="std.err_z"
  colnames(res)[8] ="p_z"
  
  return(res)
  
}

test_onesimulation<-onesimulation(ni = 10, nj = 5, RI_sd = 5)
#test if the function works

manysimulations <- function(ni, nj, RI_sd){
  
  sim_results <- expand.grid(ni = ni,
                             nj = nj, 
                             RI_sd = RI_sd, 
                             replication = 1:niter,
                             estimate_x = NA,
                             std.err_x = NA,
                             p_x = NA,
                             estimate_z = NA,
                             std.err_z = NA,
                             p_z = NA
                             )
  
  
  for(i in 1:nrow(sim_results)){
    res<-onesimulation(ni = sim_results$ni[i], nj = sim_results$nj[i], RI_sd = sim_results$RI_sd[i])
    # analysisgrid[i,2:ncol(analysisgrid)] <- res
    # print(res)
    sim_results$estimate_x[i] <- res$estimate_x
    sim_results$std.err_x[i] <- res$std.err_x
    sim_results$p_x[i] <- res$p_x
    sim_results$estimate_z[i] <- res$estimate_z
    sim_results$std.err_z[i] <- res$std.err_z
    sim_results$p_z[i] <- res$p_z
  }

  return(sim_results)
}

sim_results <- manysimulations(ni = ni, nj = nj, RI_sd = RI_sd)

I don't know why that is, nor do I know when the error occurs.

It seems that within coefficients(), something can be assigned...

Have any of you already encountered that error?

Thanks!

Solution

The way to avoid this problem without needing tryCatch() is to detect the problem upstream, in your case something like:

cc <- coefficients(summary(model))
if (!"zj" %in% rownames(cc)) {
  coefs_z <- rep(NA, 3)  ## I *think* this is the right shape/length?
} else {
  coefs_z <- t(cc["zj", c("Estimate", "Std. Error", "Pr(>|t|)")])
}

You may find that broom.mixed::tidy() gives you results in a slightly more convenient format ...