Search code examples
rstatisticssimulationlme4mixed-models

Problem with simulation study by mixed model


I'm trying to run a simulation study using a linear mixed model. As an example and in order to present the same error that I have been facing, I am posting here the structure of my model using Orthodont data as an example. My model is defined as follows:

library(lme4)
head(Orthodont)
model_example <- lmer(distance~age + (1|Subject) + (1|Sex), Orthodont) 

At this stage, I'm extracting the variance component values:

sigma_Subject <- VarCorr(model_example)[[1]][1]; sigma_Subject
sigma_Sex <- VarCorr(model_example)[[2]][1]; sigma_Sex
sigma_error <- (summary(model_example)$sigma)^2; sigma_error

The simulation study is structured as follows:

nboot = 100
sim <- function(nboot,
                dados,
                modelo,
                beta.int,
                beta.age,
                sigma2,
                sigma2_Subject, 
                sigma2_Sex  
){
  # X matriz for fixed effects
  X <- model.matrix(model_example)
  # fixed effects
  beta <- matrix(c(beta.int,
                   beta.age),
                 ncol = 1)
  # Z matrix
  Z <- model.matrix( ~ -1 + Subject + Sex, Orthodont)
  # covariance matrix for error
  R <- sigma2*diag(1, dim(Orthodont)[1])
  # covariance matrix for random effects
  G <- diag(c(sigma2_Subject,
              sigma2_Sex))
  # simulating the error of multivariate normal distribution
  e <- t(mvrnorm(n = nboot, 
                 rep(0, times=dim(dados)[1]),
                 R, 
                 empirical = FALSE))
  # simulating for random effects
  u <- t(mvrnorm(n = nboot, 
                 rep(0, times = 2),
                 G, 
                 empirical = FALSE))
  
  y <- X%*%beta%*%matrix(1, nrow = 1, ncol = nboot) + Z%*%u + e
  return(y)
}
Y <- sim(nboot = 100, #replicates for simulation
         dados = Orthodont,
         modelo = model_example,
         beta.int = model_example@beta[1],
         beta.age = model_example@beta[2],
         sigma2 = 2.04,#Compvar for error
         sigma2_Subject = 3.26, #Compvar for subject
         sigma2_Sex =  2.40 #Compvar for sex
)
data_simulated <- cbind(Orthodont[,c("age",
                             "Subject", "Sex")],
                   Y)

However, I believe that my Z matrix and my G matrix are incorrectly defined, as I have been experiencing the following error: Error in Z %*% u : non-conformable arguments.


Solution

  • There should be some repetitions in Z and G.

    Here is how to get the marginal variance-covariance matrix of the "errors terms" of the model. Thus, to simulate from the model, you'll just have to simulate from a multivariate normal distribution having this variance matrix and with mean obtained from the "fixed part" of the model.

    # marginal variance-covariance matrix
    Vmatrix <- function(vcs, sigma2, reTrms){
      G <- Diagonal(x = rep(vcs, times = diff(reTrms$Gp)))
      Zt <- reTrms$Zt # transpose of the Z matrix
      H <- Diagonal(ncol(Zt)) + t(Zt) %*% G %*% Zt
      sigma2 * H
    }
    

    The vcs argument is for the variance components. To get those estimated from the model, you can do:

    library(lme4)
    fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
    #
    sigma2 <- sigma(fit)^2 
    # variance components
    vcs <- sapply(VarCorr(fit), "[[", 1L) / sigma2
    

    The reTrms argument is obtained with the lFormula function:

    reTrms <- lFormula(formula(fit), data = fit@frame)$reTrms