Search code examples
rfunction

Error in my function with emmeans (no variable named . in the reference grid)


I first run the multilevel model for analysis, which has no issue.

mlm4 <- lmer(y ~ grp + t + (1|id), data=d1)
summary(mlm4)

Now, I wrote my own function to obtain some details about the results and diagnosis. However, the last parts with emmeans() do not work. The error message says, "Error in emmeans(p_m, ~p_t | p_g, at = list(p_t = c(-700, -365, 0, 365, : No variable named p_t in the reference grid"

This is my first time writing my own function in R. Please advise!

mlm_sum <- function(p_m, p_t, p_g) {
  sum <- summary(p_m)
  ci <- confint(p_m)
  
  # variance-covariance matric
  mat <- Matrix::bdiag(VarCorr(p_m))
  
  # diagnosis
  pred_res <- data.frame(predicted=predict(p_m), residual=residuals(p_m))
  plot1 <- ggplot(pred_res, aes(x=predicted, y=residual)) + geom_point() + geom_hline(yintercept=0, lty=3)
  plot2 <- ggplot(pred_res, aes(x=residual)) + geom_histogram(bins=20, color="black")
  plot3 <- ggplot(pred_res, aes(sample=residual)) + stat_qq() + stat_qq_line()

  # estimated marginal means
  emm1 <- emmeans(p_m, ~ p_t | p_g, 
          at = list(p_t = c(-700, -365, 0, 365, 700)),
          pbkrtest.limit = 43788)
  
  emm_plot1 <- emmip(p_m, ~ p_t | p_g,
        at = list(p_t = c(-700, -365, 0, 365, 700)),
        pbkrtest.limit = 43788
  )
  
  return(list(sum, ci, mat, plot1, plot2, plot3, emm1, emm_plot1))
}

mlm_sum(mlm4, t, g)

working function returning model summary, 95% CI, diagnostic plots, and estiamted marginal means and its plot


Solution

  • This is untested because you haven't given us a reproducible example, but the solution to your problem is probably along the lines of:

    ## convert variables specified as symbols (t, g) to strings
    p_t <- deparse(substitute(p_t))
    p_g <- deparse(substitute(p_g))
    
    ...
    
    ## create formula based on these strings
    ff <- reformulate(paste(p_t, p_g, sep = "|"))
    
    ## call emmeans
    emmeans(ff, ...)
    emmip(ff, ...)
    

    This is all a little bit tricky/clunky because we're doing non-standard evaluation, working with symbols rather than quoted strings. In this code I go from a symbol defined within the function (p_t) to the symbols that were passed as an argument to the function (substitute()), to a string representation of those symbols (deparse). We work with those strings (paste) and then convert them back to a formula (reformulate), which is again a symbolic representation ...

    It might be possible to skip a couple of these steps (e.g. working with quote() or bquote(), but I think this slightly clunky way is easier to understand.