Search code examples
rplotlme4mixed-modelslsmeans

Plot least-squares means for groups of factor levels


I'm looking for a simple way to extract (and plot) the least-squares means of specified combinations of levels of one factor, for each level of another factor.

Example data:

set.seed(1)
model.data <- data.frame(time = factor(paste0("day", rep(1:8, each = 16))),
  animal = factor(rep(1:16, each = 8)),
  tissue = factor(c("blood", "liver", "kidney", "brain")),
  value = runif(128)
  )

Setting up custom contrasts for factor "time":

library("phia")
custom.contrasts <- as.data.frame(contrastCoefficients(
   time ~ (day1+day2+day3)/3 - (day4+day5+day6)/3,
   time ~ (day1+day2+day3)/3 - (day7+day8)/2,
   time ~ (day4+day5+day6)/3 - (day7+day8)/2,
   data = model.data, normalize = FALSE))

colnames(custom.contrasts) <- c("early - late",
  "early - very late",
  "late  - very late")

custom.contrasts.lsmc <- function(...) return(custom.contrasts)

Fitting the model and calculating the least-squares means:

library("lme4")
tissue.model <- lmer(value ~ time * tissue + (1|animal), model.data)
library("lsmeans")
tissue.lsm <- lsmeans(tissue.model, custom.contrasts ~ time | tissue)

Plotting:

plot(tissue.lsm$lsmeans)
dev.new()
plot(tissue.lsm$contrasts)

Now, the second plot has the combinations that I want, but it shows the difference between the combined means, rather than the means themselves.

I can get the individual values from tissue.lsm$lsmeans and calculate the combined means myself, but I have the nagging feeling that there is an easier way that I just don't see. All the data should be in the lsmobj, after all.

early.mean.liver = mean(model.data$value[model.data$tissue == "liver" & 
  model.data$time %in% c("day1", "day2", "day3")])
late.mean.liver = mean(model.data$value[model.data$tissue == "liver" & 
  model.data$time %in% c("day4", "day5", "day6")])
vlate.mean.liver = mean(model.data$value[model.data$tissue == "liver" & 
  model.data$time %in% c("day7", "day8")])
# ... for each level of "tissue"


#compare to tissue.lsm$contrasts
early.mean.liver - late.mean.liver 
early.mean.liver - vlate.mean.liver
late.mean.liver - vlate.mean.liver

I'm looking forward to hearing your comments or suggestions. Thanks!


Solution

  • One alternative is to calculate the contrast coefficients for the group means of interest in addition to the contrast coefficients for the differences in group means that you calculated in custom_contrasts. For example, you could do this separately as custom.contrasts2.

    custom.contrasts2 <- as.data.frame(contrastCoefficients(
        time ~ (day1+day2+day3)/3,
        time ~ (day4+day5+day6)/3,
        time ~ (day7+day8)/2,
        data = model.data, normalize = FALSE))
    
    colnames(custom.contrasts2) <- c("early",
                              "late",
                              "very late")
    
    custom.contrasts2.lsmc <- function(...) return(custom.contrasts2)
    
    lsmeans(tissue.model, custom.contrasts2 ~ time | tissue)$contrasts
    

    Here are just the outputs for liver, which are the group means you are after.

    ...
     tissue = liver:
     contrast    estimate         SE   df t.ratio p.value
     early      0.4481244 0.07902715 70.4   5.671  <.0001
     late       0.4618041 0.07902715 70.4   5.844  <.0001
     lvery late 0.3824247 0.09678810 70.4   3.951  0.0002
    

    If you know you want both the group means and differences in group means, you can just add to the contrast coefficients matrix you create viacontrastCoefficients.

    custom.contrasts <- as.data.frame(contrastCoefficients(
        time ~ (day1+day2+day3)/3,
        time ~ (day4+day5+day6)/3,
        time ~ (day7+day8)/2,
        time ~ (day1+day2+day3)/3 - (day4+day5+day6)/3,
        time ~ (day1+day2+day3)/3 - (day7+day8)/2,
        time ~ (day4+day5+day6)/3 - (day7+day8)/2,
        data = model.data, normalize = FALSE))
    

    And then name and make the .lsmc function accordingly.