Search code examples
remmeans

Reconstruct a reference grid with the combination of two variables from emmeans::ref_grid


My real data is of similar complexity to the ideas that emmeans:MOats wants to convey. I am using MOats as a practice example.

library(emmeans)


MOats.lm = lm(yield ~ Block + Variety, data = MOats)
    ref_grid(MOats.lm)
'emmGrid' object with variables:
    Block = VI, V, III, IV, II, I
    Variety = Golden Rain, Marvellous, Victory
    rep.meas = multivariate response levels: 0, 0.2, 0.4, 0.6
    # Silly illustration of how to use 'mult.levs' to make comb's of two factors
    ref_grid(MOats.lm, mult.levs = list(T=LETTERS[1:2], U=letters[1:2]))

Assuming that the Block factor in MOats.lm is not the popular blocking factor in experiment design, but a characteristic of Oat.

Main question: I want to create a new variable from the combination of Variety and Block, called eater with add_grouping syntax, such that if Variety = Golden Rain x Block = I then eater = fox, if Variety = Golden Rain x Block = II then eater = fox, if Variety = Marvellous x Block = II then eater = cat, and so on, to make 12 combinations (12 is just arbitrary, some animals eat more varieties and some only eat one). I think I need to make a dummy variable of Block x Variety and then assign the desired eater. Eventually, I want to make contrasts of eaters in each variety.

eater <- factor(c("fox", "cat","mouse","frog"), levels = c("fox", "cat","frog", "mouse"))

How do I proceed? The add_grouping example had single factor reconstruction only. What if the levels of Block is not divisible by the levels of Variety? For example, Block has 9 levels and Variety has 4 levels.https://rdrr.io/cran/emmeans/man/add_grouping.html

fiber.lm <- lm(strength ~ diameter + machine, data = fiber)
( frg <- ref_grid(fiber.lm) )

# Suppose the machines are two different brands
brands <- factor(c("FiberPro", "FiberPro", "Acme"), levels = c("FiberPro", "Acme"))
( gfrg <- add_grouping(frg, "brand", "machine", brands) )

Side issue: Where did rep.meas = multivariate response levels: 0, 0.2, 0.4, 0.6 come from? There is no such column in View(MOats).

I haven't figured out how to construct a new variable in the form of Factor1 = Factor2*Factor3 from the source code here https://rdrr.io/github/rvlenth/emmeans/src/R/ref-grid.R. Any leads are much appreciated.

UPDATE: The following lines added the new grouping variables but removed the original grouping variables, Variety and Block.

eater <- rep(LETTERS[1:3],6)
RG_add2 <- add_grouping(RG, "eater", "BV", eater)
RG_add2
'emmGrid' object with variables:
    BV = 6 G, 5 G, 3 G, 4 G, 2 G, 1 G, 6 M, 5 M, 3 M, 4 M, 2 M, 1 M, 6 V, 5 V, 3 V, 4 V, 2 V, 1 V
    rep.meas = multivariate response levels: 0.0, 0.2, 0.4, 0.6
    eater = A, B, C
Nesting structure:  BV %in% eater


RG_add <- add_grouping(RG, "eater", "BVlev", eater)  
Error in add_grouping(RG, "eater", "BVlev", eater) : 
  Length of 'newlevs' doesn't match # levels of 'BVlev'

I don't understand the error, because

length(BV)
[1] 18
 length(eater)
[1] 18
BV
 [1] "6 G" "5 G" "3 G" "4 G" "2 G" "1 G" "6 M" "5 M" "3 M" "4 M" "2 M" "1 M"
[13] "6 V" "5 V" "3 V" "4 V" "2 V" "1 V"
BVlev
 [1] "6 G" "5 G" "3 G" "4 G" "2 G" "1 G" "6 M" "5 M" "3 M" "4 M" "2 M" "1 M"
[13] "6 V" "5 V" "3 V" "4 V" "2 V" "1 V"

Eventually, I want to do emmeans(RG_add, ~ Variety|eater)


Solution

  • The add_grouping() function currently requires a single nested factor. So you need to create that one factor. That may be done using the levels<- method:

    library(emmeans)
    MOats.lm = lm(yield ~ Block + Variety, data = MOats)
    
    RG = ref_grid(MOats.lm)
    RG
    ## 'emmGrid' object with variables:
    ##     Block = VI, V, III, IV, II, I
    ##     Variety = Golden Rain, Marvellous, Victory
    ##     rep.meas = multivariate response levels: 0, 0.2, 0.4, 0.6
    
    BVlev = do.call(paste, expand.grid(c(6, 5, 3, 4, 2, 1), c("G", "M", "V")))
    
    levels(RG) = list(BV = BVlev, rep.meas = c(0, 0.2, 0.4, 0.6))
    RG
    ## 'emmGrid' object with variables:
    ##     BV = 6 G, 5 G, 3 G, 4 G, 2 G, 1 G, 6 M, 5 M, 3 M, 4 M, 2 M, 1 M, 6 V, 5 V, 3 V, 4 V, 2 V, 1 V
    ##     rep.meas = multivariate response levels: 0.0, 0.2, 0.4, 0.6
    

    Created on 2021-08-17 by the reprex package (v2.0.0)

    Now you can proceed with add_grouping(RG, "eater", "VB", eaters). The length of eaters must be 18, such that each element specifies which eater is associated with each level of BV.

    When replacing the levels, you need to be careful to preserve the relative order of the factors in the list of levels. The factors being combined need to be consecutive.