Tags: r, regression, linear-regression, lme4, predict

How can I get all model estimates automatically in R?


I know I can calculate a model's estimates by hand, but I'm sure there's a way to get the estimates for all categorical levels automatically. Since I'm working with lmer models, maybe there's a suitable function for that. Note: I don't want to predict new data, I just want to get all the estimates automatically. (I've edited the post to make it easier to understand.)

  • an example:
> model <- lmer(Score ~ Proficiency_c * testType + (1|ID), data = myData, REML = F)

> summary(model)

Fixed effects:
                            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)                   4.8376     0.2803 156.9206  17.259  < 2e-16 ***
Proficiency_c                -1.3381     0.4405 156.9206  -3.038  0.00279 ** 
testTypeTestB                 0.2088     0.3269 126.0000   0.639  0.52421    
testTypeTestC                 0.4638     0.3269 126.0000   1.418  0.15853    
Proficiency_c:testTypeTestB   0.5008     0.5138 126.0000   0.975  0.33157    
Proficiency_c:testTypeTestC   0.2357     0.5138 126.0000   0.459  0.64727  
---

> contrasts(myData$testType)

      TestB TestC
TestA     0     0
TestB     1     0
TestC     0     1
  • 'by hand' (setting Proficiency_c, i.e. x1, to 1), I would:
## estimate for Test A (x2 = 0, x3 = 0):

y = b0 + b1x1 + b2x2 + b3x3 + b4(x1 * x2) + b5(x1 * x3)
y = b0 + b1 * 1 + 0 + 0 + 0 + 0
y = b0 + b1
y = 3.50

## estimate for Test B (x2 = 1, x3 = 0):

y = b0 + b1x1 + b2x2 + b3x3 + b4(x1 * x2) + b5(x1 * x3)
y = b0 + b1 * 1 + b2 * 1 + 0 + b4(1 * 1) + 0
y = b0 + b2 + (b1 + b4) * 1
y = 4.21

## estimate for Test C (x2 = 0, x3 = 1):

y = b0 + b1x1 + b2x2 + b3x3 + b4(x1 * x2) + b5(x1 * x3)
y = b0 + b1 * 1 + b2 * 0 + b3 * 1 + 0 + b5(1 * 1)
y = b0 + b3 + (b1 + b5) * 1
y = 4.20
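The arithmetic above can be scripted directly in base R by copying the fixed-effect estimates from the model summary and building the design rows implied by the contrast coding (a sketch; the coefficient values are taken from the summary(model) output shown earlier):

```r
# Fixed-effect estimates copied from summary(model) above
b <- c(b0 = 4.8376, b1 = -1.3381, b2 = 0.2088,
       b3 = 0.4638, b4 = 0.5008, b5 = 0.2357)

# One design row per test type, at Proficiency_c = 1.
# Columns: intercept, Proficiency_c, TestB, TestC,
#          Proficiency_c:TestB, Proficiency_c:TestC
X <- rbind(
  TestA = c(1, 1, 0, 0, 0, 0),
  TestB = c(1, 1, 1, 0, 1, 0),
  TestC = c(1, 1, 0, 1, 0, 1)
)

est <- drop(X %*% b)
round(est, 2)  # TestA ~ 3.50, TestB ~ 4.21, TestC ~ 4.20
```

This is exactly the matrix form of the by-hand substitutions: each row of X picks out which coefficients enter that level's estimate.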

edited question

  • I usually deal with people who don't know how to derive the model's estimates themselves, so I usually calculate them all 'by hand'. Is there a way to get all the 'y' estimates for each categorical level (as I did 'by hand' above) without doing it manually? Again, for now, I don't want to predict new values. Thanks in advance!

  • data:

 dput(myData)
structure(list(ID = c("p1", "p1", "p1", "p2", "p2", "p2", "p3", 
"p3", "p3", "p4", "p4", "p4", "p5", "p5", "p5", "p6", "p6", "p6", 
"p7", "p7", "p7", "p8", "p8", "p8", "p9", "p9", "p9", "p10", 
"p10", "p10", "p11", "p11", "p11", "p12", "p12", "p12", "p13", 
"p13", "p13", "p14", "p14", "p14", "p15", "p15", "p15", "p16", 
"p16", "p16", "p17", "p17", "p17", "p18", "p18", "p18", "p19", 
"p19", "p19", "p20", "p20", "p20", "p21", "p21", "p21", "p22", 
"p22", "p22", "p23", "p23", "p23", "p24", "p24", "p24", "p25", 
"p25", "p25", "p26", "p26", "p26", "p27", "p27", "p27", "p28", 
"p28", "p28", "p29", "p29", "p29", "p30", "p30", "p30", "p31", 
"p31", "p31", "p32", "p32", "p32", "p33", "p33", "p33", "p34", 
"p34", "p34", "p35", "p35", "p35", "p36", "p36", "p36", "p37", 
"p37", "p37", "p38", "p38", "p38", "p39", "p39", "p39", "p40", 
"p40", "p40", "p41", "p41", "p41", "p42", "p42", "p42", "p43", 
"p43", "p43", "p44", "p44", "p44", "p45", "p45", "p45", "p46", 
"p46", "p46", "p47", "p47", "p47", "p48", "p48", "p48", "p49", 
"p49", "p49", "p50", "p50", "p50", "p51", "p51", "p51", "p52", 
"p52", "p52", "p53", "p53", "p53", "p54", "p54", "p54", "p55", 
"p55", "p55", "p56", "p56", "p56", "p57", "p57", "p57", "p58", 
"p58", "p58", "p59", "p59", "p59", "p60", "p60", "p60", "p61", 
"p61", "p61", "p62", "p62", "p62", "p63", "p63", "p63"), Score = c(5.33, 
5.05, 5.15, 5.82, 2.29, 7.54, 4.46, 2.43, 1.53, 8.97, 7.69, 7.21, 
6.76, 8.41, 3.77, 3.33, 11.57, 7.69, 2.15, 3.84, 3.29, 3.36, 
6.66, 5.6, 4.23, 4.41, 3.07, 2.29, 4.9, 4.46, 3.22, 1.72, 2.08, 
4.47, 2.4, 2.54, 2.73, 6.57, 7.31, 4.46, 9.27, 4.31, 4.54, 6.32, 
8.97, 3.44, 4.68, 9.7, 2.15, 5.68, 5.26, 9.3, 5.68, 8.97, 4.65, 
4.13, 4.57, 11.22, 11.39, 7.52, 3.94, 4.47, 3.52, 5, 8, 5.81, 
2.96, 4.05, 2.22, 4.41, 5.64, 4.79, 2.43, 2.5, 4.16, 7.57, 9.21, 
2.59, 3.12, 3.84, 7.76, 8.77, 5.08, 7.81, 4.49, 2.17, 7.4, 5.81, 
4.9, 3.19, 3.2, 2.72, 3.67, 4.42, 3.57, 1.02, 4.42, 2.45, 5.88, 
7.84, 4.93, 9.61, 3.75, 1.8, 3.47, 0.65, 1.39, 2.9, 6.36, 2.77, 
2.67, 6.89, 6.74, 6.81, 1.94, 3.22, 3.12, 4.08, 5.31, 11.23, 
4.1, 4.28, 3.89, 2.98, 3.52, 3.64, 3.63, 5.08, 4.9, 6.66, 7.56, 
3.14, 5.26, 1.03, 4.58, 2.9, 2.5, 3.57, 4, 7.54, 3.5, 5.19, 2.56, 
2.38, 1.4, 3.97, 2, 8.69, 5.33, 6.42, 3.62, 2.59, 4.63, 4.85, 
6.87, 5.55, 3.14, 2.29, 4.68, 7.76, 3.53, 8.88, 3.44, 8, 5.15, 
6.77, 12.28, 6.25, 4.91, 7.01, 7.4, 5.21, 3, 4.87, 7.5, 5.47, 
8.97, 7.89, 7.54, 9.25, 7.24, 5.37, 6.41, 2.94, 5.47, 7.14, 5.4, 
5.06, 6.32), Proficiency_c = c(0.44, 0.44, 0.44, 0.69, 0.69, 
0.69, 1.24, 1.24, 1.24, -0.16, -0.16, -0.16, 1.14, 1.14, 1.14, 
0.69, 0.69, 0.69, -0.26, -0.26, -0.26, 0.94, 0.94, 0.94, -0.26, 
-0.26, -0.26, 1.04, 1.04, 1.04, 0.39, 0.39, 0.39, -0.06, -0.06, 
-0.06, -0.41, -0.41, -0.41, 0.54, 0.54, 0.54, -0.51, -0.51, -0.51, 
-0.81, -0.81, -0.81, 0.14, 0.14, 0.14, -0.31, -0.31, -0.31, 0.44, 
0.44, 0.44, -0.11, -0.11, -0.11, -0.21, -0.21, -0.21, -0.51, 
-0.51, -0.51, 0.24, 0.24, 0.24, 0.59, 0.59, 0.59, -0.21, -0.21, 
-0.21, -0.66, -0.66, -0.66, -0.06, -0.06, -0.06, -1.01, -1.01, 
-1.01, -0.26, -0.26, -0.26, 0.19, 0.19, 0.19, 0.84, 0.84, 0.84, 
-0.11, -0.11, -0.11, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.79, 
0.79, 0.79, 1.09, 1.09, 1.09, -0.76, -0.76, -0.76, 0.14, 0.14, 
0.14, 0.64, 0.64, 0.64, 0.49, 0.49, 0.49, -0.71, -0.71, -0.71, 
-0.31, -0.31, -0.31, -0.11, -0.11, -0.11, -0.61, -0.61, -0.61, 
0.19, 0.19, 0.19, -0.36, -0.36, -0.36, -0.31, -0.31, -0.31, -1.01, 
-1.01, -1.01, 1.19, 1.19, 1.19, -0.96, -0.96, -0.96, 0.99, 0.99, 
0.99, 0.74, 0.74, 0.74, 0.24, 0.24, 0.24, -0.06, -0.06, -0.06, 
-0.31, -0.31, -0.31, -0.66, -0.66, -0.66, -0.96, -0.96, -0.96, 
0.89, 0.89, 0.89, -0.96, -0.96, -0.96, -1.01, -1.01, -1.01, -0.66, 
-0.66, -0.66, -0.71, -0.71, -0.71, -0.36, -0.36, -0.36), testType = structure(c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("TestA", 
"TestB", "TestC"), class = "factor")), row.names = c(NA, -189L
), class = c("tbl_df", "tbl", "data.frame"))
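A predict()-based alternative can also produce these population-level estimates without any hand calculation. This sketch assumes the fitted model object and myData from the question are in scope; re.form = NA is the predict.merMod argument that drops the random effects so only the fixed effects are used:

```r
# One row per test type, at the same Proficiency_c value used by hand
nd <- data.frame(
  testType      = factor(c("TestA", "TestB", "TestC"),
                         levels = levels(myData$testType)),
  Proficiency_c = 1
)

# re.form = NA -> fixed-effects-only (population-level) predictions
cbind(nd, estimate = predict(model, newdata = nd, re.form = NA))
```

This gives the same point estimates as the by-hand approach, though unlike emmeans it does not return standard errors or confidence intervals.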

Solution

  • I'm not sure why you're calculating predictions at a reference proficiency of 1 (0 would be the default), but maybe you're looking for emmeans?

    library(emmeans)
    emmeans(model, ~testType, at = list(Proficiency_c=1))
    

    The at = argument is the way to specify in emmeans that we want to calculate marginal means with the non-focal parameters (Proficiency_c in this case) set to a value other than the default [typically the mean of a numeric covariate]. See vignette("basics", package = "emmeans") (emmeans has many high-quality vignettes). It's specified as a list because we may have multiple non-focal parameters to set.

    Results:

    NOTE: Results may be misleading due to involvement in interactions
     testType emmean    SE  df lower.CL upper.CL
     TestA      3.50 0.529 162     2.45     4.54
     TestB      4.21 0.529 162     3.16     5.25
     TestC      4.20 0.529 162     3.15     5.24
    
    Degrees-of-freedom method: kenward-roger 
    Confidence level used: 0.95 
    

    If you're looking for the estimated slope within each test type, use emtrends:

    emtrends(model, ~testType, "Proficiency_c")
     testType Proficiency_c.trend    SE  df lower.CL upper.CL
     TestA                 -1.338 0.448 162    -2.22  -0.4541
     TestB                 -0.837 0.448 162    -1.72   0.0467
     TestC                 -1.102 0.448 162    -1.99  -0.2185
    
    Degrees-of-freedom method: kenward-roger 
    Confidence level used: 0.95
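If you need the estimates in a plain data frame for reporting or further computation, both emmeans and emtrends results coerce cleanly (this assumes the model and emmeans calls shown above):

```r
library(emmeans)

# Coerce the emmGrid objects to data frames for downstream use
as.data.frame(emmeans(model, ~ testType, at = list(Proficiency_c = 1)))
as.data.frame(emtrends(model, ~ testType, var = "Proficiency_c"))
```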