Search code examples
rregressionlinear-regression

Are regression coefficients same as effect size?


I am trying to implement and analyse a full factorial experiment in R but I don't understand why the results presented in the book are different. Here are the problem details:

enter image description here

I tried to use the least square model to estimate the effects of different factors such as gap, power and flow rate but the effect sizes mentioned in the book are completely different :

enter image description here

My implementation of the problem in R and the results are as follows:

et_rate = c(550, 669, 633, 642, 1037, 749, 1075, 729,
            604, 650, 601, 635, 1052, 868, 1063, 860)

gap = factor(rep(1:2, times = 8))

flw_rate = factor(rep(1:2, each = 2, times = 4))

pwr = factor(rep(1:2, each = 4, times= 2))

df <- data.frame(gap, flw_rate, pwr, et_rate)


md3 <- lm(et_rate ~ .^3, data = df)
summary(md3)

And my results are:

Call:
lm(formula = et_rate ~ .^3, data = df)

Residuals:
   Min     1Q Median     3Q    Max 
-65.50 -11.12   0.00  11.12  65.50 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)           577.00      33.56  17.193 1.33e-07 ***
gap2                   82.50      47.46   1.738  0.12036    
flw_rate2              40.00      47.46   0.843  0.42382    
pwr2                  467.50      47.46   9.850 9.50e-06 ***
gap2:flw_rate2        -61.00      67.12  -0.909  0.39000    
gap2:pwr2            -318.50      67.12  -4.745  0.00145 ** 
flw_rate2:pwr2        -15.50      67.12  -0.231  0.82317    
gap2:flw_rate2:pwr2    22.50      94.92   0.237  0.81859    
---
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 47.46 on 8 degrees of freedom
Multiple R-squared:  0.9661,    Adjusted R-squared:  0.9364 
F-statistic: 32.56 on 7 and 8 DF,  p-value: 2.896e-05

Show in New WindowClear OutputExpand/Collapse Output

Call:
lm(formula = et_rate ~ gap * pwr, data = df)

Coefficients:
(Intercept)         gap2         pwr2    gap2:pwr2  
      597.0         52.0        459.7       -307.2  

I was expecting the coefficients of my model to be equal to the effect size estimate but they are completely different than mentioned in the solution in the book. Am I mistaken in my approach to get the effect sizes?


Solution

  • 1) Using Helmert contrasts as in @Roland's comment:

    options(contrasts = c(unordered = "contr.helmert", ordered = "contr.poly"))
    
    md3 <- lm(et_rate ~ .^3, df)
    2 * coef(md3)
    

    giving the effects shown in the question as:

        (Intercept)                gap1           flw_rate1                pwr1 
           1552.125            -101.625               7.375             306.125 
     gap1:flw_rate1           gap1:pwr1      flw_rate1:pwr1 gap1:flw_rate1:pwr1 
            -24.875            -153.625              -2.125               5.625 
    

    2) Using md3 from above, this also gives the effects shown in the question:

    mm <- model.matrix(md3)
    crossprod(mm, df$et_rate) / 8
    

    giving:

                            [,1]
    (Intercept)         1552.125
    gap1                -101.625
    flw_rate1              7.375
    pwr1                 306.125
    gap1:flw_rate1       -24.875
    gap1:pwr1           -153.625
    flw_rate1:pwr1        -2.125
    gap1:flw_rate1:pwr1    5.625
    

    3) This gives the coded factors table shown in the question:

    coded <- mm[1:8, 2:4]
    coded
    

    giving:

      gap1 flw_rate1 pwr1
    1   -1        -1   -1
    2    1        -1   -1
    3   -1         1   -1
    4    1         1   -1
    5   -1        -1    1
    6    1        -1    1
    7   -1         1    1
    8    1         1    1
    

    coded could also be obtained using the following where the indexing picks out the main effect columns:

    H2 <- matrix(c(-1, -1, -1, 1), 2)
    kronecker(kronecker(H2, H2), H2)[, c(2:3, 5)]
    

    The Total column in the question sums the two replicates:

    Total <- rowSums(matrix(df$et_rate, 8))
    Total
    ## [1] 1154 1319 1234 1277 2089 1617 2138 1589
    

    and in terms of Total and coded we can get the effects:

    coef(lm(Total ~ .^3, as.data.frame(coded)))
    ##     (Intercept)                gap1           flw_rate1                pwr1 
     ##       1552.125            -101.625               7.375             306.125 
     ## gap1:flw_rate1           gap1:pwr1      flw_rate1:pwr1 gap1:flw_rate1:pwr1 
     ##        -24.875            -153.625              -2.125               5.625