I have completed example 5.3 from Design and Analysis of Experiments (Montgomery 2013, ISBN:9781118097939) in Excel and got the same results. However, when completing the ANOVA in R, I get different results. Although being familiar with python, I am still new to R, so I am not sure what I am doing wrong. I suspect it has something to do with the degrees of freedom calculated in R, since they differ significantly from the degrees of freedom calculated in Excel. Find attached the results as an image. Textbook results confirmed in Excel by myself. The question statement is attached here as well. Question statement.
My code in R was as follows
bottel_vul_data <- data.frame(A = rep(c(10, 12, 14), each = 2),
B = rep(c(25, 30), each = 12),
C = rep(c(200, 250), each = 6),
vul = c(-3, -1, 0, 1, 5, 4,
-1, 0, 2, 1, 7, 6,
-1, 0, 2, 3, 7, 9,
1, 1, 6, 5, 10, 11))
bottel_anova <- aov(vul ~ A * B * C, data = bottel_vul_data)
summary(bottel_anova)
which yielded
Df Sum Sq Mean Sq F value Pr(>F)
A 1 248.06 248.06 275.306 1.67e-11 ***
B 1 45.37 45.37 50.358 2.53e-06 ***
C 1 22.04 22.04 24.462 0.000146 ***
A:B 1 5.06 5.06 5.618 0.030675 *
A:C 1 0.56 0.56 0.624 0.441016
B:C 1 1.04 1.04 1.156 0.298229
A:B:C 1 0.06 0.06 0.069 0.795626
Residuals 16 14.42 0.90
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Although not part of the main question, I also have no clue as to what the Signif. codes mean. If anyone knows, that would be great for background knowledge.
The issue here is that R is interpreting variable A as if it is a numeric predictor, whereas in you text book example they have treated it as a factor (i.e., a categorical variable). Once you convert A to a factor in r you get the correct result (the same as the photo from your text book). There's many ways to do this but perhaps the easiest is:
bottel_anova <- aov(vul ~ factor(A) * B * C, data = bottel_vul_data)
summary(bottel_anova)
Df Sum Sq Mean Sq F value Pr(>F)
factor(A) 2 252.75 126.38 178.412 1.19e-09 ***
B 1 45.37 45.37 64.059 3.74e-06 ***
C 1 22.04 22.04 31.118 0.00012 ***
factor(A):B 2 5.25 2.63 3.706 0.05581 .
factor(A):C 2 0.58 0.29 0.412 0.67149
B:C 1 1.04 1.04 1.471 0.24859
factor(A):B:C 2 1.08 0.54 0.765 0.48687
Residuals 12 8.50 0.71
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
As to your signif. code question it is just a quick way of showing your p value in the final column (***
p < .001, **
p < .01, *
p < .05 etcetera).