Is there a simple way to compute independent t-tests with an aov object, similar to how pairwise comparisons can be computed by pairs(emmeans(aov.object,~variable)
?
I'm trying to discern what the question really is. This answer is based on this guess:
Is it possible to use
emmeans()
to do all independent t tests comparing several samples, two at a time?
The answer to that question is "not exactly". I'll illustrate with the InsectSprays
data in the datasets package. We may fit this aov
model:
> fm1 <- aov(count ~ spray, data = InsectSprays)
Then use emmeans
to compare the sprays:
> library("emmeans")
> pairs(emmeans(fm1, "spray"), adjust = "none")
contrast estimate SE df t.ratio p.value
A - B -0.833 1.6 66 -0.520 0.6045
A - C 12.417 1.6 66 7.755 <.0001
A - D 9.583 1.6 66 5.985 <.0001
A - E 11.000 1.6 66 6.870 <.0001
A - F -2.167 1.6 66 -1.353 0.1806
B - C 13.250 1.6 66 8.276 <.0001
B - D 10.417 1.6 66 6.506 <.0001
B - E 11.833 1.6 66 7.391 <.0001
B - F -1.333 1.6 66 -0.833 0.4080
C - D -2.833 1.6 66 -1.770 0.0814
C - E -1.417 1.6 66 -0.885 0.3795
C - F -14.583 1.6 66 -9.108 <.0001
D - E 1.417 1.6 66 0.885 0.3795
D - F -11.750 1.6 66 -7.339 <.0001
E - F -13.167 1.6 66 -8.223 <.0001
This does perform t tests comparing two means at a time (adding adjust = "none"
specifies that there is no multiplicity adjustment, so each test is taken as-is with no reference to the other test). The values of the means being compared are exactly the means that would be compared if you had run t.test()
separately with each pair of samples. However, these results are not exactly the same as the t.test()
results would be, because the error variance in fm1
is estimated by pooling all 6 samples together, whereas t.test()
would use the variability in only the two samples used in a particular test. I'll show this explicitly for the A - B
comparison:
> t.test(count ~ spray, data = InsectSprays, subset = (spray %in% c("A","B")), var.equal = TRUE)
Two Sample t-test
data: count by spray
t = -0.45352, df = 22, p-value = 0.6546
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.643994 2.977327
sample estimates:
mean in group A mean in group B
14.50000 15.33333
Note that the t ratio is slightly different, and there are 22 d.f. rather than 66, because only the variances of the two samples were pooled.
It is possible to use a similar method to obtain all of the t tests exactly, under different assumptions that the variances are not all equal. This requires a model that does not specify equal variances, as aov()
does. Here is the same example using a gls()
model:
> library("nlme")
> fm2 <- gls(count ~ spray, data = InsectSprays, weights = varIdent(form = ~ 1 | spray))
> pairs(emmeans(fm2, "spray", df.method = "boot"), adjust = "none")
contrast estimate SE df t.ratio p.value
A - B -0.833 1.837 21.8 -0.454 0.6547
A - C 12.417 1.477 14.7 8.407 <.0001
A - D 9.583 1.542 16.7 6.214 <.0001
A - E 11.000 1.451 13.9 7.580 <.0001
A - F -2.167 2.252 20.5 -0.962 0.3473
B - C 13.250 1.358 15.5 9.754 <.0001
B - D 10.417 1.429 17.8 7.289 <.0001
B - E 11.833 1.330 14.5 8.894 <.0001
B - F -1.333 2.177 19.5 -0.613 0.5472
C - D -2.833 0.920 20.9 -3.078 0.0057
C - E -1.417 0.758 21.6 -1.868 0.0754
C - F -14.583 1.882 13.2 -7.748 <.0001
D - E 1.417 0.879 19.6 1.612 0.1229
D - F -11.750 1.934 14.5 -6.076 <.0001
E - F -13.167 1.862 12.7 -7.071 <.0001
Degrees-of-freedom method: satterthwaite
Note that the test of the A - B
comparison is identical to that of t.test()
, which by default takes the variances to be unequal:
> t.test(count ~ spray, data = InsectSprays, subset = (spray %in% c("A","B")))
Welch Two Sample t-test
data: count by spray
t = -0.45352, df = 21.784, p-value = 0.6547
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.646182 2.979515
sample estimates:
mean in group A mean in group B
14.50000 15.33333