I have been asked to see if there is a linear trend in 3 groups of data (5 points each) by using ANOVA and linear contrasts. The 3 groups represent data collected in 2010
, 2011
and 2012
. I want to use R for this procedure and I have tried both of the following:
contrasts(data$groups, how.many=1) <- contr.poly(3)
contrasts(data$groups) <- contr.poly(3)
Both ways seem to work fine but give slightly different answers in terms of their p-values. I have no idea which is correct and it is really tricky to find help for this on the web. I would like help figuring out what is the reasoning behind the different answers. I'm not sure if it has something to do with partitioning sums of squares or whatnot.
Both approaches differ with respect to whether a quadratic polynomial is used.
For illustration purposes, have a look at this example, both x
and y
are a factor with three levels.
x <- y <- gl(3, 2)
# [1] 1 1 2 2 3 3
# Levels: 1 2 3
The first approach creates a contrast matrix for a quadratic polynomial, i.e., with a linear (.L
) and a quadratic trend (.Q
). The 3
means: Create the 3 - 1
th polynomial.
contrasts(x) <- contr.poly(3)
# [1] 1 1 2 2 3 3
# attr(,"contrasts")
# .L .Q
# 1 -7.071068e-01 0.4082483
# 2 -7.850462e-17 -0.8164966
# 3 7.071068e-01 0.4082483
# Levels: 1 2 3
In contrast, the second approach results in a polynomial of first order (i.e., a linear trend only). This is due to the argument how.many = 1
. Hence, only 1
contrast is created.
contrasts(y, how.many = 1) <- contr.poly(3)
# [1] 1 1 2 2 3 3
# attr(,"contrasts")
# .L
# 1 -7.071068e-01
# 2 -7.850462e-17
# 3 7.071068e-01
# Levels: 1 2 3
If you're interested in the linear trend only, the second option seems more appropriate for you.