I am trying to determine whether today's data really differs from yesterday's across four categories.
My count data is:
data <- data.frame(yesterday = c(10741, 1575, 174, 2),
                   today = c(11987, 1705, 211, 2),
                   row.names = c("a", "b", "c", "unknown"))
> data
        yesterday today
a           10741 11987
b            1575  1705
c             174   211
unknown         2     2
So I ran a chi-square test from the stats package like this:
stats::chisq.test(x = data$yesterday, y = data$today)
and the result is:
Pearson's Chi-squared test
data: data$yesterday and data$today
X-squared = 12, df = 9, p-value = 0.2133
My issue is that I assumed this would be equivalent to:
stats::chisq.test(data)
But the result is completely different:
Pearson's Chi-squared test
data: data
X-squared = 1.3846, df = 3, p-value = 0.7092
So which is the proper way to use this test to compare two samples from the same data set?
The problem is that in the first call you pass two columns of a contingency table as x and y, whereas in that form chisq.test expects x and y to be factors, with one label per observation. It coerces both vectors to factors, so your counts are treated as category labels and cross-tabulated into a 4 x 4 table, which is where df = 9 comes from. The second call, where you supply the contingency table itself, is the correct one; it also matches the example in the documentation.
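To see the difference, it may help to look at the table each call actually tests. The following is a minimal sketch using the counts from the question; the names label_table, res_labels and res_counts are only illustrative, and suppressWarnings() is there because I would expect R to warn about the small expected counts in the "unknown" row.

```r
# Counts copied from the question.
data <- data.frame(yesterday = c(10741, 1575, 174, 2),
                   today = c(11987, 1705, 211, 2),
                   row.names = c("a", "b", "c", "unknown"))

# With two vectors, chisq.test() coerces each to a factor, so every count
# becomes a category label. This is the 4 x 4 table it ends up testing.
label_table <- table(data$yesterday, data$today)
print(label_table)

# First call from the question: tests the 4 x 4 table of labels.
res_labels <- suppressWarnings(chisq.test(data$yesterday, data$today))

# Second call, with the counts passed as a 4 x 2 contingency table.
res_counts <- suppressWarnings(chisq.test(as.matrix(data)))

# Degrees of freedom are (rows - 1) * (columns - 1) in both cases.
print(res_labels$parameter)  # (4 - 1) * (4 - 1) = 9
print(res_counts$parameter)  # (4 - 1) * (2 - 1) = 3
```

The first table has one observation per row and column, so its statistic says nothing about whether the category proportions changed between the two days.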