I am using following code:
from statsmodels.stats.contingency_tables import cochrans_q
res = cochrans_q([[1,4,5],[9,6,8]])
print(res)
The output is:
df 2
pvalue 0.36787944117144245
statistic 2.0
However, output remains same for [[10,4,5],[9,6,8]]
, [[55,88,77],[99,46,88]]
etc.
The statsmodels documentation page is here and the Wikipedia page on Cochran's Q test is here.
Where is the problem and how can it be solved? Thanks for your help.
cochrans_q takes binary data and not counts.
In statsmodels the documentation is often not very explicit, but the expected behavior can be seen from the unit tests.
The following unit test shows how to convert frequency data to the format required by statsmodels.
def test_cochransq3():
# another example compared to SAS
# in frequency weight format
dt = [('A', 'S1'), ('B', 'S1'), ('C', 'S1'), ('count', int)]
dta = np.array([('F', 'F', 'F', 6),
('U', 'F', 'F', 2),
('F', 'F', 'U', 16),
('U', 'F', 'U', 4),
('F', 'U', 'F', 2),
('U', 'U', 'F', 6),
('F', 'U', 'U', 4),
('U', 'U', 'U', 6)], dt)
cases = np.array([[0, 0, 0],
[1, 0, 0],
[0, 0, 1],
[1, 0, 1],
[0, 1, 0],
[1, 1, 0],
[0, 1, 1],
[1, 1, 1]])
count = np.array([ 6, 2, 16, 4, 2, 6, 4, 6])
data = np.repeat(cases, count, 0)
res = cochrans_q(data)
assert_allclose([res.statistic, res.pvalue], [8.4706, 0.0145], atol=5e-5)