I am looking for a way to save the results of tukeyhsd into a pandas DataFrame. See below:
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi
mcDate = multi.MultiComparison(df['Glucose'], df['Date'])
Results = mcDate.tukeyhsd()
Multiple Comparison of Means - Tukey HSD,FWER=0.05
group1 group2 meandiff lower upper reject
A B 20.35 7.388 33.312 True
A C -3.85 -16.812 9.112 False
B C -24.2 -37.162 -11.238 True
I do not have access to your data, so I can't replicate the result. I used randomised data instead, just to show that this works. All you need to add to your code is the pandas import, and the last line creating the data frame.
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi
import pandas as pd
import numpy as np
# Random Data.
x = np.random.choice(['A','B','C'], 50)
y = np.random.rand(50)
# Tukey HSD test.
mcDate = multi.MultiComparison(y,x)
Results = mcDate.tukeyhsd()
Printing Results produces the following table:
group1 group2 meandiff lower upper reject
A B 0.1506 -0.07 0.3712 False
A C 0.1105 -0.1278 0.3487 False
B C -0.0401 -0.2865 0.2063 False
And this is how you get the data frame (the first row of the table data holds the column names, the remaining rows hold the values):
df = pd.DataFrame(data=Results._results_table.data[1:], columns=Results._results_table.data[0])
group1 group2 meandiff lower upper reject
0 A B 0.1506 -0.0700 0.3712 False
1 A C 0.1105 -0.1278 0.3487 False
2 B C -0.0401 -0.2865 0.2063 False
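Since _results_table starts with an underscore, it is a private attribute and may change between statsmodels versions. As a hedged alternative, the sketch below builds the same data frame through the public summary() method, which returns a table object with the same .data layout (header row first). The variable names (rng, groups, values, tukey_df) are made up for illustration, and the data is random again. Depending on your statsmodels version, the table may contain extra columns such as p-adj.

```python
import numpy as np
import pandas as pd
import statsmodels.stats.multicomp as multi

# Random illustrative data (seeded so the run is repeatable).
rng = np.random.default_rng(0)
groups = rng.choice(['A', 'B', 'C'], 50)
values = rng.random(50)

# Run the Tukey HSD test.
results = multi.MultiComparison(values, groups).tukeyhsd()

# summary() returns a table whose .data is a list of rows,
# with the column names in the first row.
header, *rows = results.summary().data
tukey_df = pd.DataFrame(rows, columns=header)

print(tukey_df)
```

Going through summary() keeps the code on the documented interface, at the cost of one extra call.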
I struggled with this for a while myself, and eventually found the solution by reviewing methods for the object, like this:
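One generic way to do this kind of inspection, not necessarily the exact call meant above, is Python's built-in dir(). The sketch below lists the public names on the result object. The data is random and the variable names (rng, results, public_names) are placeholders for illustration.

```python
import numpy as np
import statsmodels.stats.multicomp as multi

# Random illustrative data (seeded so the run is repeatable).
rng = np.random.default_rng(0)
groups = rng.choice(['A', 'B', 'C'], 50)
values = rng.random(50)

results = multi.MultiComparison(values, groups).tukeyhsd()

# List the attributes and methods that do not start with an underscore.
public_names = [name for name in dir(results) if not name.startswith('_')]
print(public_names)

# Private names (leading underscore) can be listed the same way.
private_names = [name for name in dir(results)
                 if name.startswith('_') and not name.startswith('__')]
print(private_names)
```

Names such as summary, meandiffs, confint and reject show up in the public list, and these are good candidates to try before reaching for a private attribute.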