Export statsmodels summary() to .png


I have trained a GLM as follows:

    fitGlm = smf.glm(listOfInModelFeatures, family=sm.families.Binomial(),
                     data=train, freq_weights=train['sampleWeight']).fit()

The results look good:

    print(fitGlm.summary())

                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                 Target   No. Observations:              1065046
Model:                            GLM   Df Residuals:               4361437.81
Model Family:                Binomial   Df Model:                            7
Link Function:                  Logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:            -6.0368e+05
Date:                Sun, 25 Aug 2024   Deviance:                   1.2074e+06
Time:                        09:03:54   Pearson chi2:                 4.12e+06
No. Iterations:                     8   Pseudo R-squ. (CS):             0.1716
Covariance Type:            nonrobust                                         
===========================================================================================
                              coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------------
Intercept                   3.2530      0.003   1074.036      0.000       3.247       3.259
feat1                       0.6477      0.004    176.500      0.000       0.641       0.655
feat2                       0.3939      0.006     71.224      0.000       0.383       0.405
feat3                       0.1990      0.007     28.294      0.000       0.185       0.213
feat4                       0.4932      0.009     54.614      0.000       0.476       0.511
feat5                       0.4477      0.005     90.323      0.000       0.438       0.457
feat6                       0.3031      0.005     57.572      0.000       0.293       0.313
feat7                       0.3711      0.004     87.419      0.000       0.363       0.379
===========================================================================================

I then tried to export the summary() output to a .png file, as suggested here:

Python: How to save statsmodels results as image file?

So, I have written this code:

    fig, ax = plt.subplots(figsize=(16, 8))
    summary = []
    fitGlm.summary(print_fn=lambda x: summary.append(x))
    summary = '\n'.join(summary)
    ax.text(0.01, 0.05, summary, fontfamily='monospace', fontsize=12)
    ax.axis('off')
    plt.tight_layout()
    plt.savefig('output.png', dpi=300, bbox_inches='tight')

But I get this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[57], line 57
     55 fig, ax = plt.subplots(figsize=(16, 8))
     56 summary = []
---> 57 fitGlm.summary(print_fn=lambda x: summary.append(x))
     58 summary = '\n'.join(summary)
     59 ax.text(0.01, 0.05, summary, fontfamily='monospace', fontsize=12)

TypeError: GLMResults.summary() got an unexpected keyword argument 'print_fn'

Looks like print_fn is not recognized by statsmodels?

Can someone help me, please?


Solution

  • I ran a quick test to see where print_fn can be used, and I also checked the answer in the linked question, but I could not find print_fn anywhere in the statsmodels documentation. As the TypeError shows, GLMResults.summary() does not accept it.

    Instead, I converted the summary to a table and rendered it with matplotlib in order to save it as a .png:

    from io import StringIO

    import matplotlib.pyplot as plt
    import pandas as pd

    # `model` is the fitted results object (fitGlm in the question).
    # Convert one summary table to a pandas DataFrame;
    # change tables[0] to tables[1] to get the second table.
    # header=0 uses the first row as column labels, index_col=0 the first column as row labels.
    # Wrapping the HTML in StringIO avoids the warning newer pandas versions raise for literal strings.
    html = model.summary().tables[0].as_html()
    summary_df = pd.read_html(StringIO(html), header=0, index_col=0)[0]

    # Create a new figure and remove the axes
    fig, ax = plt.subplots()
    ax.axis('off')

    # Add the table to the figure, keeping the first column as row labels
    table = ax.table(
        cellText=summary_df.astype(str).values.tolist(),
        rowLabels=[str(label) for label in summary_df.index],
        colLabels=[str(col) for col in summary_df.columns],
        loc='center',
    )

    # Use a fixed font size and make the rows taller
    table.auto_set_font_size(False)
    table.set_fontsize(10)
    table.scale(1, 1.5)

    # Save the figure as a PNG file
    fig.savefig('summary2.png', dpi=300, bbox_inches='tight')
    

    In my opinion, saving data to a .png is very unusual: readers cannot copy or search the values in an image, which makes the results harder to share. There are also options to export the summary to CSV and LaTeX. If you are doing this manually, I would suggest exporting to CSV and pasting it as an image, or even saving it as text and taking a screenshot.

    For reference:

    csv_text = model.summary().as_csv()

    # save as csv
    with open('summary.csv', 'w') as file:
        file.write(csv_text)
    

    or

    text = model.summary().as_text()
    
    # save to txt
    with open('summary.txt', 'w') as file:
        file.write(text)
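
If a .png is still required, one option that stays close to the code in the question is to take the summary as plain text (as_text(), as used above) and draw that string with matplotlib. The following is only a minimal sketch under stated assumptions, not a statsmodels feature: text_figure_size and save_text_as_png are hypothetical helper names, and the figure size is a rough estimate that assumes a monospace glyph is about 0.6 times as wide as the font size.

```python
def text_figure_size(text, fontsize=12, char_aspect=0.6, line_spacing=1.2):
    """Estimate the (width, height) in inches needed to draw monospace text.

    char_aspect and line_spacing are rough assumptions about the font,
    not values taken from matplotlib.
    """
    lines = text.splitlines() or [""]
    longest = max(1, max(len(line) for line in lines))
    width = longest * fontsize * char_aspect / 72   # 72 points per inch
    height = len(lines) * fontsize * line_spacing / 72
    return width, height


def save_text_as_png(text, path, fontsize=12, dpi=300):
    """Draw a block of text on an empty figure and save it as an image."""
    # Imported here so the size helper above works without matplotlib.
    import matplotlib.pyplot as plt

    fig = plt.figure(figsize=text_figure_size(text, fontsize))
    # Anchor the text at the top-left corner of the figure.
    fig.text(0, 1, text, fontfamily="monospace", fontsize=fontsize,
             va="top", ha="left")
    # bbox_inches="tight" crops the image to the drawn text.
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)
```

With the fitted model from the question, the call would be save_text_as_png(fitGlm.summary().as_text(), 'output.png'). Passing the text directly avoids the print_fn argument, which GLMResults.summary() does not accept.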