Python: How to save statsmodels results as image file?


I'm using statsmodels to compute OLS estimates. The results can be inspected in the console with print(model.summary()). I'd like to store that very same table as a .png file. Below is a snippet with a reproducible example.

import pandas as pd
import numpy as np
import matplotlib.dates as mdates
import statsmodels.api as sm

# Dataframe with some random numbers
np.random.seed(123)
rows = 10
df = pd.DataFrame(np.random.randint(90,110,size=(rows, 2)), columns=list('AB'))
# pd.datetime was removed in pandas 2.0; a date string works directly
datelist = pd.date_range('2017-01-01', periods=rows).tolist()
df['dates'] = datelist 
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)
print(df)

# OLS estimates using statsmodels.api
x = df['A']
y = df['B']

model = sm.OLS(y,sm.add_constant(x)).fit()

# Output
print(model.summary())

[Image: console output of print(model.summary())]

I've made some naive attempts using suggestions here, but I suspect I'm way off target:

import os
import sys

os.chdir('C:/images')
sys.stdout = open("model.png","w")
print(model.summary())
sys.stdout.close()

So far this only raises a very long error message.

Thank you for any suggestions!


Solution

  • This is a fairly unusual task, and the approach in your question cannot work as written. You are trying to put a string (which has no positions in any metric space) straight into an image (which is defined by absolute positions, at least for pixel-based formats such as PNG and JPEG).

    No matter what you do, you need some text-rendering engine!
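
    To illustrate why redirecting print() into a file named model.png cannot produce an image, here is a minimal sketch using only the standard library. The stand-in text and the helper names (is_png, print_to_file) are mine, not from the question; the point is only that the file extension does not change what gets written.

```python
import contextlib
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every valid PNG file


def is_png(path):
    """Return True if the file starts with the PNG signature."""
    with open(path, "rb") as fh:
        return fh.read(8) == PNG_SIGNATURE


def print_to_file(text, path):
    """Redirect print() into a file, much like the attempt in the question."""
    with open(path, "w") as fh, contextlib.redirect_stdout(fh):
        print(text)


# Stand-in for str(model.summary()); any text behaves the same way.
path = os.path.join(tempfile.mkdtemp(), "model.png")
print_to_file("OLS Regression Results (stand-in text)", path)
print(is_png(path))  # False: the file holds plain text despite its name
```

    An image viewer will refuse such a file, because nothing has turned the characters into pixels.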

    I tried Pillow, but the results are ugly, probably because its text rendering is quite limited and anti-aliasing applied as a post-processing step does not rescue anything. But maybe I did something wrong.

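    from PIL import Image, ImageDraw, ImageFont
    image = Image.new('RGB', (800, 400))
    draw = ImageDraw.Draw(image)
    font = ImageFont.truetype("arial.ttf", 16)  # the font file must be findable on your system
    draw.text((0, 0), str(model.summary()), font=font)
    image = image.convert('1') # bw
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
    # Note: for mode '1' images Pillow resizes with nearest-neighbour regardless
    # of the filter, which may explain why the anti-aliasing does not help here.
    image = image.resize((600, 300), Image.LANCZOS)
    image.save('output.png')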
    
    

    Since you use statsmodels, I assume you already have matplotlib installed, and it can be used for this too. Here is an approach that works quite well, although not perfectly (at first some lines were shifted and I did not know why; edit: the OP fixed this by using a monospace font):

    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(figsize=(16, 8))
    # statsmodels' summary() takes no print_fn argument; str() gives the table text
    summary = str(model.summary())
    ax.text(0.01, 0.05, summary, fontfamily='monospace', fontsize=12)
    ax.axis('off')
    plt.tight_layout()
    plt.savefig('output.png', dpi=300, bbox_inches='tight')
    

    Output:

    [Image: the summary table rendered by matplotlib in a monospace font]

    Edit: The OP improved the matplotlib approach by using a monospace font. I incorporated that here, and it is reflected in the output image.

    Take this as a demo and research Python's text-rendering options. The matplotlib approach can perhaps be improved, but you may need something like pycairo. There is some discussion of this on SO.
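
    One possible refinement of the matplotlib approach, as a sketch rather than a definitive solution: instead of guessing a figure size, start with a tiny figure and let bbox_inches='tight' crop the saved image to the text. The helper name save_text_as_png and its defaults are my own choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def save_text_as_png(text, path, fontsize=10, dpi=200):
    """Render a block of text to a PNG cropped tightly around the text."""
    # The figure itself is tiny; bbox_inches='tight' grows the saved area
    # to whatever the text actually occupies.
    fig = plt.figure(figsize=(0.01, 0.01))
    fig.text(0, 0, text, fontfamily="monospace", fontsize=fontsize)
    fig.savefig(path, dpi=dpi, bbox_inches="tight", pad_inches=0.1,
                facecolor="white")
    plt.close(fig)


# Usage with the fitted model from the question:
# save_text_as_png(str(model.summary()), "output.png")
```

    The monospace font matters because the summary table is aligned with spaces; a proportional font shifts the columns.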

    Remark: On my system your code does give those warnings!

    Edit: It seems you can ask statsmodels for a LaTeX representation of the summary. So I recommend using that: write it to a file and use subprocess to call pdflatex or something similar. matplotlib can render LaTeX too (I won't test it, as I'm currently on Windows), but in that case the text-to-window ratio again needs tuning, compared with a full LaTeX document in, say, A5 format.
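
    A rough sketch of that LaTeX route, under a few assumptions: the table text comes from model.summary().as_latex(), the generated tables use booktabs rules (so the wrapper loads that package), and pdflatex is on the PATH. The helper names write_tex and compile_pdf and the wrapper template are mine. Turning the resulting PDF into a PNG needs a further conversion tool, which I leave out here.

```python
import shutil
import subprocess
from pathlib import Path

# Minimal wrapper document. booktabs is loaded on the assumption that the
# statsmodels LaTeX tables use \toprule / \midrule / \bottomrule.
TEMPLATE = r"""\documentclass{article}
\usepackage{booktabs}
\pagestyle{empty}
\begin{document}
@TABLE@
\end{document}
"""


def write_tex(table_tex, tex_path):
    """Wrap a LaTeX table fragment in a minimal document and write it out."""
    tex_path = Path(tex_path).resolve()
    tex_path.write_text(TEMPLATE.replace("@TABLE@", table_tex), encoding="utf-8")
    return tex_path


def compile_pdf(tex_path):
    """Run pdflatex if it is installed; return the PDF path, else None."""
    if shutil.which("pdflatex") is None:
        return None
    tex_path = Path(tex_path).resolve()
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", tex_path.name],
        cwd=tex_path.parent, check=True, stdout=subprocess.DEVNULL,
    )
    return tex_path.with_suffix(".pdf")


# Usage with the fitted model from the question:
# tex_file = write_tex(model.summary().as_latex(), "summary.tex")
# pdf_file = compile_pdf(tex_file)  # None if pdflatex is not available
```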