python numpy regression linear-regression statsmodels

How to get a quick predicted value from an OLS model?


How do I get a quick predicted value from my OLS model? For example:

import statsmodels.formula.api as sm

model = sm.ols(formula="price ~ size + year", data=df_c).fit()

model.predict([25,1990]) #(should return predicted price value)

How do I get a predicted value when I run model.predict([25,1990]) where 25 is the size and 1990 is the year?

EDIT:

The error I get is 'PatsyError: predict requires that you use a DataFrame when predicting from a model that was created using the formula api.

The original error message returned by patsy is: Error evaluating factor: TypeError: list indices must be integers or slices, not str'

Is there a way to just run the simple code model.predict([25,1990])?

Thank you in advance!


Solution

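  • You cannot do this with the code you have given because you're using statsmodels.formula.api. A model fitted from a formula needs the new values labelled with the variable names used in the formula (size and year), and a plain list carries no names, which is what the PatsyError is complaining about. The simplest solution I can provide is to pass a quick dictionary: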

    import statsmodels.formula.api as sm
    import pandas as pd
    import numpy as np
    
    # Mock data so the example runs on its own (random, so results vary)
    df_c = pd.DataFrame(np.random.randn(10, 3), columns=['price', 'size', 'year'])
    model = sm.ols(formula='price ~ size + year', data=df_c).fit()
    
    # Pass the predictors by name; [0] picks out the single predicted value
    model.predict({'size': 25, 'year': 1990})[0]
    
    -165.2345445772976
    

    I created a mock DataFrame to show that it works, but all you need is the last line: model.predict({'size':25,'year':1990})[0]. Because the mock data is random, the number you get will differ from the one shown.