Difference between predict and fittedvalues in statsmodels

I have a very basic question that I somehow cannot find a clear answer to.

Assuming I have a model:

import statsmodels.formula.api as smf
model = smf.ols(....).fit()

What is the difference between model.fittedvalues and model.predict?

Solution

model.predict is a method for predicting values, so you can provide it an unseen dataset:

import statsmodels.formula.api as smf
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100,2),columns=['X','Y'])

model = smf.ols('Y ~ X',data=df).fit()

model.predict(exog=pd.DataFrame({'X':[1,2,3]}))

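If you do not provide the exog argument, it returns predictions for the data stored on the model object (the data the model was fit on). Calling predict on the fitted results passes the estimated parameters to the underlying model's predict method, whose source code shows this fallback: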

def predict(self, params, exog=None):
        """
        Return linear predicted values from a design matrix.

        Parameters
        ----------
        params : array_like
            Parameters of a linear model.
        exog : array_like, optional
            Design / exogenous data. Model exog is used if None.

        Returns
        -------
        array_like
            An array of fitted values.

        Notes
        -----
        If the model has not yet been fit, params is not optional.
        """
        # JP: this does not look correct for GLMAR
        # SS: it needs its own predict method

        if exog is None:
            exog = self.exog

        return np.dot(exog, params)
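
To make the dot product concrete, here is a minimal sketch that builds the design matrix by hand for the 'Y ~ X' formula (a column of ones for the intercept, then X) and compares the result with model.predict. The seed, the variable names and the new X values are assumptions chosen for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)  # seed is arbitrary, only for reproducibility
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=["X", "Y"])
model = smf.ols("Y ~ X", data=df).fit()

new_data = pd.DataFrame({"X": [1, 2, 3]})

# Design matrix for 'Y ~ X': an intercept column of ones, then X
design = np.column_stack([np.ones(len(new_data)), new_data["X"]])

# Select the coefficients by name so the order matches the design matrix
params = model.params[["Intercept", "X"]].to_numpy()

manual = np.dot(design, params)       # same operation as in the source above
predicted = model.predict(new_data)   # formula handling builds the design matrix

print(np.allclose(manual, predicted))
```

With a formula-based model, predict applies the formula to the new DataFrame for you, so you only pass the raw X column rather than a hand-built design matrix.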

On the other hand, model.fittedvalues is a property that holds the fitted values for the data the model was trained on. Its values are the same as those returned by model.predict() with no arguments, for the reasons explained above.
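
A quick way to check this yourself is sketched below, assuming the same kind of random data as in the earlier example (the seed and variable names are illustrative). It compares the values only, since the container type returned by predict() (NumPy array or pandas Series) may differ between statsmodels versions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)  # seed is arbitrary, only for reproducibility
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=["X", "Y"])
model = smf.ols("Y ~ X", data=df).fit()

fitted = model.fittedvalues      # stored in-sample fitted values
in_sample = model.predict()      # no exog: falls back to the training data
explicit = model.predict(df)     # passing the training data explicitly

same = np.allclose(in_sample, fitted)
also_same = np.allclose(explicit, fitted)
print(same, also_same)
```

So fittedvalues is a convenience for in-sample predictions, while predict is what you reach for when you have new data.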

You can also look at the other methods and attributes available on this results type.