Search code examples
python-3.xdata-sciencestatsmodelsstate-space

Using the state intercept in statsmodels MLEModel class for known exogenous inputs


The state intercept c_t in the statsmodels statespace formulation appears to be a way to interject exogenous control variables into the statespace framework. However, I can't get it to work. The simplest possible example I could think of was a mean model with known offsets from a baseline, combined with random noise. In statsmodels statespace notation, it would be:

a_t = 0 * a_{t - 1} + sin(pi * t / 24) + 0 * eta_t,

y_t = 30 + 1 * a_t + e_t,

where t = 0, ..., 999 and e_t ~ N(0, 4). You can see below how I attempted to implement this:

# Python 3.6.3, Statsmodels 0.9.0
import numpy as np
from statsmodels.tsa.statespace.mlemodel import MLEModel

N = 1000
t = np.arange(N)
alpha = 2 * np.sin(np.pi * t / 24)
y = 30 + alpha + 2 * np.random.randn(N)

class Simple(MLEModel):
    start_params = [28, 2.2]
    param_names = ['int', 'sigma2_e']

    def __init__(self, endog, state_int):
        super().__init__(endog, k_states = 1)

        self.initialize_stationary()
        self.loglikelihood_burn = 100

        self['transition', 0, 0] = 0.0
        self['selection', 0, 0] = 0.0
        self['design', 0, 0] = 1.0

        self.state_intercept = np.reshape(state_int, (1, N))

    def update(self, params, **kwargs):
        params = super().update(params, **kwargs)

        self['obs_intercept', 0, 0] = params[0]
        self['obs_cov', 0, 0] = params[1]

my_Simple = Simple(y, alpha)
mle_results = my_Simple.fit(method = 'nm', maxiter = 1000)
mle_results.summary()

If the estimation takes into account the offsets, then I'll expect to get a variance estimate of around 4. If it ignores them however, it will be higher do to the variation from the sinusoid. As you'll see when you run it, it is indeed higher.

Any ideas here?


Solution

  • You can't use attribute-setting notation for state space system matrices with the MLEModel class, so your self.state_intercept call is failing. e.g. if you do:

    print(mle_results.filter_results.state_intercept)
    

    then you get:

    [[0.]]
    

    Instead, you should do:

    self['state_intercept'] = np.reshape(state_int, (1, N))
    

    and that gives a variance estimate around 4, as you expected.