Implementing Kalman filter for state space model of movement process

Is it possible to implement a model like the one presented in Bayesian Filtering and Smoothing, example 3.6 in statsmodels?

I can follow along with the Matlab code provided, but I'm not sure if and how this kind of model can be implemented in statsmodels.

The example involves tracking the position of an object in 2D space. The state is four-dimensional x=(x_1, x_2, x_3, x_4), but I've re-arranged the vector so that (x_1, x_3) represent position and (x_2, x_4) represent velocity in the two directions. Simulated data of the process consist of 100 position observations, arranged in a 2x100 matrix Y.

import numpy as np
from scipy import linalg

# The matrices in the dynamic model are set up as follows
q, dt, s = 1, 0.1, 0.5
A = np.array([[1, dt, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, dt],
              [0, 0, 0, 1]])
Q = q * np.array([[dt ** 3 / 3, dt ** 2 / 2, 0, 0],
                  [dt ** 2 / 2, dt, 0, 0],
                  [0, 0, dt ** 3 / 3, dt ** 2 / 2],
                  [0, 0, dt ** 2 / 2, dt]])
# Matrices in the measurement model are designed as follows
H = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0]])
R = s ** 2 * np.eye(2)
# Starting values
m0 = np.array([[0, 1, 0, -1]]).T  # column vector
P0 = np.eye(4)

The Kalman filter for the process is then implemented as follows:

n = 100
m = m0
P = P0
kf_m = np.zeros((m.shape[0], n))
kf_P = np.zeros((P.shape[0], P.shape[1], n))
for k in range(n):
    m = A @ m
    P = A @ P @ A.T + Q
    S = H @ P @ H.T + R
    K = linalg.lstsq(S.T, (P @ H.T).T)[0].T
    m = m + K @ (Y[:, k, np.newaxis] - H @ m)
    P = P - K @ S @ K.T

    kf_m[:, k] = m.flatten()
    kf_P[:, :, k] = P

How could this filter be implemented in statsmodels, if at all possible? statsmodels may run more efficiently if the data are much larger, and one could implement a smoother on the filter within a subclass.

Solution

Yes, you can do this; the main thing is to map your notation to the notation / variable names used by Statsmodels.

Here is an example of how you might do this:

import numpy as np
import pandas as pd  # Pandas isn't necessary, but makes some output nicer
import statsmodels.api as sm

# The matrices in the dynamic model are set up as follows
q, dt, s = 1, 0.1, 0.5
A = np.array([[1, dt, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, dt],
              [0, 0, 0, 1]])
Q = q * np.array([[dt ** 3 / 3, dt ** 2 / 2, 0, 0],
                  [dt ** 2 / 2, dt, 0, 0],
                  [0, 0, dt ** 3 / 3, dt ** 2 / 2],
                  [0, 0, dt ** 2 / 2, dt]])
# Matrices in the measurement model are designed as follows
H = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0]])
R = s ** 2 * np.eye(2)
# Starting values
m0 = np.array([[0, 1, 0, -1]]).T  # column vector
P0 = np.eye(4)


# Now instantiate a statespace model with the data
# (data should be shaped nobs x n_variables))
kf = sm.tsa.statespace.MLEModel(pd.DataFrame(Y.T), k_states=4)
kf._state_names = ['x1', 'dx1/dt', 'x2', 'dx2/dt']
kf['design'] = H
kf['obs_cov'] = R
kf['transition'] = A
kf['selection'] = np.eye(4)
kf['state_cov'] = Q

# Edit: the timing convention for initialization
# in Statsmodels differs from the the in the question
# So we should not use kf.initialize_known(m0[:, 0], P0)
# But instead, to fit the question's initialization
# into Statsmodels' timing, we just need to use the
# transition equation to move the initialization
# forward, as follows:
kf.initialize_known(A @ m0[:, 0], A @ P0 @ A.T + Q)

# To performan Kalman filtering and smoothing, use:
res = kf.smooth([])

# Then, for example, to print the smoothed estimates of
# the state vector:
print(res.states.smoothed)