Search code examples
kalman-filterestimationpykalman

EM-Algorithm with Pykalman


I am trying to implement a simpel application of the Kalman Filter using Pykalman, but I am getting an error on the estimation step of the EM-Algorithm that comes with the Pykalman package.

It is a simple linear regression with time-varying coefficient, based on simulated data. The code below simulates the data and starts the kalman filter, but when I try to estimate the parameters based on the observations, using kf.em(Data), it returns the error: ValueError: object arrays are not supported.

Am I doing something wrong with pykalman?

Model and full code below. The error occurs on the last line of the code.

Model (small images)

Description of the problem

State-Space representation of the model

Full Code

import pandas as pd
import scipy as sp
import numpy as np

import matplotlib.pyplot as plt
import pylab as pl
from pykalman import KalmanFilter

# generates the data
Data = pd.DataFrame(columns=['NoiseAR','NoiseReg', 'x', 'beta', 'y'], index=range(1000))
Data['NoiseAR'] = np.random.normal(loc=0.0, scale=1.0, size=1000)
Data['NoiseReg'] = np.random.normal(loc=0.0, scale=1.0, size=1000)

for i in range(1000):
    if i==0:
        Data.loc[i, 'x'] = Data.loc[i, 'NoiseAR']
    else:
        Data.loc[i, 'x'] = 0.95*Data.loc[i-1, 'x'] + Data.loc[i, 'NoiseAR']

for i in range(1000):
    Data.loc[i, 'beta'] = np.sin(np.radians(i))

Data['y'] = Data['x']*Data['beta'] + Data['NoiseReg']

# set up the kalman filter
F = [1.]
H = Data['x'].values.reshape(1000,1,1)
Q = [2.]
R = [2.]

init_state_mean = [0.]
init_state_cov = [2.]

kf = KalmanFilter(
    transition_matrices=F, 
    observation_matrices=H, 
    transition_covariance=Q, 
    observation_covariance=R, 
    initial_state_mean=init_state_mean, 
    initial_state_covariance=init_state_cov, 
    em_vars=['transition_covariance', 'observation_covariance', 'initial_state_mean', 'initial_state_covariance']
)

# estimate the parameters from em_vars using the EM algorithm
kf = kf.em(Data['y'].values)

Solution

  • I figured it out! Data['y'].values is a numpy array with dtype=object. All I had to do is to change the type of the array to float using .astype(float). This has to be done with everything that goes into the kalman filter object of pykalman, so I also had to change the type of the H matrix.

    Hope this helps somebody in the future!

    Here is what the final working code looks like:

    import pandas as pd
    import scipy as sp
    import numpy as np
    import matplotlib.pyplot as plt
    import pylab as pl
    from pykalman import KalmanFilter
    
    Data = pd.DataFrame(columns=['NoiseAR','NoiseReg', 'x', 'beta', 'y'], index=range(1000))
    
    Data['NoiseAR'] = np.random.normal(loc=0.0, scale=1.0, size=1000)
    Data['NoiseReg'] = np.random.normal(loc=0.0, scale=1.0, size=1000)
    plt.plot(Data[['NoiseAR','NoiseReg']])
    plt.show()
    
    for i in range(1000):
        if i == 0:
            Data.loc[i, 'x'] = Data.loc[i, 'NoiseAR']
        else:
            Data.loc[i, 'x'] = 0.95 * Data.loc[i - 1, 'x'] + Data.loc[i, 'NoiseAR']
    
    plt.plot(Data['x'])
    plt.show()
    
    for i in range(1000):
        Data.loc[i, 'beta'] = np.sin(np.radians(i))
    
    plt.plot(Data['beta'])
    plt.show()
    
    Data['y'] = Data['x']*Data['beta'] + Data['NoiseReg']
    
    plt.plot(Data[['x', 'y']])
    plt.show()
    
    F = [1.]
    H = Data['x'].values.reshape(1000,1,1).astype(float)
    Q = [2.]
    R = [2.]
    
    init_state_mean = [0.]
    init_state_cov = [2.]
    
    kf = KalmanFilter(
        transition_matrices=F,
        observation_matrices=H,
        transition_covariance=Q,
        observation_covariance=R,
        initial_state_mean=init_state_mean,
        initial_state_covariance=init_state_cov,
        em_vars=['transition_covariance', 'observation_covariance', 'initial_state_mean', 'initial_state_covariance']
    )
    
    kf = kf.em(Data['y'].values.astype(float))
    
    filtered_state_estimates = kf.filter(Data['y'].values.astype(float))[0]
    smoothed_state_estimates = kf.smooth(Data['y'].values.astype(float))[0]
    
    pl.figure(figsize=(10, 6))
    lines_true = pl.plot(Data['beta'].values, linestyle='-', color='b')
    lines_filt = pl.plot(filtered_state_estimates, linestyle='--', color='g')
    lines_smooth = pl.plot(smoothed_state_estimates, linestyle='-.', color='r')
    pl.legend(
        (lines_true[0], lines_filt[0], lines_smooth[0]),
        ('true', 'filtered', 'smoothed')
    )
    pl.xlabel('time')
    pl.ylabel('state')
    
    pl.show()