python, time-series, statsmodels, multivariate-time-series

How to use statsmodels' DynamicFactor method with exogenous variables?


I have a multivariate dynamic factor model with one common factor that I want to estimate with statsmodels.tsa.statespace.dynamic_factor.DynamicFactor.

The model looks as follows: Model formulation in LaTeX.*

As you can see, I am dealing with a t x 4 matrix of endogenous variables. Each of them has 6 exogenous variables of its own, which are not shared with the others. So the only thing the 4 time series have in common is the common factor.

My question is how to put this in code.

I have attempted the following:

from statsmodels.tsa.statespace.dynamic_factor import DynamicFactor

model = DynamicFactor(
    endog=y,  # nobs x 4
    exog=X,   # nobs x k_exog
    k_factors=1,
    factor_order=1,
    error_order=0,
    error_cov_type='diagonal',
)

But the results seem off. I know from the documentation that X should have the shape t x k_exog. I am wondering what k_exog should be in my case, and whether I can arrange my matrix so that y_1 only uses W_1, and so on.

*EDIT: in the model formulation, at one point the dependent variable is called 'NG' but it should be y. Apologies.


Solution

  • The DynamicFactor model assumes that every exog variable affects every endog variable. However, you can tell the model to hold certain parameters at fixed values rather than estimate them. Fixing to zero the coefficients that link each endog variable to the other variables' regressors gives you what you want.

    A simple example follows:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    
    # Simulate some data
    nobs = 100
    np.random.seed(1234)
    y = pd.DataFrame(np.random.normal(size=(nobs, 2)), columns=['y1', 'y2'])
    X_1 = pd.Series(np.random.normal(size=nobs), name='x1')
    X_2 = pd.Series(np.random.normal(size=nobs), name='x2')
    X = pd.concat([X_1, X_2], axis=1)
    
    # Construct the model
    mod = sm.tsa.DynamicFactor(y, exog=X, k_factors=1, factor_order=1)
    
    # You can print the parameter names if you need to determine the
    # names of the parameters that you need to set fixed to 0
    # print(mod.param_names)
    
    # Fix the applicable parameters with `fix_params`...
    with mod.fix_params({'beta.x2.y1': 0, 'beta.x1.y2': 0}):
        # ...and estimate the other parameters with `fit`
        res = mod.fit(disp=False)
    
    # Print the results
    print(res.summary())
    

    Which gives:

                                       Statespace Model Results                                  
    =============================================================================================
    Dep. Variable:                          ['y1', 'y2']   No. Observations:                  100
    Model:             DynamicFactor(factors=1, order=1)   Log Likelihood                -276.575
                                          + 2 regressors   AIC                            567.150
    Date:                               Fri, 12 May 2023   BIC                            585.386
    Time:                                       22:45:16   HQIC                           574.530
    Sample:                                            0                                         
                                                   - 100                                         
    Covariance Type:                                 opg                                         
    ===================================================================================
    Ljung-Box (L1) (Q):             0.01, 0.10   Jarque-Bera (JB):           5.22, 0.78
    Prob(Q):                        0.93, 0.76   Prob(JB):                   0.07, 0.68
    Heteroskedasticity (H):         2.11, 0.75   Skew:                     -0.56, -0.17
    Prob(H) (two-sided):            0.04, 0.41   Kurtosis:                   3.02, 3.26
                               Results for equation y1                            
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    loading.f1    -0.4278      0.891     -0.480      0.631      -2.174       1.318
    beta.x1       -0.0614      0.129     -0.478      0.633      -0.313       0.191
    beta.x2             0        nan        nan        nan         nan         nan
                               Results for equation y2                            
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    loading.f1     0.4487      0.888      0.505      0.613      -1.292       2.189
    beta.x1             0        nan        nan        nan         nan         nan
    beta.x2       -0.1626      0.113     -1.442      0.149      -0.384       0.058
                            Results for factor equation f1                        
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    L1.f1          0.1060      0.323      0.328      0.743      -0.527       0.739
                               Error covariance matrix                            
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    sigma2.y1      0.6124      0.766      0.800      0.424      -0.889       2.113
    sigma2.y2      0.9306      0.818      1.138      0.255      -0.672       2.533
    ==============================================================================
    
    Warnings:
    [1] Covariance matrix calculated using the outer product of gradients (complex-step).