Issue with using statsmodels.sandbox.regression.gmm.GMM

I want to estimate an interest rate process using GMM.

I based my code on this notebook: https://github.com/josef-pkt/misc/blob/master/notebooks/ex_gmm_gamma.ipynb

The following is my code.

import numpy as np
import pandas as pd
from statsmodels.sandbox.regression.gmm import GMM

cd = np.array([1.5, 1.5, 1.7, 2.2, 2.0, 1.8, 1.8, 2.2, 1.9, 1.6, 1.8, 2.2, 2.0, 1.5, 1.1, 1.5, 1.4, 1.7, 1.42, 1.9])
dcd = np.array([0, 0.2 ,0.5, -0.2, -0.2, 0, 0.4, -0.3, -0.3, 0.2, 0.4, -0.2, -0.5, -0.4, 0.4, -0.1, 0.3, -0.28, 0.48, 0.2])
inst = np.column_stack((np.ones(len(cd)), cd))

class gmm(GMM):
    def momcond(self, params):
        p0, p1, p2, p3 = params
        endog = self.endog
        exog = self.exog
        inst = self.instrument   

        error1 = endog - p0 - p1 * exog
        error2 = (endog - p0 - p1 * exog) ** 2 - p2 * (exog ** (2 * p3)) / 12
        error3 = (endog - p0 - p1 * exog) * inst[:,0]
        error4 = ((endog - p0 - p1 * exog) ** 2 - p2 * (exog ** (2 * p3)) / 12) * inst[:,1]
        g = np.column_stack((error1, error2, error3, error4))
        return g


beta0 = np.array([0.1, 0.1, 0.01, 1])

gmm(endog = dcd, exog = cd, instrument = inst, k_moms=4, k_params=4).fit(beta0)

But it raises the following error:

ValueError: shapes (80,) and (4,4) not aligned: 80 (dim 0) != 4 (dim 0)

How can I fix this problem?

Solution

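The shape problem arises because exog is stored as a 2-D column array of shape (20, 1), while endog and the indexed instrument are 1-D. Combining them broadcasts each error term to a (20, 20) array, so the four stacked moment conditions end up with 80 columns instead of 4. I added a squeeze to exog so that exog is also 1-D.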
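The broadcasting can be reproduced with plain NumPy, without statsmodels. This is a minimal sketch, assuming exog is held as a (20, 1) column while endog stays 1-D; the array values are placeholders, not the data from the question.

```python
import numpy as np

nobs = 20
endog = np.zeros(nobs)          # 1-D, shape (20,)
exog_col = np.ones((nobs, 1))   # 2-D column, shape (20, 1) (assumed layout)

# (20,) combined with (20, 1) broadcasts to (20, 20), not (20,)
resid_bad = endog - 0.1 - 0.1 * exog_col
print(resid_bad.shape)

# four such blocks stacked side by side give 80 columns instead of 4
g_bad = np.column_stack([resid_bad] * 4)
print(g_bad.shape)

# squeezing exog to 1-D keeps every moment a length-20 vector
resid_ok = endog - 0.1 - 0.1 * exog_col.squeeze()
g_ok = np.column_stack([resid_ok] * 4)
print(g_ok.shape)
```

The 80 in the error message is the column count of the stacked array, which GMM then tries to combine with the 4 x 4 weights matrix.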

The second problem is a typo in the instrument index for moment condition 3, which should use
error3 = (endog - p0 - p1 * exog) * inst[:,1]
Without this change, once the shape problem is fixed, the fit raises a LinAlgError because error1 and error3 are identical: inst[:,0] is the column of ones.
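Why identical moment conditions break the fit can be checked directly: duplicated columns make the matrix of moment conditions rank deficient. The sketch below uses the data from the question and the starting values as parameters. `moment_rank` is a hypothetical helper, and checking the rank of the moment matrix is only an illustration of the singularity, not the exact computation GMM performs internally.

```python
import numpy as np

cd = np.array([1.5, 1.5, 1.7, 2.2, 2.0, 1.8, 1.8, 2.2, 1.9, 1.6,
               1.8, 2.2, 2.0, 1.5, 1.1, 1.5, 1.4, 1.7, 1.42, 1.9])
dcd = np.array([0, 0.2, 0.5, -0.2, -0.2, 0, 0.4, -0.3, -0.3, 0.2,
                0.4, -0.2, -0.5, -0.4, 0.4, -0.1, 0.3, -0.28, 0.48, 0.2])
inst = np.column_stack((np.ones(len(cd)), cd))
p0, p1, p2, p3 = 0.1, 0.1, 0.01, 1.0   # starting values from the question

resid = dcd - p0 - p1 * cd
var_err = resid ** 2 - p2 * cd ** (2 * p3) / 12

def moment_rank(col):
    """Rank of the moment matrix when error3 uses inst[:, col] (hypothetical helper)."""
    g = np.column_stack((resid, var_err,
                         resid * inst[:, col], var_err * inst[:, 1]))
    return np.linalg.matrix_rank(g)

print(moment_rank(0))  # error3 multiplied by the column of ones
print(moment_rank(1))  # error3 multiplied by cd
```

With column 0, the third moment repeats the first, so the rank drops below 4 and a covariance matrix built from these moments cannot be inverted.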

It works for me after making these two changes, but I don't know whether the estimated parameters make sense in the application.

import numpy as np
from statsmodels.sandbox.regression.gmm import GMM

cd = np.array([1.5, 1.5, 1.7, 2.2, 2.0, 1.8, 1.8, 2.2, 1.9, 1.6, 1.8, 2.2, 2.0, 1.5, 1.1, 1.5, 1.4, 1.7, 1.42, 1.9])
dcd = np.array([0, 0.2 ,0.5, -0.2, -0.2, 0, 0.4, -0.3, -0.3, 0.2, 0.4, -0.2, -0.5, -0.4, 0.4, -0.1, 0.3, -0.28, 0.48, 0.2])
inst = np.column_stack((np.ones(len(cd)), cd))

class gmm(GMM):
    def momcond(self, params):
        p0, p1, p2, p3 = params
        endog = self.endog
        exog = self.exog.squeeze()
        inst = self.instrument   

        error1 = endog - p0 - p1 * exog
        error2 = (endog - p0 - p1 * exog) ** 2 - p2 * (exog ** (2 * p3)) / 12
        error3 = (endog - p0 - p1 * exog) * inst[:,1]
        error4 = ((endog - p0 - p1 * exog) ** 2 - p2 * (exog ** (2 * p3)) / 12) * inst[:,1]
        g = np.column_stack((error1, error2, error3, error4))
        return g


beta0 = np.array([0.1, 0.1, 0.01, 1])
res = gmm(endog = dcd, exog = cd, instrument = inst, k_moms=4, k_params=4).fit(beta0)

There is a bug in the GMM summary method: it relies on a list of parameter names that is incorrect and too short. If we override the parameter names, summary works:

res.model.exog_names[:] = 'p0 p1 p2 p3'.split()
print(res.summary())




                                gmm Results                                  
==============================================================================
Dep. Variable:                      y   Hansen J:                    1.487e-10
Model:                            gmm   Prob (Hansen J):                   nan
Method:                           GMM                                         
Date:                Wed, 14 Mar 2018                                         
Time:                        09:38:38                                         
No. Observations:                  20                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
p0             0.9890      0.243      4.078      0.000       0.514       1.464
p1            -0.5524      0.129     -4.281      0.000      -0.805      -0.299
p2             1.2224      0.940      1.300      0.193      -0.620       3.065
p3            -0.3376      0.641     -0.527      0.598      -1.593       0.918
==============================================================================

Extra

In the corrected version, the constant in the instrument is no longer used. It could be removed, or the moment conditions could be vectorized over the instruments as in the following. Note that I convert endog to a 2-D column array so that it matches the shape of exog and the instruments.

class gmm(GMM):
    def momcond(self, params):
        p0, p1, p2, p3 = params
        endog = self.endog[:, None]
        exog = self.exog
        inst = self.instrument   

        error3 = (endog - p0 - p1 * exog) * inst
        error4 = ((endog - p0 - p1 * exog) ** 2 - p2 * (exog ** (2 * p3)) / 12) * inst
        g = np.column_stack((error3, error4))
        return g


beta0 = np.array([0.1, 0.1, 0.01, 1])
res = gmm(endog = dcd, exog = cd, instrument = inst, k_moms=4, k_params=4).fit(beta0)
res.model.exog_names[:] = 'p0 p1 p2 p3'.split()
print(res.summary())
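To see what the vectorized version computes, the sketch below rebuilds the moments with plain NumPy. It assumes, as described above, that exog is a (20, 1) column, and it uses the starting values as parameters. Multiplying a (20, 1) error by the (20, 2) instrument array yields one column per instrument, so the result contains the same four moment conditions as the corrected class, only in a different order.

```python
import numpy as np

cd = np.array([1.5, 1.5, 1.7, 2.2, 2.0, 1.8, 1.8, 2.2, 1.9, 1.6,
               1.8, 2.2, 2.0, 1.5, 1.1, 1.5, 1.4, 1.7, 1.42, 1.9])
dcd = np.array([0, 0.2, 0.5, -0.2, -0.2, 0, 0.4, -0.3, -0.3, 0.2,
                0.4, -0.2, -0.5, -0.4, 0.4, -0.1, 0.3, -0.28, 0.48, 0.2])
inst = np.column_stack((np.ones(len(cd)), cd))
p0, p1, p2, p3 = 0.1, 0.1, 0.01, 1.0

endog = dcd[:, None]   # (20, 1), as in the vectorized momcond
exog = cd[:, None]     # (20, 1), assumed layout of self.exog

resid = endog - p0 - p1 * exog
var_err = resid ** 2 - p2 * exog ** (2 * p3) / 12
g_vec = np.column_stack((resid * inst, var_err * inst))

# the same moments written out one column at a time
r, v = resid.ravel(), var_err.ravel()
g_cols = np.column_stack((r, r * cd, v, v * cd))

print(g_vec.shape)
print(np.allclose(g_vec, g_cols))
```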

Debugging

We can check that the user-provided moment conditions have the correct shape by just creating the model instance and calling momcond:

mod = gmm(endog = dcd, exog = cd, instrument = inst, k_moms=4, k_params=4)
mod.momcond(beta0).shape
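The same check can be wrapped in a small helper that fails early with a readable message. `check_momcond_shape` is a hypothetical name and not part of statsmodels; it only assumes that the model object has a `momcond(params)` method returning an array. A stand-in class is used here so the sketch runs without statsmodels.

```python
import numpy as np

def check_momcond_shape(model, params, nobs, k_moms):
    """Raise ValueError unless model.momcond(params) has shape (nobs, k_moms)."""
    g = np.asarray(model.momcond(params))
    if g.shape != (nobs, k_moms):
        raise ValueError(
            f"momcond returned shape {g.shape}, expected {(nobs, k_moms)}"
        )
    return g.shape

class FakeModel:
    """Stand-in with a momcond method; not a statsmodels class."""
    def __init__(self, width):
        self.width = width

    def momcond(self, params):
        # returns a dummy moment array with the requested number of columns
        return np.zeros((20, self.width))

print(check_momcond_shape(FakeModel(4), None, nobs=20, k_moms=4))
try:
    check_momcond_shape(FakeModel(80), None, nobs=20, k_moms=4)
except ValueError as err:
    print(err)
```

With the model from this answer, the call would be check_momcond_shape(mod, beta0, nobs=len(dcd), k_moms=4).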