Tags: python, glm, statsmodels, pymc3

Modified negative binomial GLM in Python


The pymc3 and statsmodels packages can fit negative binomial GLMs in Python of the form:

E(Y) = exp(beta_0 + Σ_i beta_i * X_i)

where the X_i are my predictor variables and Y is my dependent variable. Is there a way to force one of my variables (for example X_1) to have beta_1 = 1, so that the algorithm optimizes only the other coefficients? I am open to using either pymc3 or statsmodels. Thanks.


Solution

  • GLM and the count models in statsmodels.discrete include an optional keyword offset, which is exactly for this use case. It is added to the linear prediction part and therefore corresponds to an additional variable with a fixed coefficient equal to 1 (see the sketch after this answer).

    http://www.statsmodels.org/devel/generated/statsmodels.genmod.generalized_linear_model.GLM.html
    http://www.statsmodels.org/devel/generated/statsmodels.discrete.discrete_model.NegativeBinomial.html

    Aside: GLM with family NegativeBinomial takes the negative binomial dispersion parameter as fixed, while the discrete model NegativeBinomial estimates the dispersion parameter by MLE jointly with the mean parameters.

    Another aside: GLM has a fit_constrained method for linear or affine restrictions on the parameters. This works by transforming the design matrix and using offset for the constant part. In the simple case of a fixed parameter, as in the question, this reduces to using offset in the same way as described above (although fit_constrained has to go through the more costly general case); a second sketch of that route follows below.
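
A minimal sketch of the offset approach with statsmodels. The data and the variable names (x1, x2, x3, y) are made up for illustration; the point is only that the column whose coefficient should be fixed at 1 is passed as offset rather than as a column of exog.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated example data (hypothetical; replace with your own).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "x1": rng.normal(size=n),   # coefficient to be fixed at 1
    "x2": rng.normal(size=n),
    "x3": rng.normal(size=n),
})
# Counts generated with beta_1 = 1, beta_2 = 0.5, beta_3 = -0.3;
# a Poisson draw keeps the sketch simple.
mu = np.exp(0.2 + df["x1"] + 0.5 * df["x2"] - 0.3 * df["x3"])
df["y"] = rng.poisson(mu)

# x1 enters the linear predictor through offset with a fixed
# coefficient of 1; only the intercept, x2 and x3 are estimated.
exog = sm.add_constant(df[["x2", "x3"]])
model = sm.GLM(df["y"], exog,
               family=sm.families.NegativeBinomial(),
               offset=df["x1"])
result = model.fit()
print(result.summary())
```

The discrete-model counterpart accepts the same keyword, e.g. sm.NegativeBinomial(df["y"], exog, offset=df["x1"]).fit(), with the difference noted in the first aside that it also estimates the dispersion parameter by MLE.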
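And a sketch of the fit_constrained route from the second aside, continuing with the same df and imports as above. Here x1 is kept in the design matrix and its coefficient is pinned to 1 via a constraint string; this is not required for the simple fixed-coefficient case, but shows the more general mechanism.

```python
# Full design matrix including x1.
exog_full = sm.add_constant(df[["x1", "x2", "x3"]])
model_full = sm.GLM(df["y"], exog_full,
                    family=sm.families.NegativeBinomial())

# Constrain the coefficient on x1 to equal 1; internally statsmodels
# rewrites this as an offset, so it matches the offset fit above.
result_constrained = model_full.fit_constrained("x1 = 1")
print(result_constrained.summary())
```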