I want to fit a GLM using statsmodels in Python. statsmodels.glm supports the following distributions:
I want to find the same distributions in the scipy library. I think these are the same:
| statsmodels (docs) | scipy (docs) |
|---|---|
| Binomial | binom |
| Gamma | gamma |
| Gaussian | norm |
| InverseGaussian | invgauss |
| NegativeBinomial | nbinom |
| Poisson | poisson |
| Tweedie | tweedie |
I am not entirely sure whether "expon" is exactly the same as "Family". How could I check whether this is the case? I also think it is wise to check all of them, because I am determining the best-fitting distribution with scipy but then applying the statsmodels GLM.
Update: I thought "Family" was a distribution (in my defense, the table is quite unclear).
Family is the parent class and not a specific distribution family.
The families for GLM in statsmodels correspond to distributions in scipy.stats (except maybe for Tweedie). However, the parameterization of the GLM families differs from the common parameterization of the distributions, such as the one used in scipy.
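To make the difference in parameterization concrete, here is a minimal sketch that uses only scipy. It assumes the usual GLM conventions: a Gamma family described by a mean mu and a dispersion phi (variance phi * mu^2), and a NegativeBinomial family described by a mean mu and a dispersion alpha (variance mu + alpha * mu^2). The numeric values are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical GLM-style parameters, chosen only for illustration.
mu = 3.0      # mean, as modelled by the GLM
phi = 0.5     # Gamma dispersion (the GLM "scale")
alpha = 0.8   # NegativeBinomial dispersion

# Gamma: scipy uses shape `a` and `scale`, not mean and dispersion.
gamma_dist = stats.gamma(a=1.0 / phi, scale=mu * phi)

# Negative binomial: scipy uses `n` and `p`, not mean and alpha.
n = 1.0 / alpha
p = n / (n + mu)
nbinom_dist = stats.nbinom(n, p)

# The scipy moments should reproduce the GLM mean and variance function.
print(gamma_dist.mean(), gamma_dist.var(), phi * mu**2)
print(nbinom_dist.mean(), nbinom_dist.var(), mu + alpha * mu**2)
```

Comparing the moments like this is one simple way to check that a given scipy distribution really matches the GLM family you have in mind.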
In the latest release, the GLM and most discrete models have a get_distribution method that returns an instance of the corresponding scipy or scipy-compatible distribution.
For example, some families have the same parameterization as the scipy distribution, while for others the GLM parameters need to be transformed to correspond to the scipy distribution.