I'm trying to fit a Tweedie model with statsmodels and keep getting the following error:
AttributeError: 'Tweedie' object has no attribute 'ndim'
formula = 'pure_premium ~ atfault_model + channel_model_DIR + channel_model_IA + CLded_model + credit_model_52778 + \
credit_model_c6 + package_model_Elite + package_model_LBO + package_model_Plus + package_model_Savers + \
package_model_Savers_Plus + Q("ds_fp_paid_in_full_eligiable-has discount") + ds_fp_paid_in_full_ineligable + \
Q("ds_pn_prior_insurance_eligable-has discount") + ds_pn_prior_insurance_ineligable + \
Q("ds_ip_advanced_purchase_eligiable-has discount") + ds_ip_advanced_purchase_ineligable + \
credit_model_c5 + ds_ad_affinity + ds_ak_alliance + \
ds_ly_loyalty_discount + ds_mo_multipolicy + ds_pf_performance + majorvio_model + \
(driver_age_model*marital_status_model) + minorvio_model + multi_unit_model + \
RATING_CLASS_CODE_MODEL + unit_drv_exp_model + Vintiles + safety_course_model + instructor_course_model + \
(class_model*v_age_model) + (class_model*cc_model) + state_model'
lost_cost_model = smf.ols(formula = formula, data = coll_df
, family = sm.families.Tweedie(link = sm.families.links.log, var_power = 1.5))
Every variable is either a categorical, float or int.
I'm not sure what is causing this.
ols does not take a family; OLS is just linear regression.
You need to use the generalized linear model instead, i.e. GLM, or glm for the formula interface.
GLM includes several families in the one-parameter exponential family and a selection of link functions.
Several other models are equivalent to GLM but based on a different implementation and with other options. Those models are written for the specific family-link combinations and do not have an option to change those.
OLS is GLM with Gaussian family and identity link.
Logit is GLM with Binomial family and logit link, and only for binary response variables.
Probit is GLM with Binomial family and probit link, and only for binary response variables.
Poisson is GLM with Poisson family and log link.
NegativeBinomial is a more general version of GLM with NegativeBinomial family and log link. discrete.NegativeBinomial allows for several parameterizations of the implied variance function and estimates the dispersion parameter jointly with the mean parameters by MLE.