I have a dataset that looks like the one below (first few rows shown). CPA is an observed result from an experiment (treatment) on different advertising flights. Flights are nested within campaigns.
   campaign_uid  flight_uid  treatment         CPA
0  0C2o4hHDSN    0FBU5oULvg  control    -50.757370
1  0C2o4hHDSN    0FhOqhtsl9  control     10.963426
2  0C2o4hHDSN    0FwPGelRRX  exposed    -72.868952
3  0C5F8ZNKxc    0F0bYuxlmR  control     13.356081
4  0C5F8ZNKxc    0F2ESwZY22  control    141.030900
5  0C5F8ZNKxc    0F5rfAOVuO  exposed     11.200450
I fit a model like the following one:
model.fit('CPA ~ treatment', random=['1|campaign_uid'])
To my knowledge, this model simply says:
so one would just get one posterior for each such variable.
However, looking at the results below, I also get posteriors for the following variable: 1|campaign_uid_offset. What does it represent?
Code for fitting the model and the plot:
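from bambi import Model
import pymc3 as pm

metric = 'CPA'  # name of the response column, as in the formula above
model = Model(df)
results = model.fit('{} ~ treatment'.format(metric),
                    random=['1|campaign_uid'],
                    samples=1000)
# Plotting the result (raw PyMC3 trace, including transformed variables)
pm.traceplot(model.backend.trace)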
1|campaign_uid: These are the random intercepts for campaigns that you mentioned in your list of parameters.
1|campaign_uid_sd: This is the standard deviation of the aforementioned random campaign intercepts.
CPA_sd: This is the residual standard deviation. That is, your model can be written (in part) as CPA_ij ~ Normal(b0 + b1*treatment_ij + u_j, sigma^2), and CPA_sd represents the parameter sigma.
1|campaign_uid_offset: This is an alternative parameterization of the random intercepts. bambi uses this transformation internally to improve MCMC sampling efficiency. Normally this transformed parameter is hidden from the user: if you make the traceplot using results.plot() rather than pm.traceplot(model.backend.trace), these terms are hidden unless you specify transformed=True (it's False by default). They are also hidden by default from the results.summary() output. For more information about this transformation, see this nice blog post by Thomas Wiecki.
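
To make the relationship between the offsets and the random intercepts concrete, here is a minimal NumPy sketch of the usual "non-centered" parameterization, under the assumption that each random intercept is a standard-normal offset multiplied by the group standard deviation. It does not call bambi. All numeric values and the names n_campaigns, flights_per_campaign, campaign_sd and campaign_idx are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_campaigns = 4           # hypothetical number of campaigns
flights_per_campaign = 3  # hypothetical number of flights in each campaign

# Hypothetical parameter values, chosen only for illustration.
b0, b1 = 10.0, -5.0       # intercept and treatment effect
campaign_sd = 20.0        # plays the role of 1|campaign_uid_sd
sigma = 50.0              # plays the role of CPA_sd

# Non-centered parameterization: draw standard-normal offsets ...
offset = rng.standard_normal(n_campaigns)  # role of 1|campaign_uid_offset
# ... and scale them by the group sd to get the random intercepts u_j.
u = campaign_sd * offset                   # role of 1|campaign_uid

# Simulate CPA_ij ~ Normal(b0 + b1*treatment_ij + u_j, sigma^2).
campaign_idx = np.repeat(np.arange(n_campaigns), flights_per_campaign)
treatment = rng.integers(0, 2, size=campaign_idx.size)  # 0 = control, 1 = exposed
mu = b0 + b1 * treatment + u[campaign_idx]
cpa = rng.normal(mu, sigma)
```

In this sketch the sampler would work with offset, which has unit scale no matter how large campaign_sd is, and u is recovered deterministically from it. That is why both the offsets and the intercepts can show up in a raw trace.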