I'm teaching myself PyMC but got stuck with the following problem:
I have a model whose parameters should be determined from successive measurements. In the beginning the parameter's prior is uninformative, but should be updated after each measurement (i.e. replaced by the posterior). In short, I want to do sequential updating with PyMC.
Consider the following (somewhat constructed) example:
Of course, this can be solved analytically with beta/binomial conjugate priors, but this is not the point here :)
Alternatively, both measurements could be combined to n=15 and k=12. However, this is too simple. I want to take the hard way for educational purposes.
I found a solution in this answer, where new priors are sampled from the posterior. This is almost what I want, but sampling the prior feels a bit messy because the results depends on the number of samples and other settings.
My attempted solution puts both measurement and priors separately in one model, like this:
n1, k1 = 10, 9
n2, k2 = 5, 3
theta1 = pymc.Beta('theta', alpha=1, beta=1)
outcome1 = pymc.Binomial('outcome1', n=n1, p=theta1, value=k1, observed=True)
theta2 = ? # should be the posterior of theta1
outcome2 = pymc.Binomial('outcome2', n=n2, p=theta2, value=k2, observed=True)
How can I get the posterior of theta1
as the prior of theta2
?
Is this even possible, or did I just demonstrate ultimate ignorance about Bayesian statistics?
The only way sequential updating works sensibly is in two different models. Specifying them in the same model does not make any sense, since we have no posteriors until after MCMC has completed.
In principle, you would examine the distribution of theta1
and specify a prior that best resembles it. In this simple case it is easy -- it would be:
theta2 = pymc.Beta('theta2', alpha=10, beta=2)
since you don't need MCMC to determine what the posterior of theta is. More generally, you could fit a Beta distribution to the posterior, say using scipy.stats.beta.fit
.