I am new to the Bayesian world and PyMC3, and am struggling with a simple model setup. Specifically, how to deal with a setup where the 'observed' data are themselves modified by the random variables? As an example, lets' say I have a collection of 2d points [Xi, Yi] that form an arc about a circle whose central point [Xc,Yc], I don't know. However, I expect that the distances between the points and the circle center, Ri, should be normally distributed, about a known radius, R. I therefore initially thought I could assign Xc and Yc uniform priors (on some arbitrarily large range) and then re-calculate Ri within the model and assign Ri as the 'observed' data to get posterior estimates on Xc and Yc:
import pymc3 as pm
import numpy as np
points = np.array([[2.95, 4.98], [3.28, 4.88], [3.84, 4.59], [4.47, 4.09], [2.1,5.1], [5.4, 1.8]])
Xi = points[:,0]
Yi = points[:,1]
#known [Xc,Yc] = [2.1, 1.8]
R = 3.3
with pm.Model() as Cir_model:
Xc = pm.Uniform('Xc', lower=-20, upper=20)
Yc = pm.Uniform('Yc', lower=-20, upper=20)
Ri = pm.math.sqrt((Xi-Xc)**2 + (Yi-Yc)**2)
y = pm.Normal('y', mu=R, sd=1.0, observed=Ri)
samples = pm.fit(random_seed=2020).sample(1000)
pm.plot_posterior(samples, var_names=['Xc'])
pm.plot_posterior(samples, var_names=['Yc']);
While this code runs and gives me something, it clearly isn't working properly, which isn't surprising because it didn't seem right to be feeding a variable (Ri) in as 'observed' data. However, while I know there is something seriously wrong with my model setup (and my understanding more generally), I can't seem to recognize it. Any help greatly appreciated!
This model is actually doing fine, but there are a few things you might improve:
pm.fit(...)
, which uses variational inference, but MCMC is fine here, so replacing that whole line with samples = pm.sample()
works.points
you provide are almost exactly on a circle -- the empirical standard deviation is around 0.004, but standard deviation you supply in the liklihood is 1: around 250x the true value! Sampling from the model as-is allows for the center of the points to be in two different places:If you change the likelihood to y = pm.Normal('y', mu=R, sd=0.01, observed=Ri)
, you still get two possible centers, though there's a little more mass near the true center:
Finally, you could take an approach where you put a prior on the scale, and also learn that, which happily feels the most principled and gives results closest to the true ones. Here's the model:
with pm.Model():
Xc = pm.Uniform('Xc', lower=-20, upper=20)
Yc = pm.Uniform('Yc', lower=-20, upper=20)
Ri = pm.math.sqrt((Xi-Xc)**2 + (Yi-Yc)**2)
obs_sd = pm.HalfNormal('obs_sd', 1)
y = pm.Normal('y', mu=R, sd=obs_sd, observed=Ri)
samples = pm.sample()
and here's the output: