Tags: python, bayesian, pymc3, mcmc

Can a variable be used as 'observed' data in a PyMC3 model?


I am new to the Bayesian world and PyMC3, and am struggling with a simple model setup. Specifically, how do I deal with a setup where the 'observed' data are themselves modified by the random variables? As an example, let's say I have a collection of 2D points [Xi, Yi] that form an arc about a circle whose center [Xc, Yc] I don't know. However, I expect that the distances Ri between the points and the circle center should be normally distributed about a known radius R. I therefore initially thought I could assign Xc and Yc uniform priors (on some arbitrarily large range), re-calculate Ri within the model, and assign Ri as the 'observed' data to get posterior estimates on Xc and Yc:

import pymc3 as pm
import numpy as np

# 2D points lying (almost exactly) on an arc of a circle
points = np.array([[2.95, 4.98], [3.28, 4.88], [3.84, 4.59], [4.47, 4.09], [2.1, 5.1], [5.4, 1.8]])
Xi = points[:, 0]
Yi = points[:, 1]

# known center: [Xc, Yc] = [2.1, 1.8]
R = 3.3

with pm.Model() as Cir_model:

    # uniform priors on the unknown circle center
    Xc = pm.Uniform('Xc', lower=-20, upper=20)
    Yc = pm.Uniform('Yc', lower=-20, upper=20)

    # distance of each point from the candidate center
    Ri = pm.math.sqrt((Xi - Xc)**2 + (Yi - Yc)**2)

    # likelihood: the derived distances Ri are passed in as 'observed'
    y = pm.Normal('y', mu=R, sd=1.0, observed=Ri)

    # variational inference, then draw posterior samples
    samples = pm.fit(random_seed=2020).sample(1000)

    pm.plot_posterior(samples, var_names=['Xc'])
    pm.plot_posterior(samples, var_names=['Yc']);

While this code runs and gives me something, it clearly isn't working properly, which isn't surprising: it didn't seem right to feed a variable (Ri) in as 'observed' data. However, while I know there is something seriously wrong with my model setup (and my understanding more generally), I can't seem to pin it down. Any help greatly appreciated!


Solution

  • This model is actually doing fine, but there are a few things you might improve:

    1. Using a variable as an observation is not great: think about what it does to the distribution you are fitting. The model will still fit something, but you should consider whether you are double-counting variables in the prior and the likelihood. That doesn't matter much for this toy model, though!
    2. You are using pm.fit(...), which uses variational inference, but MCMC is fine here, so replacing that whole line with samples = pm.sample() works.
    3. The points you provide are almost exactly on a circle: the empirical standard deviation of the distances is around 0.004 (see the quick check after the figure below), but the standard deviation you supply in the likelihood is 1, around 250x the true value! Sampling from the model as-is allows the center of the points to be in two different places:

    [figure: likelihood with sd=1]
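    As a quick sanity check of the 0.004 figure quoted above, here is a minimal sketch that recomputes the point-to-center distances using the known center [2.1, 1.8] from the question:

    import numpy as np

    points = np.array([[2.95, 4.98], [3.28, 4.88], [3.84, 4.59],
                       [4.47, 4.09], [2.1, 5.1], [5.4, 1.8]])

    # distances from the known center [2.1, 1.8]
    dists = np.hypot(points[:, 0] - 2.1, points[:, 1] - 1.8)
    print(dists.mean(), dists.std())  # ~3.296, ~0.004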

    If you change the likelihood to y = pm.Normal('y', mu=R, sd=0.01, observed=Ri), you still get two possible centers, though there is a little more mass near the true center (a full sketch of this variant follows the figure below):

    [figure: likelihood with sd=0.01]
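    Here is that variant in full, as a minimal sketch that tightens the likelihood scale and swaps in MCMC via pm.sample() as suggested in point 2 (it reuses Xi, Yi, and R from the question's code):

    with pm.Model():
        Xc = pm.Uniform('Xc', lower=-20, upper=20)
        Yc = pm.Uniform('Yc', lower=-20, upper=20)
        Ri = pm.math.sqrt((Xi - Xc)**2 + (Yi - Yc)**2)

        # likelihood scale closer to the empirical spread of the distances
        y = pm.Normal('y', mu=R, sd=0.01, observed=Ri)

        samples = pm.sample()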

    Finally, you could put a prior on the scale and learn it as well, which happily feels like the most principled approach and gives results closest to the truth. Here's the model:

    with pm.Model():
        Xc = pm.Uniform('Xc', lower=-20, upper=20)
        Yc = pm.Uniform('Yc', lower=-20, upper=20)
        Ri = pm.math.sqrt((Xi - Xc)**2 + (Yi - Yc)**2)

        # put a prior on the likelihood scale and learn it along with the center
        obs_sd = pm.HalfNormal('obs_sd', 1)
        y = pm.Normal('y', mu=R, sd=obs_sd, observed=Ri)

        samples = pm.sample()
    

    and here's the output:

    [figure: likelihood with a prior over sd]
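    As a follow-up, you can inspect the learned scale alongside the center estimates with the same plot_posterior call the question already uses:

    pm.plot_posterior(samples, var_names=['Xc', 'Yc', 'obs_sd'])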