Can a variable be used as 'observed' data in a PyMC3 model?

I am new to the Bayesian world and PyMC3, and am struggling with a simple model setup. Specifically, how to deal with a setup where the 'observed' data are themselves modified by the random variables? As an example, lets' say I have a collection of 2d points [Xi, Yi] that form an arc about a circle whose central point [Xc,Yc], I don't know. However, I expect that the distances between the points and the circle center, Ri, should be normally distributed, about a known radius, R. I therefore initially thought I could assign Xc and Yc uniform priors (on some arbitrarily large range) and then re-calculate Ri within the model and assign Ri as the 'observed' data to get posterior estimates on Xc and Yc:

import pymc3 as pm
import numpy as np

points = np.array([[2.95, 4.98], [3.28, 4.88], [3.84, 4.59], [4.47, 4.09], [2.1,5.1], [5.4, 1.8]])
Xi = points[:,0]
Yi = points[:,1]

#known [Xc,Yc] = [2.1, 1.8]
R = 3.3

with pm.Model() as Cir_model:
    
    Xc = pm.Uniform('Xc', lower=-20, upper=20)
    Yc = pm.Uniform('Yc', lower=-20, upper=20)
    
    Ri = pm.math.sqrt((Xi-Xc)**2 + (Yi-Yc)**2)
    
    y = pm.Normal('y', mu=R, sd=1.0, observed=Ri)
    
    samples = pm.fit(random_seed=2020).sample(1000)
    
    pm.plot_posterior(samples, var_names=['Xc'])
    pm.plot_posterior(samples, var_names=['Yc']);

While this code runs and gives me something, it clearly isn't working properly, which isn't surprising because it didn't seem right to be feeding a variable (Ri) in as 'observed' data. However, while I know there is something seriously wrong with my model setup (and my understanding more generally), I can't seem to recognize it. Any help greatly appreciated!

Solution

This model is actually doing fine, but there are a few things you might improve:

Using a variable as an observation is not great, in that you should think about what it is doing to the distribution you are fitting. It will fit a distribution, but you should think about whether you are double-counting variables in a prior and a likelihood. That doesn't matter so much for this toy model though!
You are using pm.fit(...), which uses variational inference, but MCMC is fine here, so replacing that whole line with samples = pm.sample() works.
The points you provide are almost exactly on a circle -- the empirical standard deviation is around 0.004, but standard deviation you supply in the liklihood is 1: around 250x the true value! Sampling from the model as-is allows for the center of the points to be in two different places:

If you change the likelihood to y = pm.Normal('y', mu=R, sd=0.01, observed=Ri), you still get two possible centers, though there's a little more mass near the true center:

Finally, you could take an approach where you put a prior on the scale, and also learn that, which happily feels the most principled and gives results closest to the true ones. Here's the model:

with pm.Model():
    Xc = pm.Uniform('Xc', lower=-20, upper=20)
    Yc = pm.Uniform('Yc', lower=-20, upper=20)
    Ri = pm.math.sqrt((Xi-Xc)**2 + (Yi-Yc)**2)
    obs_sd = pm.HalfNormal('obs_sd', 1)
    y = pm.Normal('y', mu=R, sd=obs_sd, observed=Ri)
    
    samples = pm.sample()

and here's the output: