I'm trying to convert this example of Bayesian correlation for PyMC2 to PyMC3, but get completely different results. Most importantly, the mean of the multivariate Normal distribution quickly goes to zero, whereas it should be around 400 (as it is for PyMC2). Consequently, the estimated correlation quickly goes towards 1, which is wrong as well.
The full code is available in this notebook for PyMC2 and in this notebook for PyMC3.
The relevant code for PyMC2 is
def analyze(data):
# priors might be adapted here to be less flat
mu = pymc.Normal('mu', 0, 0.000001, size=2)
sigma = pymc.Uniform('sigma', 0, 1000, size=2)
rho = pymc.Uniform('r', -1, 1)
@pymc.deterministic
def precision(sigma=sigma,rho=rho):
ss1 = float(sigma[0] * sigma[0])
ss2 = float(sigma[1] * sigma[1])
rss = float(rho * sigma[0] * sigma[1])
return np.linalg.inv(np.mat([[ss1, rss], [rss, ss2]]))
mult_n = pymc.MvNormal('mult_n', mu=mu, tau=precision, value=data.T, observed=True)
model = pymc.MCMC(locals())
model.sample(50000,25000)
My port of the above code to PyMC3 is as follows:
def precision(sigma, rho):
C = T.alloc(rho, 2, 2)
C = T.fill_diagonal(C, 1.)
S = T.diag(sigma)
return T.nlinalg.matrix_inverse(T.nlinalg.matrix_dot(S, C, S))
def analyze(data):
with pm.Model() as model:
# priors might be adapted here to be less flat
mu = pm.Normal('mu', mu=0., sd=0.000001, shape=2, testval=np.mean(data, axis=1))
sigma = pm.Uniform('sigma', lower=1e-6, upper=1000., shape=2, testval=np.std(data, axis=1))
rho = pm.Uniform('r', lower=-1., upper=1., testval=0)
prec = pm.Deterministic('prec', precision(sigma, rho))
mult_n = pm.MvNormal('mult_n', mu=mu, tau=prec, observed=data.T)
return model
model = analyze(data)
with model:
trace = pm.sample(50000, tune=25000, step=pm.Metropolis())
The PyMC3 version runs, but clearly does not return the expected result. Any help would be highly appreciated.
The call signature of pymc.Normal is
In [125]: pymc.Normal?
Init signature: pymc.Normal(self, *args, **kwds)
Docstring:
N = Normal(name, mu, tau, value=None, observed=False, size=1, trace=True, rseed=True, doc=None, verbose=-1, debug=False)
Notice that the third positional argument of pymc.Normal
is tau
, not the standard deviation, sd
.
Therefore, since the pymc
code uses
mu = Normal('mu', 0, 0.000001, size=2)
The corresponding pymc3
code should use
mu = pm.Normal('mu', mu=0., tau=0.000001, shape=2, ...)
or
mu = pm.Normal('mu', mu=0., sd=math.sqrt(1/0.000001), shape=2, ...)
since tau = 1/sigma**2
.
With this one change, your pymc3 code produces (something like)