I'm trying to compare two models (example from Jake Vanderplas' blog) using PyMC3, but I can't get my modified code to work (the functions best_theta() and logL() are explained in Jake's blog post, which is available in IPython Notebook form):
import numpy as np

degrees = [1, 2, 3]
# best_theta() finds the best set of parameters for a given model
thetas = [best_theta(d) for d in degrees]
n = len(degrees)
prob = np.array([1 for _ in degrees])

# model specs
from pymc3 import Model, Dirichlet, Categorical, DensityDist
with Model() as bfactor:
    choices = Dirichlet('choices', prob, shape=prob.shape[0])
    choice = Categorical('choice', choices)
    indmodel = [0] * len(degrees)
    for i, d in enumerate(degrees):
        # logL() calculates the log-likelihood for a given model;
        # i=i binds the loop index at definition time, and each
        # DensityDist needs its own unique name
        indmodel[i] = DensityDist('indmodel_%d' % d, lambda value, i=i: logL(thetas[i]))
    fullmodel = DensityDist('fullmodel', lambda value: indmodel[choice].logp(value))
That raises an exception, because the variable choice is an RV object, not an integer (unlike in PyMC2), as discussed in this question. In my code, however, the value of choice is what makes the model work.

My question is: is there a way to access the value of the RV choice, or, more generally, to set up a hierarchical model using Categorical random variables (i.e. to use the value of a Categorical RV to calculate the log-likelihood of another RV)?
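For concreteness, this is roughly the kind of construct I am after. The sketch below reuses degrees, prob, thetas and logL() from above; the names bfactor2 and indmodel_logps are made up, and I don't know whether PyMC3/Theano actually accepts indexing by a Categorical RV like this:

import numpy as np
import theano.tensor as tt
from pymc3 import Model, Dirichlet, Categorical, DensityDist

# degrees, prob, thetas and logL() as defined above
with Model() as bfactor2:
    choices = Dirichlet('choices', prob, shape=prob.shape[0])
    choice = Categorical('choice', choices)

    # the per-model log-likelihoods are plain numbers here (logL() and thetas
    # don't depend on any PyMC3 variables), so stack them into a Theano vector...
    indmodel_logps = tt.stack([logL(theta) for theta in thetas])

    # ...which can then be indexed with the *symbolic* choice variable,
    # instead of using choice as a Python integer
    fullmodel = DensityDist('fullmodel', lambda value: indmodel_logps[choice])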
I took a quick stab at this. However, the approach needed to be changed quite a bit, as it's often more convenient to vectorize the model. This also revealed a bug, which I fixed (https://github.com/pymc-devs/pymc3/commit/c784c478aa035b5113e175145df8751b0dea0df3), so you will need to update from the current master for this to work.
Here is the full notebook: https://gist.github.com/anonymous/c1ada3388a40ae767a8d
It doesn't quite seem to work yet, as the results are not identical, but it's a step in the right direction.
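One way to read "vectorize the model" is to marginalize the discrete choice out analytically, so that no Categorical variable has to be sampled at all. The sketch below is not the notebook's actual code; it assumes the per-model log-likelihoods from logL() are fixed numbers, and the names w, marginal and marginal_logp are made up:

import numpy as np
import theano.tensor as tt
from pymc3 import Model, Dirichlet, DensityDist

# per-model log-likelihoods, taken as fixed numbers (thetas and logL() as above)
logls = np.array([logL(theta) for theta in thetas])

with Model() as marginal:
    # mixture weights over the candidate models
    w = Dirichlet('w', np.ones(len(degrees)), shape=len(degrees))

    def marginal_logp(value):
        # log( sum_k w_k * exp(logL_k) ) via a numerically stable log-sum-exp,
        # so the discrete 'choice' never appears in the model
        x = tt.log(w) + logls
        m = tt.max(x)
        return m + tt.log(tt.sum(tt.exp(x - m)))

    DensityDist('fullmodel', marginal_logp)

With the discrete variable marginalized out, the posterior over w carries the model-comparison information that choice was meant to provide.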