I am replicating some of the examples presented in "Think Bayes" by Allen Downey in pymc3.
His great book gives introductory examples of Bayesian methods, implemented with Allen's own library.
One of them is the "Train Problem", where you need to estimate the number of trains a company has based on the number you see painted on a train (the trains are numbered from 1 to N).
The likelihood for this problem is basically:
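    def likelihood(data, hypo):
        # a train numbered `data` cannot belong to a fleet of only `hypo` trains
        if data > hypo:
            return 0
        return 1/hypo

    # update every hypothesis with every observation
    for data in stream:
        for hypo in hypothesis:
            posterior[hypo] *= likelihood(data, hypo)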
where data is the number you've seen on a train.
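To make the update above concrete, here is a minimal plain-Python sketch of it. It assumes a uniform prior over fleet sizes 1 to 100 (the N = 100 mentioned in this question) and a single hypothetical sighting of train number 60; the `update` helper and the dictionary-based `posterior` are illustrative names, not part of Allen's library.

```python
def likelihood(data, hypo):
    """P(seeing train number `data` | the company owns `hypo` trains)."""
    if data > hypo:
        return 0.0
    return 1.0 / hypo


def update(posterior, data):
    """Multiply each hypothesis by its likelihood, then renormalize."""
    for hypo in posterior:
        posterior[hypo] *= likelihood(data, hypo)
    total = sum(posterior.values())
    for hypo in posterior:
        posterior[hypo] /= total
    return posterior


# Assumed setup: uniform prior over 1..100, one hypothetical sighting (train 60).
hypotheses = range(1, 101)
posterior = {hypo: 1.0 / len(hypotheses) for hypo in hypotheses}

for data in [60]:
    update(posterior, data)

# Posterior mean of the fleet size under these assumptions.
posterior_mean = sum(hypo * p for hypo, p in posterior.items())
print(posterior_mean)
```

Every fleet size below 60 ends up with zero posterior probability, and the remaining mass is proportional to 1/hypo.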
How can I define that custom likelihood in pymc3? I'm using DensityDist to create my own likelihood function, but the one I'm replicating depends on the hypothesis, which ranges from 1 to N (let's say N = 100), and in pymc3 I couldn't find a way to get the values out of the tensors.
This problem is also known as the German Tank problem, since during WWII the Allies tried to estimate the number of German tanks from the serial numbers of captured tanks.
I think the problem can be solved with the following model:
    import pymc3 as pm

    # y: NumPy array of observed serial numbers
    with pm.Model() as model:
        N = pm.DiscreteUniform('N', lower=y.max(), upper=y.max()*10)
        # trains are numbered from 1 to N
        y_obs = pm.DiscreteUniform('y', lower=1, upper=N, observed=y)
        trace = pm.sample(10000)
Depending on your actual problem, you may relax the discrete assumption (which is quite reasonable) and use a continuous distribution such as the Uniform:
    with pm.Model() as model:
        N = pm.Uniform('N', lower=y.max(), upper=y.max()*10)
        y_obs = pm.Uniform('y', lower=0, upper=N, observed=y)
        trace = pm.sample(1000)
One advantage of relaxing the discrete assumption is that you can now use NUTS. In the previous model, by contrast, you are restricted to Metropolis because the variables are discrete.
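If you want to sanity-check the sampler output for the continuous model, its posterior has a closed form: with a Uniform(max(y), 10*max(y)) prior on N and n observations drawn from Uniform(0, N), the posterior density is proportional to N**(-n) on the prior's support. The sketch below computes the exact posterior mean from that fact. The function names and the example observations are hypothetical, chosen only for illustration.

```python
import math


def power_integral(k, a, b):
    """Integral of N**(-k) over [a, b], assuming 0 < a < b."""
    if k == 1:
        return math.log(b / a)
    return (a ** (1 - k) - b ** (1 - k)) / (k - 1)


def continuous_posterior_mean(y, factor=10):
    """Exact posterior mean of N for the continuous model.

    Prior: N ~ Uniform(max(y), factor * max(y)).
    Likelihood: each y_i ~ Uniform(0, N), so the density is N**(-n).
    """
    n = len(y)
    a = max(y)
    b = factor * a
    # E[N | y] = integral of N * N**(-n) divided by integral of N**(-n)
    return power_integral(n - 1, a, b) / power_integral(n, a, b)


# Hypothetical observations, used only for illustration.
print(continuous_posterior_mean([60]))
print(continuous_posterior_mean([30, 60, 90]))
```

The mean of the sampled N values in the trace should land close to this number; a large gap would suggest the chain has not converged or the model differs from the one assumed here.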