Search code examples
pythonbayesianpymcmcmcpymc3

How to define General deterministic function in PyMC


In my model, I need to obtain the value of my deterministic variable from a set of parent variables using a complicated python function.

Is it possible to do that?

Following is a pyMC3 code which shows what I am trying to do in a simplified case.

import numpy as np
import pymc as pm


#Predefine values on two parameter Grid (x,w) for a set of i values (1,2,3)
idata = np.array([1,2,3])
size= 20
gridlength = size*size
Grid = np.empty((gridlength,2+len(idata)))
for x in range(size):
    for w in range(size):
        # A silly version of my real model evaluated on grid.
        Grid[x*size+w,:]= np.array([x,w]+[(x**i + w**i) for i in idata])

# A function to find the nearest value in Grid and return its product with third variable z
def FindFromGrid(x,w,z):
    return Grid[int(x)*size+int(w),2:] * z

#Generate fake Y data with error
yerror = np.random.normal(loc=0.0, scale=9.0, size=len(idata))
ydata = Grid[16*size+12,2:]*3.6 + yerror  # ie. True x= 16, w= 12 and z= 3.6


with pm.Model() as model:

    #Priors
    x = pm.Uniform('x',lower=0,upper= size)
    w = pm.Uniform('w',lower=0,upper =size)
    z = pm.Uniform('z',lower=-5,upper =10)

    #Expected value
    y_hat = pm.Deterministic('y_hat',FindFromGrid(x,w,z))

    #Data likelihood
    ysigmas = np.ones(len(idata))*9.0 
    y_like = pm.Normal('y_like',mu= y_hat, sd=ysigmas, observed=ydata)

    # Inference...
    start = pm.find_MAP() # Find starting value by optimization
    step = pm.NUTS(state=start) # Instantiate MCMC sampling algorithm
    trace = pm.sample(1000, step, start=start, progressbar=False) # draw 1000 posterior samples using NUTS sampling


print('The trace plot')
fig = pm.traceplot(trace, lines={'x': 16, 'w': 12, 'z':3.6})
fig.show()

When I run this code, I get error at the y_hat stage, because the int() function inside the FindFromGrid(x,w,z) function needs integer not FreeRV.

Finding y_hat from a pre calculated grid is important because my real model for y_hat does not have an analytical form to express.

I have earlier tried to use OpenBUGS, but I found out here it is not possible to do this in OpenBUGS. Is it possible in PyMC ?

Update

Based on an example in pyMC github page, I found I need to add the following decorator to my FindFromGrid(x,w,z) function.

@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar],otypes=[t.dvector])

This seems to solve the above mentioned issue. But I cannot use NUTS sampler anymore since it needs gradient.

Metropolis seems to be not converging.

Which step method should I use in a scenario like this?


Solution

  • You found the correct solution with as_op.

    Regarding the convergence: Are you using pm.Metropolis() instead of pm.NUTS() by any chance? One reason this could not converge is that Metropolis() by default samples in the joint space while often Gibbs within Metropolis is more effective (and this was the default in pymc2). Having said that, I just merged this: https://github.com/pymc-devs/pymc/pull/587 which changes the default behavior of the Metropolis and Slice sampler to be non-blocked by default (so within Gibbs). Other samplers like NUTS that are primarily designed to sample the joint space still default to blocked. You can always explicitly set this with the kwarg blocked=True.

    Anyway, update pymc with the most recent master and see if convergence improves. If not, try the Slice sampler.