tensorflow, stan, tensorflow-probability, probability-distribution, hierarchical-bayesian

Use two sets of data for the likelihood in log_prob in TensorFlow Probability


I am new to TensorFlow and am trying to translate a Stan model into TensorFlow Probability (TFP). Here is my TFP model, built with JointDistributionCoroutineAutoBatched.

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Note: GammaModeSD and BetaModeConc are not defined in this snippet.
def make_joint_distribution_coroutine(Depth, N_RNA):
    def model():
        # c1 prior
        c1 = yield tfd.Gamma(concentration=1.1, rate=0.005)
        # c2 prior
        c2 = yield tfd.Gamma(concentration=1.1, rate=0.005)
        # s prior
        s = yield GammaModeSD(1, 1)
        # theta prior
        theta = yield tfd.LogNormal(0, s)
        # p prior
        p = yield BetaModeConc(0.1, c1)
        # tfp bug, need to cast tensor to float32
        #theta = tf.cast(theta, tf.float32)
        #p = tf.cast(p, tf.float32)
        # q formula
        q = (theta * p) / (1 - p + theta * p)
        # qi prior
        qi = yield BetaModeConc(tf.repeat(q, N_RNA), c2)
        # qi likelihood (observed data k)
        k = yield tfd.Binomial(tf.cast(Depth, tf.float32), qi)
        # p likelihood (observed data a)
        a = yield tfd.Binomial(tf.cast(Depth, tf.float32), p)
    return tfd.JointDistributionCoroutineAutoBatched(model)

My model has two observed datasets, a and k. If it had only one of them, I could define my log_prob function as

def joint_log_prob(*args):
    return joint.log_prob(*args, likelihood=data)

or

joint_log_prob = lambda *x: model.log_prob(x + (data,))

My question is: how do I incorporate two different datasets into a single log_prob function? Thank you!


Solution

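  • The simplest solution is to specify both. Keyword arguments are matched against the name given to each distribution, not the Python variable on the left of the yield, so this form assumes the two Binomial distributions were created with name='a' and name='k'. Assuming data is a tuple ordered (a, k):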

    def joint_log_prob(*args):
      return joint.log_prob(*args, a=data[0], k=data[1])
    

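    or, passing everything positionally. Values must follow the order of the yield statements, and the model yields k before a, so here data must be a tuple ordered (k, a):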

    joint_log_prob = lambda *x: model.log_prob(x + data)
    

    You might also like to write:

    joint_log_prob = joint.experimental_pin(a=.., k=..).unnormalized_log_prob
    

    (See JointDistributionPinned)
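
To see what any of these variants ends up computing, it may help to write out a cut-down version of the joint log-density by hand. The sketch below uses only the Python standard library, not TFP. It keeps just p, a single qi, and the two Binomial likelihoods, and swaps the hierarchical priors for fixed Beta(2, 2) priors. The helper names (binomial_log_prob, beta_log_prob, make_target_log_prob) and all numeric values are made up for illustration. The point is only that the two datasets enter as two separate additive log-likelihood terms, and that fixing them in a closure leaves a function of the latent variables alone, which is roughly the role pinning plays.

```python
import math


def binomial_log_prob(count, total, prob):
    """Log-pmf of Binomial(total, prob) evaluated at count."""
    log_choose = (math.lgamma(total + 1) - math.lgamma(count + 1)
                  - math.lgamma(total - count + 1))
    return (log_choose + count * math.log(prob)
            + (total - count) * math.log1p(-prob))


def beta_log_prob(x, alpha, beta):
    """Log-density of Beta(alpha, beta) evaluated at x in (0, 1)."""
    log_norm = (math.lgamma(alpha + beta)
                - math.lgamma(alpha) - math.lgamma(beta))
    return (log_norm + (alpha - 1) * math.log(x)
            + (beta - 1) * math.log1p(-x))


def make_target_log_prob(depth, a_obs, k_obs):
    """Fix both observed datasets; return a function of the latents only."""
    def target_log_prob(p, qi):
        # Illustrative fixed priors (assumption: stand-ins for the
        # hierarchical priors in the real model).
        prior = beta_log_prob(p, 2.0, 2.0) + beta_log_prob(qi, 2.0, 2.0)
        lik_a = binomial_log_prob(a_obs, depth, p)    # likelihood of a given p
        lik_k = binomial_log_prob(k_obs, depth, qi)   # likelihood of k given qi
        return prior + lik_a + lik_k
    return target_log_prob


# Invented data: depth 100, a = 12 successes, k = 30 successes.
target = make_target_log_prob(depth=100, a_obs=12, k_obs=30)
print(target(0.1, 0.3))
```

A sampler would then be handed target, which takes only the latent values, in the same way that joint_log_prob above takes only the unobserved variables.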