python, tensorflow, tensorflow-probability

MultivariateNormal distribution with batch size > 1


I am trying to generalise the example given in "How to use a MultiVariateNormal distribution in the latest version of Tensorflow" to a 2-D normal distribution with more than one batch member. When I run the following:

from tensorflow_probability import distributions as tfd
import tensorflow as tf

tf.compat.v1.enable_eager_execution()

mu = [[1, 2],
      [-1, -2]]

cov = [[1, 3./5],
       [3./5, 2]]

cov = [cov, cov] # for demonstration purpose, use same cov for both batches

mvn = tfd.MultivariateNormalFullCovariance(
        loc=mu,
        covariance_matrix=cov)

# generate the pdf
X, Y = tf.meshgrid(tf.range(-3, 3, 0.1), tf.range(-3, 3, 0.1))
idx = tf.concat([tf.reshape(X, [-1, 1]), tf.reshape(Y, [-1, 1])], axis=1)
prob = tf.reshape(mvn.prob(idx), tf.shape(X))

I get an Incompatible shapes error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3600,2] vs. [2,2] [Op:Sub] name: MultivariateNormalFullCovariance/log_prob/affine_linear_operator/inverse/sub/

My understanding of the documentation (https://www.tensorflow.org/api_docs/python/tf/contrib/distributions/MultivariateNormalFullCovariance) is that to compute the pdf, one needs a [n_observation, n_dimensions] tensor (which is the case in this example: idx.shape = TensorShape([Dimension(3600), Dimension(2)])). Did I get my maths wrong?


Solution

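  • You need to add a size-1 batch axis to the idx tensor in the second-to-last position (just before the event axis). Without it, the leading dimensions of idx (3600 in the question, 60x60 below) are treated as if they had to line up with the batch, and they can't broadcast against the mvn.batch_shape of (2,).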

    # TF/TFP imports (the "!pip" line is notebook syntax; run it in Colab/Jupyter only)
    !pip install --quiet tfp-nightly tf-nightly
    import tensorflow.compat.v2 as tf
    tf.enable_v2_behavior()
    import tensorflow_probability as tfp
    tfd = tfp.distributions
    
    mu = [[1, 2],
          [-1, -2]]
    
    cov = [[1, 3./5],
           [3./5, 2]]
    
    cov = [cov, cov] # for demonstration purpose, use same cov for both batches
    
    mvn = tfd.MultivariateNormalFullCovariance(
        loc=mu, covariance_matrix=cov)
    print(mvn.batch_shape, mvn.event_shape)
    
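    # generate the pdf on a 60x60 grid
    X, Y = tf.meshgrid(tf.range(-3, 3, 0.1), tf.range(-3, 3, 0.1))
    print(X.shape)
    # stack to shape (60, 60, 2), then insert a size-1 axis before the event axis
    # so the grid dimensions broadcast against mvn.batch_shape == (2,)
    idx = tf.stack([X, Y], axis=-1)[..., tf.newaxis, :]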
    print(idx.shape)
    
    probs = mvn.prob(idx)
    print(probs.shape)
    

    output:

    (2,) (2,)   # mvn.batch_shape, mvn.event_shape
    (60, 60)    # X.shape
    (60, 60, 1, 2)   # idx.shape == X.shape + (1 "broadcast against batch", 2 "event")
    (60, 60, 2)  # probs.shape == X.shape + (2 "mvn batch shape")
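
The same broadcasting rule can be reproduced in plain NumPy, which may make it easier to see why the extra axis matters. The sketch below is only an illustration, not how TFP computes the density: mvn_pdf is a hypothetical helper written for this example, it uses an explicit matrix inverse for brevity, and the grid is built with np.linspace (an assumption chosen to get exactly 60 points per side).

```python
import numpy as np

def mvn_pdf(x, loc, cov):
    """Batched multivariate normal density (hypothetical helper, NumPy only).

    x:   (..., d) points; its leading dims must broadcast against the batch dim
    loc: (B, d) means
    cov: (B, d, d) covariance matrices
    """
    d = loc.shape[-1]
    diff = x - loc                      # this subtraction is where shapes must broadcast
    prec = np.linalg.inv(cov)           # (B, d, d) precision matrices
    maha = np.einsum('...i,...ij,...j->...', diff, prec, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

loc = np.array([[1.0, 2.0], [-1.0, -2.0]])      # batch of 2 means
single = np.array([[1.0, 3 / 5], [3 / 5, 2.0]])
cov = np.stack([single, single])                # same covariance for both batch members

xs = np.linspace(-3.0, 3.0, 60, endpoint=False)
X, Y = np.meshgrid(xs, xs)
grid = np.stack([X, Y], axis=-1)[..., np.newaxis, :]
print(grid.shape)    # (60, 60, 1, 2)

probs = mvn_pdf(grid, loc, cov)
print(probs.shape)   # (60, 60, 2)

# The flattened layout from the question fails for the same reason as in TF:
flat = grid.reshape(-1, 2)                      # (3600, 2)
try:
    mvn_pdf(flat, loc, cov)
    flat_fails = False
except ValueError:                              # 3600 cannot broadcast against 2
    flat_fails = True

# Adding the size-1 axis fixes the flattened layout as well.
flat_probs = mvn_pdf(flat[:, np.newaxis, :], loc, cov)
print(flat_probs.shape)  # (3600, 2)
```

The last two lines suggest that the [3600, 2] tensor from the question could also be kept, as long as it is expanded to [3600, 1, 2] before evaluating the density; the result then has one column per batch member.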