python tensorflow tensorflow-probability

MultivariateNormal distribution with more than one batch

I am trying to generalise the example given in "How to use a MultiVariateNormal distribution in the latest version of Tensorflow" to a 2-D normal distribution with more than one batch. When I run the following:

from tensorflow_probability import distributions as tfd
import tensorflow as tf

tf.compat.v1.enable_eager_execution()

mu = [[1, 2],
        [-1,-2]]

cov = [[1, 3./5],
        [3./5, 2]]

cov = [cov, cov] # for demonstration purposes, use the same cov for both batches

mvn = tfd.MultivariateNormalFullCovariance(
        loc=mu,
        covariance_matrix=cov)

# generate the pdf
X, Y = tf.meshgrid(tf.range(-3, 3, 0.1), tf.range(-3, 3, 0.1))
idx = tf.concat([tf.reshape(X, [-1, 1]), tf.reshape(Y,[-1,1])], axis =1)
prob = tf.reshape(mvn.prob(idx), tf.shape(X))

I get an Incompatible shapes error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3600,2] vs. [2,2] [Op:Sub] name: MultivariateNormalFullCovariance/log_prob/affine_linear_operator/inverse/sub/

My understanding of the documentation (https://www.tensorflow.org/api_docs/python/tf/contrib/distributions/MultivariateNormalFullCovariance) is that to compute the pdf, one needs a [n_observation, n_dimensions] tensor (which is the case in this example: idx.shape = TensorShape([Dimension(3600), Dimension(2)])). Did I get my maths wrong?

Solution

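You need to add a batch axis to the idx tensor in the second-to-last position. mvn.prob treats the last axis of its input as the event and broadcasts the remaining leading axes against mvn.batch_shape, which is (2,) here. Neither the 3600 of the flattened grid nor the trailing 60 of the 60x60 grid can broadcast against 2, so a size-1 axis has to sit in that position.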
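To make the shape rule concrete, here is a minimal pure-Python sketch of it. The helpers broadcast_shapes and prob_shape are hypothetical names written for illustration, not part of TensorFlow Probability; they assume the input is simply broadcast against batch_shape + event_shape (as the failing Sub op in the error suggests) and that the event axes are then dropped.

```python
from itertools import zip_longest


def broadcast_shapes(a, b):
    """Broadcast two shapes with the usual right-aligned rule, or raise ValueError."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"Incompatible shapes: {a} vs. {b}")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def prob_shape(value_shape, batch_shape, event_shape):
    """Illustrative model of the shape returned by dist.prob(value)."""
    full = broadcast_shapes(tuple(value_shape), tuple(batch_shape) + tuple(event_shape))
    # prob() reduces over the event axes, so they are dropped from the result.
    return full[:len(full) - len(event_shape)]


# The grid with an extra size-1 axis in front of the event axis works.
print(prob_shape((60, 60, 1, 2), batch_shape=(2,), event_shape=(2,)))

# The flattened grid from the question does not: 3600 cannot broadcast against 2.
try:
    prob_shape((3600, 2), batch_shape=(2,), event_shape=(2,))
except ValueError as err:
    print(err)
```

Under the same assumptions, a flattened grid of shape (3600, 1, 2) would also be accepted and would give a result of shape (3600, 2).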

# TF/TFP imports (the "!pip" line is notebook syntax; drop the "!" to run it in a shell)
!pip install --quiet tfp-nightly tf-nightly
import tensorflow.compat.v2 as tf
tf.enable_v2_behavior()
import tensorflow_probability as tfp
tfd = tfp.distributions

mu = [[1, 2],
      [-1, -2]]

cov = [[1, 3./5],
       [3./5, 2]]

cov = [cov, cov] # for demonstration purposes, use the same cov for both batches

mvn = tfd.MultivariateNormalFullCovariance(
    loc=mu, covariance_matrix=cov)
print(mvn.batch_shape, mvn.event_shape)

# generate the pdf
X, Y = tf.meshgrid(tf.range(-3, 3, 0.1), tf.range(-3, 3, 0.1))
print(X.shape)
# stack the grid to (60, 60, 2), then insert a size-1 axis before the event axis
idx = tf.stack([X, Y], axis=-1)[..., tf.newaxis, :]
print(idx.shape)

probs = mvn.prob(idx)
print(probs.shape)

Output:

(2,) (2,)   # mvn.batch_shape, mvn.event_shape
(60, 60)    # X.shape
(60, 60, 1, 2)   # idx.shape == X.shape + (1 "broadcast against batch", 2 "event")
(60, 60, 2)  # probs.shape == X.shape + (2 "mvn batch shape")
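
As a sanity check on these shapes, the same batched density can be written out by hand in NumPy. This is only a sketch, not how TFP computes it: it assumes NumPy is installed, builds the grid as -3 + 0.1*k instead of calling tf.range, and the names diff, prec, maha and norm are illustrative.

```python
import numpy as np

mu = np.array([[1., 2.],
               [-1., -2.]])                  # batch_shape (2,), event_shape (2,)
cov = np.array([[1., 3. / 5],
                [3. / 5, 2.]])
cov = np.stack([cov, cov])                   # (2, 2, 2): one covariance per batch member

x = -3 + 0.1 * np.arange(60)                 # 60 grid points from -3 to 2.9
X, Y = np.meshgrid(x, x)                     # each (60, 60)
idx = np.stack([X, Y], axis=-1)[..., np.newaxis, :]   # (60, 60, 1, 2)

diff = idx - mu                              # broadcasts to (60, 60, 2, 2)
prec = np.linalg.inv(cov)                    # per-batch precision matrices, (2, 2, 2)
# Squared Mahalanobis distance for every grid point and batch member.
maha = np.einsum('...bi,bij,...bj->...b', diff, prec, diff)   # (60, 60, 2)
norm = 2 * np.pi * np.sqrt(np.linalg.det(cov))                # (2,)
probs = np.exp(-0.5 * maha) / norm

print(probs.shape)  # (60, 60, 2)
```

The last axis of probs indexes the two batch members, so probs[..., 0] is the density surface centred on (1, 2) and probs[..., 1] the one centred on (-1, -2).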