The official documentation says "Compute Y = g(X; scale) = scale @ X", so I understand that scale is left-multiplied with X. However, ScaleMatvecDiag appears to calculate X @ scale instead.
The following code produces the output shown below it:
import numpy as np
import tensorflow_probability as tfp
tfb = tfp.bijectors
x = [[1., 2.], [3., 4.]]
b = tfb.ScaleMatvecDiag(scale_diag=[-1., 2.])
b.forward(x)
[[-1., 4.],
[-3., 8.]]
I was expecting:
np.diag([-1., 2.]) @ x
[[-1., -2.],
[ 6., 8.]]
From the following outputs, I see that ScaleMatvecDiag calculates X @ scale.
y = [[1., 2, 3], [4, 5, 6]]
z = [[1., 2], [3, 4], [5, 6]]
b.forward(y) --> ValueError: Dimensions 2 and 3 are not compatible
b.forward(z) --> (3, 2)
I would appreciate it if anyone could clarify my misunderstanding.
I think there's a documentation bug.
In short, matvec != matmul (and note that @ is matmul, not matvec).
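To make the difference concrete, here is a small NumPy-only sketch using the x from the question. The einsum call is my stand-in for a batched matvec; I'm not claiming this is how ScaleMatvecDiag is implemented internally, only that it reproduces the same numbers.

```python
import numpy as np

scale = np.diag([-1.0, 2.0])             # dense 2x2 matrix built from scale_diag
x = np.array([[1.0, 2.0], [3.0, 4.0]])   # shape [2, 2]

# matmul: x is treated as ONE 2x2 matrix, so this is scale @ X.
matmul_result = scale @ x
# rows: [-1, -2] and [6, 8]  (what the question expected)

# matvec: x is treated as a BATCH of two 2-vectors, and scale is
# applied to each row separately (einsum is an assumption/stand-in).
matvec_result = np.einsum("km,bm->bk", scale, x)
# rows: [-1, 4] and [-3, 8]  (what b.forward(x) returned)

print(matmul_result)
print(matvec_result)
```

For a diagonal scale, applying it to each row just multiplies column j by scale_diag[j], which is why the result looks like X @ scale.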
Ignoring "batching":
matmul takes inputs of shape [k, m], [m, n] and outputs [k, n].
matvec takes inputs of shape [k, m], [m] and outputs [k].
Taking batching into account:
matmul takes inputs of shape [batch, k, m], [batch, m, n] and outputs [batch, k, n].
matvec takes inputs of shape [batch, k, m], [batch, m] and outputs [batch, k].
The right-hand sides of your examples are being interpreted as batches of vectors:
[2, 2] => batch of two 2d vectors
[3, 2] => batch of three 2d vectors
[2, 3] => batch of two 3d vectors
Only the batches of 2-vectors are admissible to a matvec with a 2x2 left-hand side (matrix).
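The same shape rule can be checked without TFP. This is only a sketch: batched_matvec is a hypothetical helper I made up for illustration (it is not a TFP or NumPy function), and its error message is mine, not the one TensorFlow prints.

```python
import numpy as np

def batched_matvec(matrix, vectors):
    """Hypothetical helper: apply a [k, m] matrix to a [batch, m] stack of vectors."""
    matrix = np.asarray(matrix, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # Each vector must have as many entries as the matrix has columns.
    if vectors.shape[-1] != matrix.shape[-1]:
        raise ValueError(
            f"vector length {vectors.shape[-1]} does not match "
            f"matrix columns {matrix.shape[-1]}"
        )
    return np.einsum("km,bm->bk", matrix, vectors)

scale = np.diag([-1.0, 2.0])
z = [[1., 2], [3, 4], [5, 6]]   # shape [3, 2]: batch of three 2d vectors
y = [[1., 2, 3], [4, 5, 6]]     # shape [2, 3]: batch of two 3d vectors

z_out = batched_matvec(scale, z)   # admissible, output shape (3, 2)

try:
    batched_matvec(scale, y)       # 3d vectors against a 2x2 matrix
    y_error = None
except ValueError as err:
    y_error = str(err)

print(z_out.shape)
print(y_error)
```

This mirrors what the question observed: z goes through with output shape (3, 2), while y is rejected because its vectors have length 3.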