My question is about the equation below, the cosine similarity of two vectors x and y under a linear transformation L:

sim(x, y) = <xL, yL> / (||xL|| ||yL||), where <xL, yL> = (xL)^T (yL)

The equation above is for a single pair of vectors. But if I have batches of vectors, e.g. my X and Y have shape (None, 32), then there is an issue.
Also remember that in a coding environment, each example inside the batch is already a row vector, i.e. already in transposed shape. My problem is that when we need to transpose a [None, 32] tensor, the code will not accept a transpose over the None dimension. So I solved it in the following way:
import tensorflow as tf

def Cosine_similarity(X, Y, feature_dim):
    # Transformation matrix L; note this samples a new random L each time
    # it is evaluated, since it is an initializer output, not a tf.Variable.
    L = tf.compat.v1.initializers.glorot_normal()(shape=[feature_dim, feature_dim])
    out1 = tf.matmul(X, L)  # X L, shape [batch, feature_dim]
    out2 = tf.matmul(Y, L)  # Y L, shape [batch, feature_dim]
    # Row-wise dot products <XL, YL>.
    out_numerator = tf.reduce_sum(tf.multiply(out1, out2), axis=1)
    # Row-wise norms ||XL|| and ||YL||.
    out3 = tf.reduce_sum(tf.multiply(out1, out1), axis=1)
    out3 = tf.sqrt(out3)
    out4 = tf.reduce_sum(tf.multiply(out2, out2), axis=1)
    out4 = tf.sqrt(out4)
    out_denominator = tf.multiply(out3, out4)
    final_out = tf.divide(out_numerator, out_denominator)
    return final_out
And this comes from the following identity:

<XA, YA> = (XA)^T (YA)
         = tf.reduce_sum(tf.multiply((X A), (Y A)), axis=1)
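That row-wise identity is easy to sanity-check (a minimal sketch, assuming TF 2.x eager mode; XA and YA here are random stand-ins for the transformed batches):

import numpy as np
import tensorflow as tf

# Stand-ins for X A and Y A, a batch of 2 examples with 3 features.
XA = tf.constant(np.random.rand(2, 3), dtype=tf.float32)
YA = tf.constant(np.random.rand(2, 3), dtype=tf.float32)

# Batched version: element-wise multiply, then sum over the feature axis.
batched = tf.reduce_sum(tf.multiply(XA, YA), axis=1)

# Per-example version: (xA)^T (yA) computed one row at a time.
per_row = tf.stack([tf.tensordot(XA[i], YA[i], axes=1) for i in range(2)])

np.testing.assert_allclose(batched.numpy(), per_row.numpy(), rtol=1e-5)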
So I just want to know: is this implementation right? Or can you correct me if I am missing something?
Not sure I understand your concern about the (None) dimension.

If I understand correctly, the cosine similarity between two identically shaped matrices X and Y ([batch, target_dim]) is just the matrix multiplication X * Y^T with some L2 normalization. Note X would be your out1 and Y would be your out2.
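Concretely, in plain numpy (a small standalone check of that claim, separate from the TF version below):

import numpy as np

rng = np.random.default_rng(0)
X = rng.random((3, 4))
Y = rng.random((3, 4))

# L2-normalize each row, then a single matmul gives all pair-wise cosines.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
sim = Xn @ Yn.T  # shape [3, 3]

# Entry (i, j) matches the scalar cosine-similarity definition.
i, j = 1, 2
expected = X[i] @ Y[j] / (np.linalg.norm(X[i]) * np.linalg.norm(Y[j]))
assert np.allclose(sim[i, j], expected)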
def Cosine_similarity(x, y, A):
  """Pair-wise cosine similarity.

  First `x` and `y` are transformed by `A`:
  `X = xA^T` with shape [batch, target_dim],
  `Y = yA^T` with shape [batch, target_dim].

  Args:
    x: shaped [batch, feature_dim].
    y: shaped [batch, feature_dim].
    A: shaped [target_dim, feature_dim]. Transformation matrix to project
      from `feature_dim` to `target_dim`.

  Returns:
    A cosine similarity matrix shaped [batch, batch]. The entry
    at (i, j) is the cosine similarity value between vector `X[i, :]` and
    `Y[j, :]` where `X`, `Y` are the transformed `x` and `y` by `A`
    respectively. In other words, the entry at (i, j) is the pair-wise
    cosine similarity value between the i-th example of `x` and the j-th
    example of `y`.
  """
  # Project into the target space: X = x A^T, Y = y A^T.
  x = tf.matmul(x, A, transpose_b=True)
  y = tf.matmul(y, A, transpose_b=True)
  # L2-normalize each row so the matmul below yields cosine similarities.
  x_norm = tf.nn.l2_normalize(x, axis=-1)
  y_norm = tf.nn.l2_normalize(y, axis=-1)
  y_norm_trans = tf.transpose(y_norm, [1, 0])
  sim = tf.matmul(x_norm, y_norm_trans)
  return sim
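A quick smoke test in TF 1.x graph mode (on TF 2.x, tf.placeholder and tf.Session live under tf.compat.v1):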
import numpy as np
import tensorflow as tf

feature_dim = 8
target_dim = 4
batch_size = 2

x = tf.placeholder(tf.float32, shape=(None, feature_dim))
y = tf.placeholder(tf.float32, shape=(None, feature_dim))
A = tf.placeholder(tf.float32, shape=(target_dim, feature_dim))
sim = Cosine_similarity(x, y, A)

with tf.Session() as sess:
  x_val, y_val, sim_val = sess.run([x, y, sim], feed_dict={
      x: np.ones((batch_size, feature_dim)),
      y: np.random.rand(batch_size, feature_dim),
      A: np.random.rand(target_dim, feature_dim)})
  print('x=\n', x_val)
  print('y=\n', y_val)
  print('sim=\n', sim_val)
Result:
x=
[[ 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1.]]
y=
[[ 0.01471654 0.76577073 0.97747731 0.06429122 0.91344446 0.47987637
0.09899797 0.773938 ]
[ 0.8555786 0.43403915 0.92445409 0.03393625 0.30154493 0.60895061
0.1233703 0.58597666]]
sim=
[[ 0.95917791 0.98181278]
[ 0.95917791 0.98181278]]