Implementing cosine similarity in TensorFlow

My question is about the following equation: [equation image not included]

The equation above is for a single pair of vectors. But if I have batches of vectors, for example X and Y each with shape (None, 32), there is an issue.

Also remember that, in a coding environment, each example inside the batch is already stored as a row, i.e. in transposed form. My problem is that when I need to transpose a [None, 32] tensor, the code does not accept the transpose because of the None dimension. So I solved it in the following way:

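import tensorflow as tf

def Cosine_similarity(X, Y, feature_dim):
  # Projection matrix, sampled from a Glorot-normal initializer on each call.
  L = tf.compat.v1.initializers.glorot_normal()(shape=[feature_dim, feature_dim])

  # Project both batches: [None, feature_dim] -> [None, feature_dim].
  out1 = tf.matmul(X, L)
  out2 = tf.matmul(Y, L)

  # Row-wise dot product of the projected vectors, shape [None].
  out_numerator = tf.reduce_sum(tf.multiply(out1, out2), axis=1)

  # L2 norm of each projected row of X.
  out3 = tf.reduce_sum(tf.multiply(out1, out1), axis=1)
  out3 = tf.sqrt(out3)

  # L2 norm of each projected row of Y.
  out4 = tf.reduce_sum(tf.multiply(out2, out2), axis=1)
  out4 = tf.sqrt(out4)

  out_denominator = tf.multiply(out3, out4)

  # One cosine similarity per example, shape [None].
  final_out = tf.divide(out_numerator, out_denominator)

  return final_out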

This comes from the following identity, applied to each row of the batch (A here is the matrix L in the code):

<XA, YA> = (XA)^T (YA)

         = tf.reduce_sum(tf.multiply(X A, Y A), axis=1)

So I just want to know whether this implementation is right. Please correct me if I am missing something.
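One way to sanity-check the row-wise formulation is a small NumPy sketch. It uses NumPy rather than TensorFlow purely for illustration, and the function name `rowwise_cosine`, the shapes, and the random test data are assumptions, not part of the code above. It compares the "multiply, then sum over axis 1" form against an explicit per-example dot product:

```python
import numpy as np

def rowwise_cosine(x, y, a):
    """Cosine similarity between matching rows of x @ a and y @ a."""
    xa = x @ a                       # [batch, feature_dim]
    ya = y @ a                       # [batch, feature_dim]
    num = np.sum(xa * ya, axis=1)    # row-wise dot product, no transpose needed
    den = np.linalg.norm(xa, axis=1) * np.linalg.norm(ya, axis=1)
    return num / den

# Hypothetical test data: a batch of 5 examples with 32 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 32))
y = rng.normal(size=(5, 32))
a = rng.normal(size=(32, 32))

sim = rowwise_cosine(x, y, a)

# Reference: loop over the examples and use an explicit vector dot product.
ref = np.array([
    np.dot(x[i] @ a, y[i] @ a)
    / (np.linalg.norm(x[i] @ a) * np.linalg.norm(y[i] @ a))
    for i in range(len(x))
])

print(np.allclose(sim, ref))
```

Because every operation reduces over the feature axis only, the batch axis never has to be transposed, so an unknown batch size does not get in the way.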


  • Not sure I understand your concern about the `None` dimension.

    If I understand correctly, the cosine similarity between two identically shaped matrices X and Y ([batch, target_dim]) is just the matrix product X * Y^T after L2-normalizing each row. Note that X would be your out1 and Y would be your out2.

    import numpy as np
    import tensorflow.compat.v1 as tf

    # Graph mode, so that placeholders and Session also work on TF 2.x.
    tf.disable_v2_behavior()

    def Cosine_similarity(x, y, A):
      """Pair-wise Cosine similarity.

      First `x` and `y` are transformed by A.
      `X = xA^T` with shape [batch, target_dim],
      `Y = yA^T` with shape [batch, target_dim].

      Args:
        x: shaped [batch, feature_dim].
        y: shaped [batch, feature_dim].
        A: shaped [target_dim, feature_dim]. Transformation matrix to project
          from `feature_dim` to `target_dim`.

      Returns:
        A cosine similarity matrix shaped [batch, batch]. The entry
        at (i, j) is the cosine similarity value between vector `X[i, :]` and
        `Y[j, :]` where `X`, `Y` are the transformed `x` and `y` by `A`
        respectively. In other words, the entry at (i, j) is the pair-wise
        cosine similarity value between the i-th example of `x` and the j-th
        example of `y`.
      """
      # Project both inputs to [batch, target_dim].
      x = tf.matmul(x, A, transpose_b=True)
      y = tf.matmul(y, A, transpose_b=True)
      # Scale every row to unit L2 norm.
      x_norm = tf.nn.l2_normalize(x, axis=-1)
      y_norm = tf.nn.l2_normalize(y, axis=-1)
      # [batch, target_dim] x [target_dim, batch] -> [batch, batch].
      y_norm_trans = tf.transpose(y_norm, [1, 0])
      sim = tf.matmul(x_norm, y_norm_trans)
      return sim

    feature_dim = 8
    target_dim = 4
    batch_size = 2
    x = tf.placeholder(tf.float32, shape=(None, feature_dim))
    y = tf.placeholder(tf.float32, shape=(None, feature_dim))
    A = tf.placeholder(tf.float32, shape=(target_dim, feature_dim))
    sim = Cosine_similarity(x, y, A)
    with tf.Session() as sess:
      x_val, y_val, sim_val = sess.run([x, y, sim], feed_dict={
          x: np.ones((batch_size, feature_dim)),
          y: np.random.rand(batch_size, feature_dim),
          A: np.random.rand(target_dim, feature_dim)})
      print('x=\n', x_val)
      print('y=\n', y_val)
      print('sim=\n', sim_val)


    x=
    [[ 1.  1.  1.  1.  1.  1.  1.  1.]
     [ 1.  1.  1.  1.  1.  1.  1.  1.]]
    y=
    [[ 0.01471654  0.76577073  0.97747731  0.06429122  0.91344446  0.47987637
       0.09899797  0.773938  ]
     [ 0.8555786   0.43403915  0.92445409  0.03393625  0.30154493  0.60895061
       0.1233703   0.58597666]]
    sim=
    [[ 0.95917791  0.98181278]
     [ 0.95917791  0.98181278]]
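
The pair-wise [batch, batch] matrix and the row-wise version from the question are closely related: the diagonal of the pair-wise matrix holds exactly the similarities of matched examples. The following is a rough NumPy sketch of that relationship, not the TensorFlow code itself; the helper name `pairwise_cosine` and the random inputs are assumptions chosen to mirror the shapes used above.

```python
import numpy as np

def pairwise_cosine(x, y, a):
    """[batch, batch] cosine similarities after projecting with a ([target_dim, feature_dim])."""
    xp = x @ a.T                                            # [batch, target_dim]
    yp = y @ a.T                                            # [batch, target_dim]
    xn = xp / np.linalg.norm(xp, axis=-1, keepdims=True)    # unit-length rows
    yn = yp / np.linalg.norm(yp, axis=-1, keepdims=True)
    return xn @ yn.T                                        # entry (i, j): row i of x vs row j of y

# Hypothetical inputs with the same shapes as in the example above.
rng = np.random.default_rng(0)
x = np.ones((2, 8))
y = rng.random((2, 8))
a = rng.random((4, 8))

sim = pairwise_cosine(x, y, a)

# Row-wise form from the question: only matched pairs (i, i).
xp, yp = x @ a.T, y @ a.T
rowwise = np.sum(xp * yp, axis=1) / (
    np.linalg.norm(xp, axis=1) * np.linalg.norm(yp, axis=1))

print(np.allclose(np.diag(sim), rowwise))
```

If only the similarity of each example of `x` with the corresponding example of `y` is needed, the row-wise form avoids computing the off-diagonal entries. Since both rows of `x` are identical here, both rows of `sim` are identical as well, which matches the pattern in the output above.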