Given an array of sentence embeddings (each a vector of length 512) with shape (1000000, 512), how do I calculate the cosine similarity of every one of the 1 million sentence embeddings against every other embedding in the array, ideally using TensorFlow so I can try to speed it up with a GPU?
You can compute the pairwise cosine distances (1 - cosine similarity) like this:
import numpy as np
import tensorflow as tf

X = np.random.uniform(0, 10, (100, 512)).astype('float32')
X = tf.constant(X)
def compute_cosine_distances(a, b):
    # L2-normalize each row so the dot products become cosine similarities
    normalize_a = tf.nn.l2_normalize(a, axis=1)
    normalize_b = tf.nn.l2_normalize(b, axis=1)
    # Pairwise cosine distance = 1 - similarity, shape (rows_of_a, rows_of_b)
    distance = 1 - tf.matmul(normalize_a, normalize_b, transpose_b=True)
    return distance
compute_cosine_distances(X, X)
which gives the same result as
from sklearn.metrics.pairwise import pairwise_distances
pairwise_distances(X.numpy(), metric='cosine')
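
Note that for the full case in the question, a 1,000,000 x 1,000,000 float32 result matrix alone would take roughly 4 TB, so it cannot be materialized in one go. A minimal sketch of one workaround, using the same normalize-then-matmul idea but processing the rows in chunks and keeping only the top-k most similar neighbours per row (the helper name, top_k and batch_size are illustrative choices, not part of the original answer):

import tensorflow as tf

def top_k_cosine_neighbours(X, top_k=10, batch_size=1024):
    # Normalize once so each chunk's matmul yields cosine similarities directly
    X_norm = tf.nn.l2_normalize(X, axis=1)
    all_values, all_indices = [], []
    n = int(tf.shape(X_norm)[0])
    for start in range(0, n, batch_size):
        chunk = X_norm[start:start + batch_size]            # (batch_size, 512)
        sims = tf.matmul(chunk, X_norm, transpose_b=True)   # (batch_size, n) similarities
        values, indices = tf.math.top_k(sims, k=top_k)      # keep the k most similar per row
        all_values.append(values)
        all_indices.append(indices)
    return tf.concat(all_values, axis=0), tf.concat(all_indices, axis=0)

similarities, indices = top_k_cosine_neighbours(X)

If you need distances instead of similarities, they are just 1 - similarities for the returned values.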