
What does the tf.nn.embedding_lookup function do?


tf.nn.embedding_lookup(params, ids, partition_strategy='mod', name=None)

I cannot understand what this function does. Is it like a lookup table, i.e., does it return the parameters corresponding to each id in ids?

For instance, in the skip-gram model, if we use tf.nn.embedding_lookup(embeddings, train_inputs), does it find the corresponding embedding for each train_input?


Solution

  • The embedding_lookup function retrieves rows of the params tensor. The behavior is similar to indexing into arrays in numpy, e.g.

    import numpy as np

    matrix = np.random.random([1024, 64])  # 1024 64-dimensional embeddings
    ids = np.array([0, 5, 17, 33])
    print(matrix[ids])  # prints a matrix of shape [4, 64]
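    The TensorFlow version behaves the same way. Here is a minimal sketch against the TF 1.x session-based API, which matches the partition_strategy signature in the question:

    import numpy as np
    import tensorflow as tf

    matrix = np.random.random([1024, 64]).astype(np.float32)
    params = tf.constant(matrix)
    ids = tf.constant([0, 5, 17, 33])
    looked_up = tf.nn.embedding_lookup(params, ids)

    with tf.Session() as sess:
        result = sess.run(looked_up)

    print(result.shape)                                 # (4, 64)
    print(np.allclose(result, matrix[[0, 5, 17, 33]]))  # True

    This is also exactly what happens in the skip-gram example: tf.nn.embedding_lookup(embeddings, train_inputs) returns one embedding row per id in train_inputs.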

    The params argument can also be a list of tensors, in which case the ids will be distributed among them. For example, given a list of 3 tensors of shape [2, 64], the default behavior is that they will represent the ids [0, 3], [1, 4] and [2, 5], as the sketch below demonstrates.
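    A small sketch of that round-robin assignment, using three constant shards filled with their shard index so the routing is visible in the output:

    import numpy as np
    import tensorflow as tf

    # Three shards of shape [2, 64]; together they hold ids 0..5.
    shards = [tf.constant(np.full([2, 64], i, dtype=np.float32))
              for i in range(3)]
    ids = tf.constant([0, 1, 2, 3, 4, 5])

    # Default 'mod' strategy: id i goes to shard i % 3, row i // 3.
    looked_up = tf.nn.embedding_lookup(shards, ids)

    with tf.Session() as sess:
        print(sess.run(looked_up)[:, 0])  # [0. 1. 2. 0. 1. 2.]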

    The partition_strategy argument controls how the ids are distributed across the list: 'mod' (the default) assigns them round-robin, while 'div' assigns them in contiguous blocks. Such partitioning is useful for larger-scale problems, where the embedding matrix might be too large to keep in one piece.
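    For contrast, here is the same constant-shard setup looked up with partition_strategy='div' (repeated in full so the sketch runs on its own):

    import numpy as np
    import tensorflow as tf

    shards = [tf.constant(np.full([2, 64], i, dtype=np.float32))
              for i in range(3)]
    ids = tf.constant([0, 1, 2, 3, 4, 5])

    # 'div' assigns contiguous blocks: shard 0 holds ids [0, 1],
    # shard 1 holds [2, 3], shard 2 holds [4, 5].
    looked_up = tf.nn.embedding_lookup(shards, ids,
                                       partition_strategy='div')

    with tf.Session() as sess:
        print(sess.run(looked_up)[:, 0])  # [0. 0. 1. 1. 2. 2.]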