Tags: nlp, tensorflow, word2vec, doc2vec

Embedding lookup from multiple embeddings in TensorFlow


When building a doc2vec model, multiple embeddings are needed: one embedding matrix for the word vectors and another for the documents themselves. The algorithm works much like a CBOW model, except that the document's embedding is also used for every window drawn from that document. So with a window of 5 words, we go through those 5 words as usual, but for each window we additionally include the document's own embedding vector so that it gets updated during training.
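
As a rough sketch (the sizes, variable names, and IDs below are made up for illustration), this means keeping two separate embedding matrices and doing a lookup into each one per training window:

    import tensorflow as tf

    vocab_size, num_docs, embed_dim = 10000, 500, 128  # assumed sizes

    # One embedding matrix for the words, another for the documents themselves.
    word_embeddings = tf.Variable(tf.random.uniform([vocab_size, embed_dim], -1.0, 1.0))
    doc_embeddings = tf.Variable(tf.random.uniform([num_docs, embed_dim], -1.0, 1.0))

    # For each training window, look up the context word vectors ...
    word_ids = tf.constant([[12, 7, 256, 3, 99]])  # (batch, window) of word indices
    context_vectors = tf.nn.embedding_lookup(word_embeddings, word_ids)  # (batch, window, embed_dim)

    # ... and also the vector of the document the window came from,
    # so the document embedding is trained alongside the word embeddings.
    doc_ids = tf.constant([42])  # (batch,) of document indices
    doc_vectors = tf.nn.embedding_lookup(doc_embeddings, doc_ids)  # (batch, embed_dim)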


Solution

  • Just concat them:

    input_tensor = tf.concat([wordembedding_tensor, documentembedding_tensor], axis=1)
    

    The concatenated tensor then serves as your input.
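
For example, a self-contained toy version of this concatenation (the shapes and batch size here are invented) could look like this:

    import tensorflow as tf

    # Pretend outputs of tf.nn.embedding_lookup on the two embedding matrices.
    context_vectors = tf.random.uniform([8, 5, 128])  # (batch, window, embed_dim)
    doc_vectors = tf.random.uniform([8, 128])         # (batch, embed_dim)

    # Flatten the window dimension, then join the document vector along the feature axis.
    flat_context = tf.reshape(context_vectors, [8, 5 * 128])
    input_tensor = tf.concat([flat_context, doc_vectors], axis=1)
    print(input_tensor.shape)  # (8, 768)

Averaging the context vectors instead of flattening them is another common variant; either way, the combined tensor is what feeds the prediction layer.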