
The embeddings using `layers[0].get_weights()[0]`


I am using an example to study embedding networks, where I set the vocabulary size to 200 while the training sample contains only about 20 different words. A vocab size of 200 means the embedding has room for 200 words, but effectively I am working with only 20 words (the words of my training sample): say word[0] to word[19]. So, after the embedding, vector[0] corresponds to word[0] and so on. But vector[20], vector[30], ... what do they match? I have no word[20] or word[30].
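
For reference, a minimal sketch of the setup I mean, assuming a standard Keras `Embedding` layer (the embedding width and input shape here are just illustrative):

```python
import numpy as np
from tensorflow import keras

vocab_size = 200        # declared vocabulary size
embedding_dim = 8       # illustrative embedding width

layer = keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim)
_ = layer(np.array([[0]]))          # call once so the weights get created

# One (200, 8) matrix: row i is the embedding vector for word index i,
# whether or not index i ever appears in the training data.
embeddings = layer.get_weights()[0]
print(embeddings.shape)             # (200, 8)
```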

Thanks in advance.


Solution

  • what do they match?

    Nothing. Until your training data actually contains words at those indices, the weights in those rows will stay at whatever they were initialized to, which is almost certainly random. If you attempt to treat them as words, they will have no English meaning.

    They might carry some meaning, in the sense that training embeddings creates a space in which vectors have meaning, but these random, untrained embeddings cannot be reliably translated back into English.
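
    You can verify this yourself. Here is a hedged sketch (the model architecture and data are made up for illustration): only the embedding rows for indices that appear in the training data receive gradients, so the remaining rows are identical before and after training.

    ```python
    import numpy as np
    from tensorflow import keras

    vocab_size = 200
    model = keras.Sequential([
        keras.layers.Embedding(input_dim=vocab_size, output_dim=8),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    # SGD makes the sparse-update behavior easy to see: it only
    # touches the embedding rows that actually received a gradient.
    model.compile(optimizer="sgd", loss="binary_crossentropy")

    # Training data uses only word indices 0..19, as in the question.
    x = np.random.randint(0, 20, size=(256, 5))
    y = np.random.randint(0, 2, size=(256, 1))

    before = model.layers[0].get_weights()[0].copy()
    model.fit(x, y, epochs=5, verbose=0)
    after = model.layers[0].get_weights()[0]

    # Rows 0..19 were updated by gradient descent...
    print(np.allclose(before[:20], after[:20]))   # typically False
    # ...while rows 20..199 never received gradients and are unchanged.
    print(np.allclose(before[20:], after[20:]))   # True
    ```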