python python-3.x nlp gensim word2vec

Layer size in gensim's word2vec


When I start training my word2vec model, I am presented with the warning:

consider setting layer size to a multiple of 4 for greater performance

That sounds neat, but I can't find any reference to a layer argument or similar in the documentation.

So how can I increase the layer size, and how can I determine a good value?


Solution

  • The layer size is simply the dimensionality of the word vectors, which is set with the size parameter (renamed vector_size in gensim 4.0 and later). The default value is 100, which is already a multiple of 4, so the warning suggests a different value was passed; 128, for example, is a multiple of 4. The best size depends on your training data and has to be determined empirically. In general, the more training data you have, the larger a size you can use.
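
As a rough sketch of the above, assuming gensim is installed: the snippet below rounds a requested dimension up to the next multiple of 4 and passes it to Word2Vec. The names round_up_to_multiple and train_word2vec are made up for illustration and are not part of gensim, and min_count=1 is only there so that a tiny toy corpus would not be filtered out.

```python
def round_up_to_multiple(n, multiple=4):
    """Return the smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def train_word2vec(sentences, requested_size=100):
    """Train a Word2Vec model whose vector dimension is a multiple of 4.

    Hypothetical wrapper for illustration only. `sentences` is a list of
    tokenised sentences, e.g. [["hello", "world"], ...].
    """
    # Imported lazily so the helper above can be used without gensim.
    import gensim
    from gensim.models import Word2Vec

    dim = round_up_to_multiple(requested_size)

    # gensim 4.0 renamed the `size` keyword argument to `vector_size`.
    major_version = int(gensim.__version__.split(".")[0])
    size_keyword = "vector_size" if major_version >= 4 else "size"

    # min_count=1 is an assumption for small demo corpora, not a recommendation.
    return Word2Vec(sentences, min_count=1, **{size_keyword: dim})
```

For example, train_word2vec(sentences, requested_size=126) would train with 128-dimensional vectors. To pick a good value empirically, you could train with a few candidate sizes and compare the resulting models on whatever downstream task you care about.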