I want to know the effect of the alpha value in gensim's word2vec and fastText word-embedding models. I know that alpha is the initial learning rate, and that its default value is 0.075, according to Radim's blog.
What if I change this to a somewhat higher value, e.g. 0.5 or 0.75? What would the effect be, and is changing it even allowed? I did change it to 0.5 and experimented on a large dataset with D = 200, window = 15, min_count = 5, iter = 10, and workers = 4, and the results are quite meaningful for the word2vec model. With the fastText model, however, the results are rather scattered: the nearest words are less related, and the similarity scores swing unpredictably between high and low.
Why do two popular models give such different results on the same data? Does the value of alpha play such a crucial role while building the model?
Any suggestions are appreciated.
The default starting alpha is 0.025 in gensim's Word2Vec implementation. In the stochastic gradient descent algorithm that adjusts the model, the effective alpha determines how strong a correction is made to the model after each training example is evaluated, and it decays linearly from its starting value (alpha) to a tiny final value (min_alpha) over the course of all training.
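That linear decay can be sketched in a few lines of plain Python. This is a simplified model of the schedule, not gensim's exact internals (which also step the rate epoch by epoch), but it shows how the effective rate interpolates between alpha and min_alpha:

```python
def effective_alpha(start_alpha, min_alpha, progress):
    """Linearly decayed learning rate.

    progress is the fraction of total training completed, in [0, 1]:
    0.0 at the first example, 1.0 at the last.
    """
    return start_alpha - (start_alpha - min_alpha) * progress

print(effective_alpha(0.025, 0.0001, 0.0))  # 0.025 -- the full starting rate
print(effective_alpha(0.025, 0.0001, 0.5))  # halfway between the two
print(effective_alpha(0.025, 0.0001, 1.0))  # ~min_alpha at the very end
```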
Most users won't need to adjust these parameters, or might only adjust them a little, after they have a reliable, repeatable way of assessing whether a change improves their model on their end tasks. (I've seen starting values of 0.05, or less commonly 0.1, but never as high as your reported 0.5.)