
fastText unsupervised model loss


I want to create an unsupervised fastText model for my text data, which is about 1 GB in size. I'm using the fastText command line tool to train the model.

./fasttext skipgram -input PlainText.txt -output FastText-PlainText- -dim 50 -epoch 50 

These are the arguments I used to create the word representations.

Read 207M words
Number of words:  501986
Number of labels: 0
Progress:  97.5% words/sec/thread:   87224 lr:  0.001260 avg.loss:  0.089536 ETA:   0h 4m 9s

In the output of the fastText command I see this avg.loss, and the learning rate has decreased from the default (0.05) to 0.001. I don't really understand what avg.loss means and why the learning rate drops.

  1. Should I increase the number of epochs to make fastText learn my data better?
  2. Can I use a different loss function to improve the loss? If yes, what kind of loss function would be better?
  3. How can I evaluate whether my fastText model has learned well or not?
  4. Just out of interest, can I use wordNgrams to make my model learn context better in unsupervised learning?

Solution

  • I can't answer all your questions in depth, but I'll try to give you some advice.

    • you can understand avg.loss better by reading this thread
    • the learning rate is updated according to the lrUpdateRate option (read this).
    • in general, increasing the number of epochs can improve learning. However, as you can read in this paper, the most popular language models use between 10 and 100 epochs.
    • for unsupervised models the default loss function is ns (negative sampling); you can also choose hs (hierarchical softmax) or softmax (see the example commands after this list). You can read more in the official tutorial.
    • if you want to learn more about the effects of the ws and wordNgrams parameters, you can read this answer.
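
Putting a few of these options together, here is a minimal sketch of how your original training run could be repeated with an explicit loss function, learning-rate update interval, and context window. The flags are standard fastText options; the output names are just placeholders I chose for illustration.

# hierarchical softmax loss; the default window size and lr update interval made explicit
./fasttext skipgram -input PlainText.txt -output FastText-PlainText-hs -dim 50 -epoch 50 -lr 0.05 -lrUpdateRate 100 -ws 5 -loss hs

# negative sampling (the unsupervised default), here with 10 negative samples per example
./fasttext skipgram -input PlainText.txt -output FastText-PlainText-ns -dim 50 -epoch 50 -loss ns -neg 10

hs tends to train faster on large vocabularies, while ns is the default for word-vector training; which one gives better vectors on your particular corpus is something you would have to check empirically.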