java nlp intervals topic-modeling mallet

Mallet Topic Modelling API - How do I choose the best optimization interval?


Sorry, I'm quite a beginner in the field of NLP. As the title says, what is the best optimization interval in the Mallet API? I was also wondering whether it depends on, or is related to, the number of iterations, the number of topics, the corpus, etc.


Solution

  • The optimization interval is the number of sampling iterations between hyperparameter updates. Values between 20 and 50 seem to work well, but I haven't done any systematic tests. One possible failure mode to look out for is that too many optimization rounds (that is, a very small interval relative to the number of iterations) could lead to instability, with the alpha hyperparameters going to zero.
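
To get a feel for how the interval relates to the number of iterations, here is a minimal sketch in plain Java that counts how many optimization rounds a given interval would produce. It is not Mallet code: the class `OptimizeSchedule` and the method `countRounds` are hypothetical names, and the schedule it models (no optimization until a burn-in period has passed, then one round every `optimizeInterval` iterations) is an assumption about how the sampler behaves, so check it against the Mallet version you use. In Mallet itself the interval is set with `ParallelTopicModel.setOptimizeInterval(int)` or the `--optimize-interval` command-line option.

```java
/**
 * Illustrative model only, not part of Mallet.
 * Assumes hyperparameters are optimized on every iteration that is
 * past the burn-in period and is a multiple of the optimization interval.
 */
final class OptimizeSchedule {

    /** Counts optimization rounds over iterations 1..numIterations. */
    static int countRounds(int numIterations, int burnIn, int optimizeInterval) {
        // An interval of 0 (or less) is treated as "optimization disabled".
        if (optimizeInterval <= 0) {
            return 0;
        }
        int rounds = 0;
        for (int iteration = 1; iteration <= numIterations; iteration++) {
            if (iteration > burnIn && iteration % optimizeInterval == 0) {
                rounds++;
            }
        }
        return rounds;
    }

    public static void main(String[] args) {
        // Example settings chosen for illustration, not recommendations.
        int numIterations = 1000;
        int burnIn = 200;
        for (int interval : new int[] {10, 20, 50, 100}) {
            System.out.printf("interval=%d -> %d optimization rounds%n",
                    interval, countRounds(numIterations, burnIn, interval));
        }
    }
}
```

Under these assumptions the number of rounds grows with the number of iterations and shrinks as the interval grows, so the same interval means more hyperparameter updates on a longer run. The number of topics and the corpus do not enter this count at all; whether they affect which interval works best is something the answer above leaves untested.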