The BERTopic model produced the topics below:
As you can see above, the model is fine-tuned to produce fewer outliers: the outlier topic '-1' has a count of only 3, so it appears last.
When visualizing the topics per class with
topic_model.visualize_topics_per_class(topics_per_class)
the interactive visual below is generated. However, it omits the 0th index, that is, Topic 0. The global topic representations displayed are 1, 2, 3, 4, 5, 6, -1.
Is BERTopic designed to always assume that the very first index is the outlier topic (-1) and to drop it blindly?
Are the generated topics always accessed by count, perhaps in descending order?
This issue was also posted on the BERTopic GitHub forum. According to the author's response, setting top_n_topics=None makes all the topics, including the 0th index, visible in the visualization:
topic_model.visualize_topics_per_class(topics_per_class, top_n_topics=None)
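
One possible explanation for this behavior, offered as an assumption and not as a statement about BERTopic's actual source code: the top-n selection works on a frequency table sorted by count in descending order and skips the first row on the expectation that it is the outlier topic. The sketch below is pure Python and does not call BERTopic. The names topic_counts, frequency_order and select_topics are hypothetical, and all counts are invented for illustration except the outlier count of 3 from the post.

```python
# Hypothetical topic counts. Only the outlier count (3) comes from the post;
# every other number is made up for illustration.
topic_counts = {0: 40, 1: 30, 2: 25, 3: 20, 4: 15, 5: 10, 6: 5, -1: 3}


def frequency_order(counts):
    """Return topic ids sorted by count, largest first."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [topic for topic, _ in ranked]


def select_topics(counts, top_n_topics=10):
    """Illustrative selection rule (an assumption, NOT BERTopic's real code)."""
    ordered = frequency_order(counts)
    if top_n_topics is None:
        # No limit requested: keep every topic, including the first row.
        return ordered
    # Assumed rule: skip the first row, expecting it to be the outlier -1,
    # then keep up to top_n_topics of the remaining rows.
    return ordered[1:top_n_topics + 1]


print(select_topics(topic_counts))                     # Topic 0 is skipped
print(select_topics(topic_counts, top_n_topics=None))  # Topic 0 is kept
```

Under this assumed rule, a model whose outlier count is small enough to sort last would lose Topic 0 instead of -1, which is consistent with the displayed topics 1, 2, 3, 4, 5, 6, -1, and passing top_n_topics=None bypasses the skip entirely.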