I have the same error as this thread: ValueError: cannot compute LDA over an empty collection (no terms), but the solution given there doesn't work for me.
I'm working on a notebook with Sklearn, and I've done an LDA and a NMF.
I'm now trying to do the same using Gensim: https://radimrehurek.com/gensim/auto_examples/tutorials/run_lda.htm
Here is the relevant piece of Python code from my notebook:
    dic = gensim.corpora.Dictionary(texts_lem)
    dic.filter_extremes(no_below=10, no_above=0.8)
    corpus = [dic.doc2bow(doc) for doc in texts_lem]
    model = gensim.models.LdaModel(
        corpus=corpus,
        id2word=dic.id2token,
        num_topics=10,
    )
I'm reusing the existing texts_lem list from another section of my notebook for the Gensim LDA. I'm following the guide: creating a dictionary, filtering extremes, creating a corpus and passing it to LdaModel().
Unfortunately, it doesn't work, and commenting out the filter_extremes line doesn't help (that was the answer in the other thread with the same error).
texts_lem is a list of lists of words, like the following:
[
['word', 'word', 'word', 'word'],
['word', 'word', 'word', 'word'],
['word', 'word', 'word', 'word'],
]
My error is :
ValueError: cannot compute LDA over an empty collection (no terms)
Many thanks for your help.
Just don't use id2token. Pass the dictionary itself as id2word.
Your model should be:
    model = gensim.models.LdaModel(
        corpus=corpus,
        id2word=dic,
        num_topics=10,
    )
Works fine. Who knows what's going on?
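One likely explanation, as far as I understand gensim's Dictionary: id2token is only a reverse mapping that gets filled lazily, on the first dic[some_id] lookup. Right after building the dictionary it is still an empty dict, so LdaModel sees zero terms and raises the "no terms" error. The sketch below is not gensim code. LazyDictionary and count_terms are hypothetical stand-ins that only mimic that behaviour so it can be run without gensim installed.

```python
# Simplified stand-in for gensim.corpora.Dictionary (hypothetical class,
# NOT gensim's real implementation). It only mimics the lazy id2token.
class LazyDictionary:
    def __init__(self, texts):
        self.token2id = {}
        self.id2token = {}  # reverse mapping, left empty until first lookup
        for doc in texts:
            for token in doc:
                self.token2id.setdefault(token, len(self.token2id))

    def __len__(self):
        return len(self.token2id)

    def keys(self):
        return list(self.token2id.values())

    def __getitem__(self, token_id):
        # Build the reverse mapping only when it is first needed.
        if len(self.id2token) != len(self.token2id):
            self.id2token = {i: t for t, i in self.token2id.items()}
        return self.id2token[token_id]


def count_terms(id2word):
    # Assumed shape of the check behind the error: an empty mapping
    # means zero terms, which is what triggers the ValueError.
    return 1 + max(id2word.keys()) if len(id2word) > 0 else 0


texts_lem = [["cat", "dog"], ["dog", "bird"]]
dic = LazyDictionary(texts_lem)

terms_from_id2token = count_terms(dic.id2token)  # id2token is still empty
terms_from_dic = count_terms(dic)                # the dictionary knows its ids
first_token = dic[0]                             # this lookup fills id2token
terms_after_lookup = count_terms(dic.id2token)

print(terms_from_id2token, terms_from_dic, first_token, terms_after_lookup)
```

With the real library, a quick way to check this is to print len(dic.id2token) right after creating the dictionary and compare it with len(dic). Passing id2word=dic avoids depending on whether the lazy mapping has been filled yet.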