AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names' -- Topic Modeling -- Latent Dirichlet Allocation

I'm trying to follow the example from the link below.

All the code up to this point works, but the code below does not work.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

vectorizer = CountVectorizer(
    min_df=3,                         # keep words that appear in at least 3 documents
    stop_words='english',             # remove stop words
    lowercase=True,                   # convert all words to lowercase
    token_pattern='[a-zA-Z0-9]{3,}',  # tokens of 3 or more alphanumeric characters
    max_features=5000,                # max number of unique words
)

data_matrix = vectorizer.fit_transform(df_clean['question_lemmatize_clean'])

lda_model = LatentDirichletAllocation(
    n_components=10,  # number of topics
    n_jobs=-1,        # use all available CPUs
)
lda_output = lda_model.fit_transform(data_matrix)

import pyLDAvis
import pyLDAvis.sklearn
pyLDAvis.enable_notebook()
pyLDAvis.sklearn.prepare(lda_model, data_matrix, vectorizer, mds='tsne')

When I run that code snippet, I get this error message.

AttributeError                            Traceback (most recent call last)
Cell In[83], line 29
     27 import pyLDAvis.sklearn
     28 pyLDAvis.enable_notebook()
---> 29 pyLDAvis.sklearn.prepare(lda_model, data_matrix, vectorizer, mds='tsne')

File ~\anaconda3\lib\site-packages\pyLDAvis\, in prepare(lda_model, dtm, vectorizer, **kwargs)
     62 def prepare(lda_model, dtm, vectorizer, **kwargs):
     63     """Create Prepared Data from sklearn's LatentDirichletAllocation and CountVectorizer.
     65     Parameters
     92     See `pyLDAvis.prepare` for **kwargs.
     93     """
---> 94     opts = fp.merge(_extract_data(lda_model, dtm, vectorizer), kwargs)
     95     return pyLDAvis.prepare(**opts)

File ~\anaconda3\lib\site-packages\pyLDAvis\, in _extract_data(lda_model, dtm, vectorizer)
     37 def _extract_data(lda_model, dtm, vectorizer):
---> 38     vocab = _get_vocab(vectorizer)
     39     doc_lengths = _get_doc_lengths(dtm)
     40     term_freqs = _get_term_freqs(dtm)

File ~\anaconda3\lib\site-packages\pyLDAvis\, in _get_vocab(vectorizer)
     19 def _get_vocab(vectorizer):
---> 20     return vectorizer.get_feature_names()

AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

I suspect one of the libraries is out of date, but I can't tell which, and searching for the error hasn't turned up anything that helps me debug it. Does anyone know what's wrong here?
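
A quick way to test a version suspicion like this is to print the installed versions of both libraries. This is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata); installed_version is a hypothetical helper name, not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Print the versions of the two libraries involved in the traceback.
for name in ("scikit-learn", "pyLDAvis"):
    print(name, installed_version(name))
```

If scikit-learn reports 1.2 or later, CountVectorizer no longer has get_feature_names(), so any code that still calls it will raise this AttributeError.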


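  • get_feature_names() was deprecated in scikit-learn 1.0 and removed in 1.2. Its replacement is get_feature_names_out(), which returns the output feature names of the transformation. Note that in your traceback the old method is called from inside pyLDAvis (_get_vocab), not from your own code, so the installed pyLDAvis predates that change. Upgrading pyLDAvis to a release that calls the new method, or using a scikit-learn version older than 1.2, should resolve it.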

    Link to the documentation: here
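
    If changing library versions is not an option, a commonly used stopgap is to look up the vocabulary through whichever method exists, or to expose the old name on the vectorizer instance before passing it to code that still calls get_feature_names(). The sketch below is an illustration under those assumptions, not an official fix: get_vocab, add_legacy_alias and demo are hypothetical helper names, and the sample documents are made up.

```python
def get_vocab(vectorizer):
    """Return the fitted vocabulary as a list, using whichever method exists."""
    if hasattr(vectorizer, "get_feature_names_out"):
        # scikit-learn >= 1.0: returns a NumPy array of strings
        return list(vectorizer.get_feature_names_out())
    # Older scikit-learn: returns a plain list
    return list(vectorizer.get_feature_names())


def add_legacy_alias(vectorizer):
    """Workaround (assumption, not an official API): expose the old method
    name on this one instance so older callers keep working."""
    if not hasattr(vectorizer, "get_feature_names"):
        vectorizer.get_feature_names = vectorizer.get_feature_names_out
    return vectorizer


def demo():
    # Imported here so the helpers above do not depend on scikit-learn.
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["topic models group words", "words form topics", "models learn topics"]
    vectorizer = CountVectorizer()
    vectorizer.fit(docs)

    print(get_vocab(vectorizer))

    # After aliasing, the old method name works again on this instance.
    add_legacy_alias(vectorizer)
    print(list(vectorizer.get_feature_names()))
```

    Calling demo() requires scikit-learn to be installed. In the question's code, add_legacy_alias(vectorizer) would go just before the pyLDAvis.sklearn.prepare(...) call; the alias only affects that one vectorizer object and leaves the library itself untouched.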