
TfidfTransformer and stop words


I am importing TfidfTransformer from sklearn and trying to pass the stop_words argument, but it raises an error.

from sklearn.feature_extraction.text import TfidfTransformer
tfidf = TfidfTransformer(stop_words='english')


TypeError                                 Traceback (most recent call last)
<ipython-input-16-1315a209c082> in <module>
      1 from sklearn.feature_extraction.text import TfidfTransformer
----> 2 tfidf = TfidfTransformer(stop_words='english')

TypeError: __init__() got an unexpected keyword argument 'stop_words'

How do I solve this error?


Solution

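  • I think you intended to use TfidfVectorizer, which accepts the stop_words parameter. TfidfTransformer only reweights an existing term-count matrix and never sees the raw text, so it has no such parameter. See the TfidfVectorizer documentation for details.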

    Example:

    from sklearn.feature_extraction.text import TfidfVectorizer
    corpus = [
        'This is the first document.',
        'This document is the second document.',
        'And this is the third one.',
        'Is this the first document?',
    ]
    vectorizer = TfidfVectorizer(stop_words='english')
    X = vectorizer.fit_transform(corpus)
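
If you still want to keep TfidfTransformer, stop-word removal has to happen one step earlier, when the text is tokenized and counted. The following is a minimal sketch, assuming default settings everywhere else: CountVectorizer takes stop_words and builds the count matrix, and TfidfTransformer then reweights it. The variable names (counts, X_two_step, X_one_step, same) are only illustrative.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]

# Stop words are dropped while tokenizing and counting the raw text.
counts = CountVectorizer(stop_words='english').fit_transform(corpus)

# TfidfTransformer only reweights the count matrix it is given.
X_two_step = TfidfTransformer().fit_transform(counts)

# With default settings, the one-step TfidfVectorizer should give the same matrix.
X_one_step = TfidfVectorizer(stop_words='english').fit_transform(corpus)
same = np.allclose(X_two_step.toarray(), X_one_step.toarray())
```

The two-step form is mainly useful when you already have a count matrix or want to reuse the counts elsewhere; otherwise TfidfVectorizer is the shorter route.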