Tags: python, nlp, open-source, pypi

Is there a Python library for finding common words such as "the", "is", "was", "am", and other similar words?


I want to exclude such words when processing a text file. Currently, my code counts the occurrences of every word in the file, but I want to skip these unwanted words and count only the frequency of the important ones. The file contains too many important words for me to list them all, so a preexisting Python library that identifies the common words would be helpful.


Solution

  • Such words are called stop words. You can remove them using the NLTK library, which includes an English stop word list:

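    import nltk
    from nltk.corpus import stopwords
    nltk.download('stopwords')  # one-time download of the stop word corpus
    stop_words = set(stopwords.words('english'))  # build the set once for fast lookups
    # word_list is your existing list of words read from the text file
    filtered_words = [word for word in word_list if word.lower() not in stop_words]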
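
Since the question is about counting the remaining words, here is a minimal sketch of how the filtering could be combined with a frequency count. To keep it runnable without downloading anything, it uses a small hand-written `STOP_WORDS` set, which is an assumption for illustration only; in practice you would likely pass `set(stopwords.words('english'))` instead. The function name `count_important_words` and the sample sentence are also hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word set for illustration only;
# in practice, pass set(stopwords.words('english')) from NLTK instead.
STOP_WORDS = {"the", "is", "was", "am", "a", "an", "and", "of", "to", "in"}

def count_important_words(text, stop_words=STOP_WORDS):
    """Count word frequencies in text, skipping stop words (case-insensitive)."""
    # Lowercase first so "The" and "the" are treated as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stop_words)

sample = "The cat was in the garden and the cat is happy."
counts = count_important_words(sample)
print(counts.most_common(3))
```

For a real file, the text could be read with something like `open(path, encoding="utf-8").read()` and passed to the same function.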