Tags: python, spacy, lemmatization

How to solve 'str' object has no attribute 'lemma_' using spaCy?


I tried to lemmatize the text in my DataFrame using spaCy in Python. The code I used is below:

# import spaCy's language model
nlp = spacy.load("en_core_web_sm")

# function to lemmatize text
def lemmatization(texts):
    output = []
    for i in texts:
        lem = [str(token).lemma_ for token in nlp(i) or str(token) in ["-PRON-"]]
        output.append(' '.join(lem))
    return output

train['clean_tweet'] = lemmatization(train['clean_tweet'])
test['clean_tweet'] = lemmatization(test['clean_tweet'])

It turns out I get an error that says:

'str' object has no attribute 'lemma_'

How can I resolve this?


Solution

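  • Calling str(token) converts the spaCy Token into a plain Python string, and a string has no lemma_ attribute. Read lemma_ from the token itself instead:

    string_ = "I am will be playing football tommorrow"  # dummy string
    obj = nlp(string_)  # nlp is the pipeline from spacy.load("en_core_web_sm")
    lemmatize_token = [x.lemma_ for x in obj]  # one lemma per token

    print(lemmatize_token)
    # Output:
    # ['I', 'be', 'will', 'be', 'play', 'football', 'tommorrow']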
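
Applied to the function from the question, the same fix could look like the sketch below. This is one possible rewrite, not the original author's code: the name lemmatize_texts and the choice to pass the pipeline in as the nlp argument are assumptions made for illustration. The "-PRON-" check is a guess at what the stray `or str(token) in ["-PRON-"]` fragment was meant to do. Older spaCy releases (v2) return "-PRON-" as the lemma of pronouns, so the sketch falls back to the original word in that case.

```python
def lemmatize_texts(texts, nlp):
    """Return each text with every token replaced by its lemma.

    `nlp` is assumed to be a loaded spaCy pipeline, for example the
    result of spacy.load("en_core_web_sm"). It is passed in as an
    argument so the function does not rely on a global variable.
    """
    output = []
    for text in texts:
        lemmas = []
        for token in nlp(text):
            # Read lemma_ from the Token itself, not from str(token).
            lemma = token.lemma_
            # Assumption: spaCy v2 yields "-PRON-" for pronouns;
            # keep the original word instead of that placeholder.
            if lemma == "-PRON-":
                lemma = token.text
            lemmas.append(lemma)
        output.append(" ".join(lemmas))
    return output
```

With the pipeline and DataFrames from the question, the call would then be train['clean_tweet'] = lemmatize_texts(train['clean_tweet'], nlp), and likewise for test.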