I tried to lemmatize the text in my DataFrame using spaCy
in Python. This is the code I used:
# import spaCy's language model
import spacy
nlp = spacy.load("en_core_web_sm")
# function to lemmatize text
def lemmatization(texts):
    output = []
    for i in texts:
        lem = [str(token).lemma_ for token in nlp(i) or str(token) in ["-PRON-"]]
        output.append(' '.join(lem))
    return output
train['clean_tweet'] = lemmatization(train['clean_tweet'])
test['clean_tweet'] = lemmatization(test['clean_tweet'])
However, I get this error:
'str' object has no attribute 'lemma_'
How can I resolve this?
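Read lemma_ from the spaCy Token objects directly instead of converting them to strings first. For example:
string_ = "I am will be playing football tommorrow" # dummy string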
obj = nlp(string_)
lemmatize_token = [x.lemma_ for x in obj]
print(lemmatize_token)
['I', 'be', 'will', 'be', 'play', 'football', 'tommorrow']
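Applied to the function in the question, the same idea might look like the sketch below: the str() call is dropped so that lemma_ is read from each Token. Passing nlp in as a second parameter is a choice made here for illustration, not something from the original code. The "-PRON-" handling is a guess at what the original condition was meant to do: spaCy 2.x returns that placeholder as the lemma of pronouns, and the sketch assumes the original word should be kept in that case.

```python
def lemmatization(texts, nlp):
    """Return one space-joined string of lemmas per input text.

    `nlp` is assumed to be a loaded spaCy pipeline, such as the one
    created with spacy.load("en_core_web_sm") in the question.
    """
    output = []
    for text in texts:
        lemmas = []
        for token in nlp(text):
            # Read lemma_ from the Token itself; str(token) is a plain
            # str and has no lemma_ attribute.
            lemma = token.lemma_
            # Assumption: keep the original word where spaCy 2.x
            # returns the "-PRON-" placeholder for a pronoun.
            if lemma == "-PRON-":
                lemma = token.text
            lemmas.append(lemma)
        output.append(" ".join(lemmas))
    return output
```

With the question's data this would be called as train['clean_tweet'] = lemmatization(train['clean_tweet'], nlp), and likewise for test.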