I have a seemingly simple question for Python wizards (I am a total newbie, so I have no idea how simple or complex it really is).
I have a list of verbs in a DataFrame that looks like this:
id verb
15 believe
64 start
90 believe
I want to lemmatize it. The problem is that most lemmatizers expect full sentences, so they can infer each word's part of speech from context. My data has no such context, but it should not need any: every entry is a verb, so I only need verb lemmas.
Would you have any ideas about how to go about lemmatizing this verb list? Many thanks in advance for considering my question!
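If you are asking how to apply a function over a pandas DataFrame column, you can use Series.map. With NLTK's WordNetLemmatizer, passing pos="v" makes it treat every word as a verb, so no sentence context is needed: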
import pandas as pd
from nltk.stem import WordNetLemmatizer
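data = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "verb": ["believe", "start", "believed", "starting"],
})
# https://www.nltk.org/_modules/nltk/stem/wordnet.html
# Needs the WordNet data; run nltk.download("wordnet") once if it is missing.
wnl = WordNetLemmatizer()
# pos="v" lemmatizes each word as a verb (the default is pos="n", noun).
data["verb"] = data["verb"].map(lambda word: wnl.lemmatize(word, pos="v"))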
print(data)
Output:
   id     verb
0   1  believe
1   2    start
2   3  believe
3   4    start
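If the column is long and the same verbs repeat often (as "believe" does in the question), one possible refinement is to lemmatize each distinct verb only once and map the results back onto the column. The sketch below is only an illustration under that assumption: lemmatize_unique and wordnet_verb_lemmatizer are hypothetical helper names, not part of pandas or NLTK, and the lemmatizer is passed in as a plain function so that any other lemmatizer could be swapped in.

```python
from typing import Callable

import pandas as pd


def lemmatize_unique(verbs: pd.Series, lemmatize: Callable[[str], str]) -> pd.Series:
    """Lemmatize each distinct verb once, then map the lemmas back onto the column.

    Hypothetical helper for illustration; missing values stay missing.
    """
    lookup = {word: lemmatize(word) for word in verbs.dropna().unique()}
    return verbs.map(lookup)


def wordnet_verb_lemmatizer() -> Callable[[str], str]:
    """Build a verb-only lemmatizer from NLTK's WordNetLemmatizer.

    NLTK is imported here, not at module level, so lemmatize_unique
    can be used with a different lemmatizer without NLTK installed.
    """
    import nltk
    from nltk.stem import WordNetLemmatizer

    nltk.download("wordnet", quiet=True)  # fetches the WordNet data if needed
    wnl = WordNetLemmatizer()
    return lambda word: wnl.lemmatize(word, pos="v")
```

With the DataFrame from the answer, this could be used as data["verb"] = lemmatize_unique(data["verb"], wordnet_verb_lemmatizer()). Whether it is actually faster than the plain map depends on how many duplicates the column contains.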