I have a spaCy doc
that I would like to lemmatize.
For example:
import spacy
nlp = spacy.load('en_core_web_lg')
my_str = 'Python is the greatest language in the world'
doc = nlp(my_str)
How can I convert every token in the doc
to its lemma?
Each token has a number of attributes, and you can iterate through the doc to access them.
For example: [token.lemma_ for token in doc]
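If you want to reconstruct the sentence, you could use: ' '.join([token.lemma_ for token in doc])
Note that joining on a single space also puts a space before punctuation tokens, so the result may not match the original spacing exactly.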
For a full list of token attributes see: https://spacy.io/api/token#attributes
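If it helps, here is a minimal sketch that wraps the two one-liners above in small helpers. The function names (lemmas, lemmatized_text, demo) are made up for illustration and are not part of spaCy. The second helper assumes you want to keep the original spacing, so it uses token.whitespace_ (the trailing space of each token, if any) instead of joining on ' '.

```python
def lemmas(doc):
    """Return the lemma of every token in a spaCy Doc as a list of strings."""
    return [token.lemma_ for token in doc]


def lemmatized_text(doc):
    """Rebuild the text from lemmas, keeping each token's trailing whitespace.

    token.whitespace_ is ' ' or '' for each token, so punctuation stays
    attached to the preceding word instead of being padded with a space.
    """
    return "".join(token.lemma_ + token.whitespace_ for token in doc)


def demo():
    # Imported here so the helpers above do not require a model to be loaded.
    import spacy

    nlp = spacy.load("en_core_web_lg")  # same model as in the question
    doc = nlp("Python is the greatest language in the world")
    print(lemmas(doc))
    print(lemmatized_text(doc))
```

Calling demo() prints the list of lemmas and then the rebuilt sentence. The exact lemmas you get can vary with the spaCy version and the model you load.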