python nlp spacy lemmatization

Is it possible to get a list of words given the lemma in spaCy?


I am trying to fix grammatical gender in French text. Is there a way to get a list of all word forms that share a given lemma, and is it possible to do a lookup in such a list?


Solution

  • Try:

    # Relies on a spaCy release that still exposes a LOOKUP table
    # (word form -> lemma) in the language module.
    import spacy.lang.en

    lemma_lookup = spacy.lang.en.LOOKUP

    # Invert the table: lemma -> list of word forms.
    reverse_lemma_lookup = {}
    for word, lemma in lemma_lookup.items():
        reverse_lemma_lookup.setdefault(lemma, []).append(word)

    print(reverse_lemma_lookup["be"])
    # ["'m", 'am', 'are', 'arst', 'been', 'being', 'is', 'm', 'was', 'wass', 'were']
    

    For French, changing spacy.lang.en.LOOKUP to spacy.lang.fr.LOOKUP should presumably work the same way.
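
To illustrate the lookup part of the question, here is a minimal sketch that does not depend on spaCy at all. The table `form_to_lemma` is a small hypothetical stand-in for a real French form-to-lemma table, and the helper names `build_reverse_lookup`, `forms_of` and `is_form_of` are made up for this example. It inverts the table once and then answers "which forms belong to this lemma?" and "is this word a form of that lemma?".

```python
from collections import defaultdict

# Hypothetical stand-in for a form -> lemma table; in practice this would
# come from the French language data rather than being written by hand.
form_to_lemma = {
    "petit": "petit",
    "petite": "petit",
    "petits": "petit",
    "petites": "petit",
    "grand": "grand",
    "grande": "grand",
    "grands": "grand",
    "grandes": "grand",
}


def build_reverse_lookup(table):
    """Group every word form under its lemma."""
    reverse = defaultdict(set)
    for form, lemma in table.items():
        reverse[lemma].add(form)
    # Convert to a plain dict of sorted lists for predictable output.
    return {lemma: sorted(forms) for lemma, forms in reverse.items()}


lemma_to_forms = build_reverse_lookup(form_to_lemma)


def forms_of(lemma):
    """Return all known forms of a lemma, or an empty list if unknown."""
    return lemma_to_forms.get(lemma, [])


def is_form_of(word, lemma):
    """Check whether a word is a known form of the given lemma."""
    return word in forms_of(lemma)


print(forms_of("petit"))             # ['petit', 'petite', 'petites', 'petits']
print(is_form_of("grande", "grand"))  # True
```

Note that a table like this only groups forms by lemma; it carries no gender or number labels, so choosing the correct feminine or masculine form would still need morphological information from elsewhere.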