nlp, spacy, pos-tagger

POS tagging a single word in spaCy


The spaCy POS tagger is usually used on entire sentences. Is there a way to efficiently apply unigram POS tagging to a single word (or a list of single words)?

Something like this:

words = ["apple", "eat", "good"]
tags = get_tags(words) 
print(tags)
> ["NNP", "VB", "JJ"]

Thanks.


Solution

  • English unigrams are often hard to tag well, so think about why you want to do this and what you expect the output to be. (Why is the POS of "apple" in your example NNP? What's the POS of "can"?)

    spaCy isn't really intended for this kind of task, but if you want to use it, one efficient way is:

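    import spacy

    # 'en' is a spaCy 2.x shortcut link; spaCy 3.x needs a full package name such as 'en_core_web_sm'
    nlp = spacy.load('en')

    words = ["apple", "eat", "good"]

    # disable everything except the tagger
    # (in spaCy 3.x trained pipelines the tagger typically also depends on 'tok2vec', so keep that enabled there)
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "tagger"]
    nlp.disable_pipes(*other_pipes)

    # use nlp.pipe() instead of nlp() to process multiple texts more efficiently
    for doc in nlp.pipe(words):
        # an empty string yields an empty doc, so check before indexing
        if len(doc) > 0:
            print(doc[0].text, doc[0].tag_)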
    

    See the documentation for nlp.pipe(): https://spacy.io/api/language#pipe
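
If you want something shaped like the get_tags(words) call from the question, a thin wrapper around nlp.pipe() might look like the sketch below. Note the assumptions: get_tags is the hypothetical name from the question, not a spaCy function, and nlp is assumed to be an already loaded pipeline with a working tagger that you pass in yourself.

```python
from typing import Iterable, List


def get_tags(nlp, words: Iterable[str]) -> List[str]:
    """Return one fine-grained tag per input string.

    `nlp` is assumed to be a loaded spaCy pipeline (or anything with a
    compatible .pipe() method). Only the first token of each Doc is used,
    so multi-token inputs are effectively truncated to their first token.
    """
    tags = []
    # nlp.pipe() processes the strings as a stream and yields Docs in input order
    for doc in nlp.pipe(words):
        # an empty string produces an empty Doc; use "" as a placeholder tag
        tags.append(doc[0].tag_ if len(doc) > 0 else "")
    return tags
```

With spaCy 3.x and the en_core_web_sm package installed (both assumptions about your setup), you could call it as get_tags(spacy.load("en_core_web_sm", exclude=["parser", "ner", "lemmatizer"]), words). The tags you get back for isolated words may still differ from what you expect, for the reason given at the top of this answer.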