spaCy POS tagger is usally used on entire sentences. Is there a way to efficiently apply a unigram POS tagging to a single word (or a list of single words)?
Something like this:
words = ["apple", "eat", good"]
tags = get_tags(words)
print(tags)
> ["NNP", "VB", "JJ"]
Thanks.
English unigrams are often hard to tag well, so think about why you want to do this and what you expect the output to be. (Why is the POS of apple
in your example NNP
? What's the POS of can
?)
spacy isn't really intended for this kind of task, but if you want to use spacy, one efficient way to do it is:
import spacy
nlp = spacy.load('en')
# disable everything except the tagger
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "tagger"]
nlp.disable_pipes(*other_pipes)
# use nlp.pipe() instead of nlp() to process multiple texts more efficiently
for doc in nlp.pipe(words):
if len(doc) > 0:
print(doc[0].text, doc[0].tag_)
See the documentation for nlp.pipe()
: https://spacy.io/api/language#pipe