I am trying to get the part-of-speech (POS) tags for the sentence
dragon flies to rescue the princess
using the code below:
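import spacy

# Load the medium English pipeline (must be downloaded beforehand)
nlp = spacy.load("en_core_web_md")
doc = nlp("dragon flies to rescue the princess")
for token in doc:
    # Print each token next to its coarse-grained POS tag
    print(f'{token.text:{12}} {token.pos_:{12}}')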
Output of the above code:
dragon NOUN
flies NOUN
to PART
rescue VERB
the DET
princess NOUN
Here, 'flies' is tagged as NOUN although it is a VERB. Is this because spaCy is treating 'dragon flies' as a single word?
What should I do if I want to get "VERB" as the POS for 'flies'?
When running your example, there are two things to note:
When I run the corrected sentence "The dragon flies to rescue the princess.", the output is
The DET
dragon NOUN
flies VERB
to PART
rescue VERB
the DET
princess NOUN
. PUNCT
and therefore exactly what we expected. Should your dataset contain sentences with such syntactic errors, the "easiest" solution would probably be to hand-annotate some of the examples and use spaCy's training functionality; details can be found in spaCy's training documentation. Even then, you are not guaranteed significantly better results unless you annotate a lot of data and can be sure that most of the samples have "similar-looking" errors.
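As a rough illustration of the hand-annotation route, here is a minimal sketch that packs one manually tagged sentence into spaCy's binary training format. It assumes spaCy v3; the `ANNOTATED` list, the `build_docbin` helper, and the `train.spacy` file name are made up for this example, and the fine-grained tags are my own annotation of the sentence, not output from any model.

```python
import spacy
from spacy.tokens import Doc, DocBin

# Hypothetical hand-annotated data: one entry per sentence, with the
# fine-grained tag and coarse POS we WANT the model to learn.
ANNOTATED = [
    {
        "words": ["dragon", "flies", "to", "rescue", "the", "princess"],
        "tags": ["NN", "VBZ", "TO", "VB", "DT", "NN"],
        "pos": ["NOUN", "VERB", "PART", "VERB", "DET", "NOUN"],
    },
]


def build_docbin(samples, vocab):
    """Turn the annotated samples into a DocBin of gold-standard Docs."""
    doc_bin = DocBin()
    for sample in samples:
        doc = Doc(
            vocab,
            words=sample["words"],
            tags=sample["tags"],
            pos=sample["pos"],
        )
        doc_bin.add(doc)
    return doc_bin


# A blank pipeline is enough here: only its vocab is needed, no trained model.
nlp = spacy.blank("en")
doc_bin = build_docbin(ANNOTATED, nlp.vocab)

if __name__ == "__main__":
    # Assumed output path; point your training config at this file.
    doc_bin.to_disk("train.spacy")
```

The resulting file could then be passed to `python -m spacy train` as the training corpus, together with a config file and a separate development set. A single sentence is of course far too little; as noted above, you would likely need many such examples before the tagger changes its behaviour.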