stanford-nlp, pos-tagger

Stanford Parser - POS tagging - pronoun as noun?


I noticed that the Stanford Parser tags "anyone" and "anybody" as nouns, whereas they are pronouns. I tried putting "anyone" in different contexts and got the same result. Can anyone tell me whether they have run into this, and whether there is a way to correct it (perhaps through some setting)?

Thank you!


Solution

  • This is more of a linguistics question than a coding question. The analysis of these words is complex. Many linguists would describe them as a fused determiner and noun, which is fairly transparently what they are, at least historically.

    In general, we currently use the Penn Treebank standards for tokenization, part-of-speech tags, and phrasal labels for English. The Penn Treebank annotates these words as nouns (NN), rightly or wrongly, so that is what our tools currently do.

    However, you might be pleased to know that under the Universal Dependencies guidelines, these words are indeed pronouns (PRON). We are sure to be moving towards greater use of Universal Dependencies in future releases.
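
If you need pronoun labels before that happens, one workaround is to relabel the affected words after tagging. The following is only a minimal sketch of that idea, not a setting or API of the Stanford tools: the function name `relabel_indefinite_pronouns`, the word list, and the choice of `PRON` as the replacement label are all assumptions made for illustration. It expects the tagger output to have already been converted into `(word, tag)` pairs.

```python
# Hypothetical post-processing step; not part of the Stanford Parser or tagger.
# Assumed word list: single-token indefinite pronouns that receive the NN tag.
INDEFINITE_PRONOUNS = {
    "anyone", "anybody", "anything",
    "someone", "somebody", "something",
    "everyone", "everybody", "everything",
    "nobody", "nothing",
}


def relabel_indefinite_pronouns(tagged, pronoun_tag="PRON"):
    """Return a new list of (word, tag) pairs in which listed words
    tagged NN are given pronoun_tag instead; other pairs are unchanged."""
    return [
        (word, pronoun_tag
         if tag == "NN" and word.lower() in INDEFINITE_PRONOUNS
         else tag)
        for word, tag in tagged
    ]


# Example input written by hand in Penn Treebank style.
tagged = [("Can", "MD"), ("anyone", "NN"), ("help", "VB"), ("?", ".")]
print(relabel_indefinite_pronouns(tagged))
```

Note that the result mixes Penn Treebank tags with a Universal Dependencies style label, so anything downstream that expects a pure Penn Treebank tag set (the parser itself, for example) should still be given the original tags.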