From the spaCy documentation:
For a list of the fine-grained and coarse-grained part-of-speech tags assigned by spaCy’s models across different languages, see the label schemes documented in the models directory.
I assume this is referring to the part-of-speech tags, e.g. VERB, NOUN, NUM, etc., and that this list will be different for each language.
Is this a correct assumption?
I followed the link in the documentation to the models directory, but could not find a list of the valid POS tags for each language.
https://spacy.io/usage/linguistic-features#pos-tagging
Answer
Thanks to @polm23 for the answer. Here's a screenshot with the navigation, in case anyone else can't find it.
Look for the "label scheme" on the page for any individual language.
The VERB/NOUN-style tags, which go in the .pos attribute, are the coarse-grained tags from Universal Dependencies, and are mostly the same between languages. The fine-grained tags, for the .tag attribute, can be anything and are unique to each language as far as I'm aware.
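If you'd rather check the tags from code than from the website, something like the sketch below may help. It assumes spaCy v3 and that the en_core_web_sm pipeline is installed (swap in whichever pipeline you actually use). The helper fine_tags_by_pos is a name I made up for illustration and is not part of spaCy. Note that the readable strings live in .pos_ and .tag_ (with the trailing underscore); .pos and .tag hold integer IDs.

```python
from collections import defaultdict


def fine_tags_by_pos(doc):
    """Group the fine-grained .tag_ values in `doc` under their coarse .pos_ value.

    Hypothetical helper (not a spaCy function): it only needs an iterable of
    objects that expose .pos_ and .tag_, such as a spaCy Doc.
    """
    groups = defaultdict(set)
    for token in doc:
        groups[token.pos_].add(token.tag_)
    return {pos: sorted(tags) for pos, tags in groups.items()}


if __name__ == "__main__":
    try:
        import spacy

        # Assumption: this pipeline is installed, e.g. via
        #   python -m spacy download en_core_web_sm
        nlp = spacy.load("en_core_web_sm")
    except (ImportError, OSError):
        print("Install spaCy and the en_core_web_sm pipeline to run this sketch.")
    else:
        # Fine-grained labels the tagger component was trained on
        # (these are the values that end up in .tag_).
        for label in nlp.get_pipe("tagger").labels:
            print(label, "-", spacy.explain(label))

        # Coarse (.pos_) vs fine-grained (.tag_) tags on a sample sentence.
        doc = nlp("She sold three old books yesterday.")
        for pos, tags in fine_tags_by_pos(doc).items():
            print(pos, "->", tags)
```

Running the same thing against a pipeline for another language should show the same kind of .pos_ values but a different set of tagger labels, which matches what the "label scheme" section on each model page lists.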