nlp spacy linguistics

Using spaCy models to find modal verbs in fr, es, and ru


I am using spaCy models to find modal verbs (MD) in the following languages:

en
de
fr
es
ru

From tag_map.py for en and de, it is clear that "VerbType": "mod" marks a modal verb. But tag_map.py for fr, es, and ru has no such property. How can I identify modal verbs in these three languages (which properties should I focus on)? Also, is there a generic way to find modal verbs in any language that spaCy releases in the future, say Greek?

Note: I am not looking for high-level tags; I am looking for low-level tags. In spaCy terminology, I prefer the token.tag_ property.


Solution

  • I don't think there is currently a language-independent way to do this. Modal verbs form a closed class, though, so it should be enough to check that token.tag_ == 'AUX' (note that in German, modal verbs are tagged as VERB) and that token.lemma_ is in a set of known modal verbs for the language.
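The lemma-set check described above could be sketched roughly as follows. This is a minimal illustration, not spaCy functionality: the names MODAL_LEMMAS, VERBAL_POS, is_modal, and find_modals are hypothetical, and the lemma lists are short, non-exhaustive examples that would need to be extended. The sketch reads the coarse-grained tag from token.pos_ and accepts both AUX and VERB, on the assumption that some models tag modals as ordinary verbs (as the answer notes for German). It uses stand-in token objects so it runs without a downloaded model.

```python
from types import SimpleNamespace

# Illustrative, non-exhaustive lemma sets. These are an assumption of this
# sketch, not data shipped with spaCy; extend them for real use.
MODAL_LEMMAS = {
    "fr": {"pouvoir", "devoir", "vouloir", "falloir"},
    "es": {"poder", "deber", "querer", "soler"},
    "ru": {"мочь", "смочь", "хотеть", "уметь"},
}

# Coarse-grained tags accepted for a modal candidate. VERB is included
# because some models may tag modals as plain verbs rather than AUX.
VERBAL_POS = {"AUX", "VERB"}


def is_modal(token, lang):
    """Return True if the token is verbal and its lemma is a listed modal."""
    lemmas = MODAL_LEMMAS.get(lang, set())
    return token.pos_ in VERBAL_POS and token.lemma_.lower() in lemmas


def find_modals(tokens, lang):
    """Collect the tokens that pass the closed-class check for lang."""
    return [t for t in tokens if is_modal(t, lang)]


# Stand-in tokens exposing the same attributes as spaCy tokens, so the
# sketch runs without a model. With spaCy installed, a Doc can be passed
# instead, e.g. find_modals(spacy.load("fr_core_news_sm")("Je peux venir."), "fr").
fake_doc = [
    SimpleNamespace(text="Je", lemma_="je", pos_="PRON"),
    SimpleNamespace(text="peux", lemma_="pouvoir", pos_="VERB"),
    SimpleNamespace(text="venir", lemma_="venir", pos_="VERB"),
]

print([t.text for t in find_modals(fake_doc, "fr")])
```

Because the check only depends on lemma_ and pos_, supporting a newly released language would mainly mean adding one more lemma set, provided that model's lemmatizer returns the infinitive forms listed.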