
Is there a way to force the Apache OpenNLP parser to see a verb phrase instead of a noun phrase?


I'm writing a command parser using Apache OpenNLP. The problem is that OpenNLP parses some commands as noun phrases. For example, if I parse something like "open door", OpenNLP gives me (NP (JJ open) (NN door)). In other words, it reads the phrase as "an open door" instead of "open the door". I want it to parse as (VP (VB open) (NP (NN door))). If I parse "open the door" it produces a VP, but I can't count on a person using determiners.

I'm currently trying to figure out how to perform surgery on the incorrect parse tree, but the API documentation is severely lacking.


Solution

  • After a lot of research I stumbled on someone with the same problem using NLTK. They were advised to "hack" NLTK by adding a pronoun like "they" before the command to force the parser to see the input as a verb phrase. So I would give OpenNLP "they open door" and get back (S (NP (PRP they)) (VP (VBP open) (NP (NN door)))), at which point I can just extract the verb phrase.

    It's certainly not ideal! But for now it will work for my requirements.
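
    The workaround above can be sketched roughly as follows. This is only a minimal illustration, not OpenNLP's API: the class and method names (VerbPhraseExtractor, withDummySubject, extractFirst) are hypothetical, and the code works on the bracketed parse string rather than on OpenNLP's Parse objects. It assumes you have already run the prefixed input through the parser and have the tree as text (in OpenNLP, I believe Parse.show(StringBuffer) gives you that, but check the Javadoc).

```java
import java.util.Optional;

// Hypothetical helper, not part of OpenNLP.
final class VerbPhraseExtractor {

    // Prepend a dummy subject so the parser is nudged toward reading
    // the command as a sentence: (S (NP they) (VP ...)).
    static String withDummySubject(String command) {
        return "they " + command.trim();
    }

    // Return the first balanced "(LABEL ...)" subtree found in a
    // Penn Treebank style bracketed string, or empty if none exists.
    static Optional<String> extractFirst(String bracketed, String label) {
        String open = "(" + label + " ";
        int start = bracketed.indexOf(open);
        if (start < 0) {
            return Optional.empty();
        }
        int depth = 0;
        for (int i = start; i < bracketed.length(); i++) {
            char c = bracketed.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    // Matching close paren of the subtree that began at start.
                    return Optional.of(bracketed.substring(start, i + 1));
                }
            }
        }
        return Optional.empty(); // unbalanced input
    }
}
```

    You would pass withDummySubject("open door") to the parser, then call extractFirst(tree, "VP") on the bracketed result. Note that the verb comes back tagged VBP rather than VB because of the dummy subject, so anything downstream that matches on the tag needs to accept both.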