I am using https://github.com/sylvainhalle/Bullwinkle/, a BNF parser. But it can only recognize a BNF pattern for single-whole-sentence, how to find BNF matches in a paragraph? I am trying to recognize certain commands in human's speaking. For a BNF example:
<S>: <lightcmd>;
<lightcmd>: <do> <obj> | <do> <a> <obj>;
<do>: turn on | turn off;
<a>: a | an | the;
<obj>: light;
I can parse the right command from speaking "turn on the light", but I can't parse it from "please turn on the light".
When you try to parse "please turn on the light" against that grammar, the parser will attempt to parse the entire utterance ... because that is what parsers do. But what you want is to match part of the utterance.
Now you could deal with this something like the following:
<S>: <words> <lightcmd> <words>;
<lightcmd>: <do> <obj> | <do> <a> <obj>;
<do>: turn on | turn off;
<a>: a | an | the;
<obj>: light;
<words>: <word> | <words> <word>
<word>: // one or more letters
but you start getting into problems with ambiguity.
I think that a better idea would be to try to parse sub-sequences of the word stream. So if the user says
please turn on a light and make a cup of tea
you attempt to parse starting with "please". Then when that fails you try starting with "turn". Then when you reach "light" you ignore the rest of the words.
I don't know if Bullwinkle can be used like that. If not, look for alternative parser generators.
However, it should also be noted that a grammar as simple as this can also be written as a regular expression where the words are treated as the base symbols. You would only need a simple regular expression engine to handle this. (If this grammar is typical, you only need grouping and alternation. It should only be a couple of hours work to implement this from scratch.)