
Multiple-digit numbers getting split by space in NuGram?


I'm seeing some unexpected behavior in the NuGram IDE Eclipse plug-in for ABNF grammar development.

Say I have a rule that reads:

$fifties =
    50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59
;

The sentence generator produces 5 0, 5 1, 5 2, and so on. I would expect 50, 51, 52, and so forth, but NuGram's coverage tool reports those as out-of-grammar (OOG).

It turns out that it splits any multi-digit number into space-separated digits, unless the token starts with a non-digit character:

1234 -> 1 2 3 4
1234asdf -> 1 2 3 4 asdf
asdf1234 -> asdf1234
1234asdf5678 -> 1 2 3 4 asdf5678
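
To make the pattern in these four examples concrete, here is a small Python sketch that reproduces it. It is only a model of what I observe, not NuGram's actual tokenizer, and the function name `split_like_observed` is made up for illustration.

```python
import re

def split_like_observed(token: str) -> list[str]:
    """Model of the observed behaviour (an assumption, not NuGram's code):
    leading ASCII digits become one token each; once a non-digit
    appears, the remainder of the token is kept whole."""
    match = re.match(r"([0-9]*)(.*)", token, re.DOTALL)
    leading_digits, rest = match.group(1), match.group(2)
    parts = list(leading_digits)   # "1234" -> ["1", "2", "3", "4"]
    if rest:
        parts.append(rest)         # e.g. "asdf5678" stays in one piece
    return parts

for sample in ["1234", "1234asdf", "asdf1234", "1234asdf5678"]:
    print(sample, "->", " ".join(split_like_observed(sample)))
```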

As far as I know, a normal ABNF grammar wouldn't do this. Or am I forgetting something?


Solution

  • This happens because NuGram IDE treats each digit as an individual DTMF tone. I agree that this behaviour should apply only to DTMF grammars, not to voice grammars.

    As a workaround, surround each sequence of digits with double quotes:

    $fifties =
        "50" | "51" | "52" | "53" | "54" | "55" | "56" | "57" | "58" | "59"
    ;
    

    Hope that helps!
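
For a grammar with many numeric alternatives, the quoting workaround could be applied mechanically. The following Python sketch is one way to do that, under the assumption that the numbers appear as bare, whitespace-separated tokens as in the rule from the question; `quote_bare_numbers` is a hypothetical helper, not part of NuGram.

```python
import re

# A run of two or more ASCII digits standing alone as a token:
# not glued to a word character, not part of a $rule name, and
# not already directly adjacent to a double quote.
BARE_NUMBER = re.compile(r'(?<![\w"$])([0-9]{2,})(?![\w"])')

def quote_bare_numbers(rule_text: str) -> str:
    """Wrap bare multi-digit tokens in double quotes.

    Limitation: a number inside an already quoted multi-word phrase
    (e.g. "room 50 please") would be quoted again, so review the
    output before using it.
    """
    return BARE_NUMBER.sub(r'"\1"', rule_text)

rule = """$fifties =
    50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59
;"""
print(quote_bare_numbers(rule))
```

Running the helper a second time on its own output leaves the rule unchanged, because quoted numbers are skipped.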