If I try to run "___sad" in the interpreter for the following grammar:
grammar identTest;
options
{
language = Java;
output=AST;
}
goal: identifier;
fragment Letter: (('a'..'z') | ('A'..'Z'));
fragment Digit : '0' .. '9';
identifier :IDENTIFIER;
IDENTIFIER: Letter+;
WS:(' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;};
Interpreter output:
Debugger output:
The interpreter includes the underscores in the match, while the debugger seems to just ignore them. I expected some kind of exception in this case, since the grammar defines only the letters 'a'..'z' and 'A'..'Z'. What is wrong with my grammar?
Don't use the interpreter: it's buggy.
Using the debugger you can view the warnings/errors/exceptions your parser produces after pressing the Output button (lower left corner). When doing so, you will see the following:
.../__Test___input.txt line 1:0 no viable alternative at character '_'
.../__Test___input.txt line 1:1 no viable alternative at character '_'
.../__Test___input.txt line 1:2 no viable alternative at character '_'
The lexer simply reports each underscore, skips it, and continues tokenizing, so the parser only ever sees the IDENTIFIER token for "sad".
If you don't want your lexer to recover from such "no viable alternative" errors, create a fall-through lexer rule (called OTHER here), placed after the other lexer rules so that it only matches characters no other rule accepts, and throw an exception from it:
grammar identTest;
options
{
language = Java;
output=AST;
}
goal : identifier;
identifier : IDENTIFIER;
IDENTIFIER : Letter+;
WS : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;};
OTHER : . {throw new RuntimeException("unknown char: '" + $text + "'");};
fragment Letter : (('a'..'z') | ('A'..'Z'));
fragment Digit : '0' .. '9';
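To make the difference between the two behaviours concrete, here is a minimal plain-Java sketch. It is not ANTLR-generated code and does not use the ANTLR runtime: the class LexerRecoverySketch and its tokenize method are hypothetical stand-ins that only imitate the Letter+ and WS rules above, and the reported column is simply the character index, assuming single-line input. With failFast set to false it mimics the default recovery (report, skip one character, continue); with failFast set to true it mimics the OTHER rule.

```java
import java.util.ArrayList;
import java.util.List;

public class LexerRecoverySketch {

    // Hypothetical hand-written stand-in for the generated lexer (NOT ANTLR code).
    // Returns the text of every IDENTIFIER found in the input.
    static List<String> tokenize(String input, boolean failFast) {
        List<String> identifiers = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isLetter(c)) {
                // IDENTIFIER : Letter+ (consume the longest run of letters)
                int start = i;
                while (i < input.length() && isLetter(input.charAt(i))) {
                    i++;
                }
                identifiers.add(input.substring(start, i));
            } else if (isWhitespace(c)) {
                // WS goes to the hidden channel; here it is simply dropped
                i++;
            } else if (failFast) {
                // Behaviour of the OTHER rule: abort on the first unknown character
                throw new RuntimeException("unknown char: '" + c + "'");
            } else {
                // Default behaviour: report the character, skip it, keep going
                System.err.println("line 1:" + i
                        + " no viable alternative at character '" + c + "'");
                i++;
            }
        }
        return identifiers;
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isWhitespace(char c) {
        return c == ' ' || c == '\r' || c == '\t' || c == '\f' || c == '\n';
    }

    public static void main(String[] args) {
        // Recovering mode: three errors on stderr, then prints [sad]
        System.out.println(tokenize("___sad", false));
        try {
            tokenize("___sad", true);
        } catch (RuntimeException e) {
            // Fail-fast mode: prints unknown char: '_'
            System.out.println(e.getMessage());
        }
    }
}
```

In recovering mode the underscores are reported but the identifier still comes through, which matches what the debugger shows; in fail-fast mode the first underscore stops everything, which is what the question expected.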