I'm currently working on a simple ANTLR4 grammar for evaluating mathematical expressions. At the moment, the grammar only needs to parse simple operations: multiplication, division, addition and subtraction. Here's my grammar:
grammar WRB;
options {
language = Java;
}
prog: stat+;
stat: expr SEPARATOR #printExpr
| ID ASSIGN expr SEPARATOR #assignment
;
expr: expr op=(MUL|DIV) expr #punkt
| expr op=(ADD|SUB) expr #strich
| num #number
| (SIGN)? ID #ref
| '(' expr ')' #klammer
;
ID : [a-zA-Z]+;
DIGITS : [0-9]+ ;
ASSIGN: '=';
MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';
integer: (SIGN)? DIGITS;
floating: (integer)? '.' DIGITS;
num: (integer | floating);
SIGN: '+' | '-';
SEPARATOR: ';';
WS: [ \t\r\n]+ -> skip ;
Everything works fine except for negative numbers. Here's the syntax tree for the sample "-4 + 9":
I'm fairly new to language recognition and grammars. I don't see why ANTLR treats the negative sign as extraneous input. Shouldn't the expr rule dive into the #number alternative?
Thanks in advance.
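Without testing: try removing the SIGN rule and rewriting integer as (SUB|ADD)? DIGITS. My understanding is that SIGN will never match because it is defined after SUB and ADD, which match the same single characters. The lexer always takes the longest match, and when several rules match the same length, the rule defined first wins; there is no attempt to re-tokenize for "better parsing". The #ref alternative, (SIGN)? ID, refers to SIGN as well, so it would need the same change.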
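To illustrate the tie-break, here is a rough Java sketch that imitates "longest match, then first-defined rule" with plain regular expressions. It is not how ANTLR is implemented internally; the class name LexerTieBreakDemo and the tokenize helper are made up for this illustration, and the implicit tokens '(', ')' and '.' from the grammar are left out. The rules are listed in the same order as the lexer rules in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: a toy lexer, not ANTLR's actual implementation.
class LexerTieBreakDemo {
    // Lexer rules in the order they appear in the question's grammar.
    // LinkedHashMap keeps insertion order, which stands in for definition order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ID", Pattern.compile("[a-zA-Z]+"));
        RULES.put("DIGITS", Pattern.compile("[0-9]+"));
        RULES.put("ASSIGN", Pattern.compile("="));
        RULES.put("MUL", Pattern.compile("\\*"));
        RULES.put("DIV", Pattern.compile("/"));
        RULES.put("ADD", Pattern.compile("\\+"));
        RULES.put("SUB", Pattern.compile("-"));
        RULES.put("SIGN", Pattern.compile("[+-]"));
        RULES.put("SEPARATOR", Pattern.compile(";"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
    }

    // Returns the rule names chosen for each token; WS is skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly greater: on equal length the earlier rule is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestRule = rule.getKey();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!bestRule.equals("WS")) {
                tokens.add(bestRule);
            }
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [SUB, DIGITS, ADD, DIGITS]: SIGN is never chosen.
        System.out.println(tokenize("-4 + 9"));
    }
}
```

Because SUB and ADD come before SIGN and match the same one-character input, the sketch never emits SIGN. The parser in the question therefore sees SUB where integer expects SIGN, which is consistent with the "extraneous input" report.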