Beginner: ANTLR4 Grammar doesn't handle negative numbers


I'm currently working on a simple ANTLR4 grammar for evaluating mathematical expressions. At the moment, the grammar only needs to parse simple operations: multiplication, division, addition and subtraction. Here's my grammar:

grammar WRB;

options {
language = Java;
}

prog: stat+;

stat: expr SEPARATOR #printExpr
    | ID ASSIGN expr SEPARATOR #assignment
    ;

expr: expr op=(MUL|DIV) expr #punkt
    | expr op=(ADD|SUB) expr #strich
    | num #number
    | (SIGN)? ID #ref
    | '(' expr ')' #klammer
    ;

ID  :   [a-zA-Z]+;
DIGITS :   [0-9]+ ;

ASSIGN: '=';
MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';

integer: (SIGN)? DIGITS;
floating:  (integer)? '.' DIGITS;
num:  (integer | floating);
SIGN: '+' | '-';

SEPARATOR: ';';
WS: [ \t\r\n]+ -> skip ;

Everything works fine besides the negative numbers. Here's the syntax tree for the sample "-4 + 9":

[Image: syntax tree for "-4 + 9"]

I'm fairly new to language recognition and grammars. I don't see why ANTLR treats the negative sign as extraneous input. Shouldn't the expr rule descend into the #number alternative?

Thanks in advance.


Solution

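  • Without testing: try removing the SIGN rule and rewriting integer as (SUB|ADD)? DIGITS (the #ref alternative, which also uses SIGN, needs the same change). My understanding is that SIGN will never match because it is defined after SUB and ADD, which match exactly the same text. The lexer picks the rule that matches the longest input and, on a tie, the rule that is defined first. It tokenizes independently of the parser, so there is no attempt to re-tokenize for a "better parse": the '-' always arrives as SUB, and a parser rule that expects SIGN never sees one.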
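
To make the suggestion above concrete, here is a minimal, untested sketch of the affected rules only. It assumes the rest of the grammar stays exactly as posted in the question, and that replacing SIGN with (ADD|SUB) in both integer and the #ref alternative is the only change needed; the rule and label names are the ones from the question.

```antlr
// Sketch only: the rules that change. Everything else is as in the question.

expr: expr op=(MUL|DIV) expr #punkt
    | expr op=(ADD|SUB) expr #strich
    | num #number
    | (ADD|SUB)? ID #ref          // was: (SIGN)? ID
    | '(' expr ')' #klammer
    ;

integer: (ADD|SUB)? DIGITS;       // was: (SIGN)? DIGITS
floating: (integer)? '.' DIGITS;
num: (integer | floating);

ADD: '+';
SUB: '-';
// SIGN: '+' | '-';               // removed: it can never win against ADD/SUB,
//                                // which match the same text and come first
```

With this change the lexer still emits SUB for the leading '-' in "-4 + 9", but the integer rule now accepts that token, so the input should no longer be reported as extraneous.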