Search code examples
parsingrecursiongrammarantlr4rpn

Antlr4 grammar left recursive error


I am having quite a problem with antlr4 right now.
Whenever I try to feed antlr with this RPN grammar

 grammar UPN;

    //Parser  

    expression : plus | minus | mult | div | NUMBER;  
    plus : expression expression '+';  
    minus : expression expression '-';  
    mult : expression expression '*';  
    div : expression expression '/';  

    //Lexer  
    NUMBER : '-'? ('0'..'9')+;  

antlr will throw an error because plus,minus,mult and div are mutually left recursive.
I dont know how to fix that.
(I know this occurs because with this grammar "expression" could be infinitely looped, I have had this problem before with another grammar, but i could fix that on my own)

My only solution would be to restrict the grammar in the following way

grammar UPN;

//Parser

expression : plus | minus | mult | div | NUMBER;
exp2 : plus2 | minus2 | mult2 | div2 | NUMBER;
plus : exp2 exp2'+';
minus : exp2 exp2'-';
mult: exp2 exp2'*';
div: exp2 exp2'/';
plus2 : NUMBER NUMBER '+';
minus2 : NUMBER  NUMBER '-';
mult2: NUMBER  NUMBER '*';
div2: NUMBER  NUMBER '/';

//Lexer
NUMBER : '-'? ('0'..'9')+;  

but this is not really what i want it to be, because now i could work at maximum with expressions like

2 3 + 5 4 - *

and the grammar would be more complex than it actually could be.
Hope you guys can help me


Solution

  • ANTLR4 only supports "direct" left recursive rules, not "indirect", as you have them.

    Try something like this:

    grammar RPN;
    
    parse : expression EOF;
    
    expression
     : expression expression '+'
     | expression expression '-'
     | expression expression '*'
     | expression expression '/'
     | NUMBER
     ;
    
    NUMBER : '-'? ('0'..'9')+;
    
    SPACES : [ \t\r\n] -> skip;
    

    Btw, 23+54-* is not a valid RPN expression: it must start with two numbers.