Tags: parsing, antlr, antlr4, grammar

Issue with parse order in ANTLR4 grammar


Below is a very simplified grammar that illustrates the problem. I can probably handle the current result in the generated code, but I suspect there is a more elegant way to control the parser instead. I'd appreciate any tips.

grammar
  a: a binop a | unop a
  binop: '&'
  unop: '~'

code to be parsed
  ~A & ~B & C

result
  unop (A binop ((unop B) binop C))

desired
  (unop A) binop ((unop B) binop C)

I have tried some tips from articles that solve related issues, but none of them matched this exact case, and none of them worked.


Solution

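  • You need to do two things:

    • move the UNOP a alternative above a BINOP a to give it a higher precedence (ANTLR 4 gives alternatives listed first the higher precedence)
    • make binary expressions right associative with <assoc=right> (the default is left associative) so that A & B & C is not parsed as ((A & B) & C) but as (A & (B & C)) instead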

    This grammar:

    parse : a EOF;
    
    a
     : UNOP a
     | <assoc=right> a BINOP a
     | ID
     ;
    
    ID    : [A-Z]+;
    BINOP : '&';
    UNOP  : '~';
    SPACE : [ \t\r\n] -> skip;
    

    will parse your input ~A & ~B & C as follows:

    (~A) & ((~B) & C)

    which is the desired (unop A) binop ((unop B) binop C).
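
To see why the alternative order and the associativity option produce that grouping, here is a minimal hand-written sketch in Python. It is not ANTLR-generated code. It only assumes that ANTLR 4 handles a left-recursive rule like `a` with a precedence-climbing loop, which is roughly how it rewrites such rules. The names `parse`, `show`, `UNARY_PREC` and `BINARY_PREC` are illustrative and do not come from ANTLR.

```python
import re

# Mirrors the lexer rules in the grammar above: ID : [A-Z]+, BINOP : '&', UNOP : '~'.
TOKEN_RE = re.compile(r"\s*([A-Z]+|[&~])")

# Assumed precedence levels: the alternative listed first (UNOP a) binds tighter.
UNARY_PREC = 3
BINARY_PREC = 2


def tokenize(text):
    """Split the input into ID, '&' and '~' tokens, skipping whitespace."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text, right_assoc=True):
    """Parse `text` into nested tuples using precedence climbing."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr(min_prec):
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        if tok == "~":
            # Prefix operator: its operand only absorbs tighter-binding operators.
            left = ("~", expr(UNARY_PREC))
        elif tok.isalpha():
            left = tok
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        # Binary operators continue only while they bind at least as tightly
        # as the caller requires.
        while peek() == "&" and BINARY_PREC >= min_prec:
            pos += 1
            # Right associativity: the right operand may itself contain '&'.
            next_prec = BINARY_PREC if right_assoc else BINARY_PREC + 1
            left = ("&", left, expr(next_prec))
        return left

    tree = expr(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def show(tree):
    """Render a tree with explicit parentheses."""
    if isinstance(tree, str):
        return tree
    if tree[0] == "~":
        return f"(~{show(tree[1])})"
    return f"({show(tree[1])} & {show(tree[2])})"


print(show(parse("~A & ~B & C")))                 # ((~A) & ((~B) & C))
print(show(parse("A & B & C", right_assoc=False)))  # ((A & B) & C)
```

Passing `right_assoc=False` corresponds to leaving out `<assoc=right>`: the right operand is then parsed one precedence level higher, so a following `&` is picked up by the outer loop and the result nests to the left.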