I am writing a parser for a subset of Java-like expressions, and have found an ambiguity when parsing this code:
(something) -x
since it could be parsed either as something - x (a binary expression) or as (something) (-x) (a cast applied to a unary expression).
Currently the parser favours the first option but I would like the second one.
The relevant parts of my parser look like this:
expression
    : LeftParen expr=expression RightParen                                # ParenthesizedExpression
    | id=Identifier                                                       # IdentifierExpression
    | op=(Minus | BitwiseNot) expr=expression                             # UnaryExpression
    | <assoc=right> LeftParen type=Identifier RightParen expr=expression  # CastExpression
    | left=expression op=(Plus | Minus) right=expression                  # AdditiveExpression
    ;
Any ideas how to resolve this, ideally without eliminating the direct left-recursion? Thanks.
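When more than one alternative of an ANTLR 4 parser rule can match the same input, the parser resolves the ambiguity in favour of the alternative listed first. Here both ParenthesizedExpression (followed by a binary minus) and CastExpression can match (something) -x, and ParenthesizedExpression wins because it comes first. If you move the cast alternative above it, the input parses as a cast, and the direct left recursion stays in place: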
expression
    : <assoc=right> LeftParen type=Identifier RightParen expr=expression  # CastExpression
    | LeftParen expr=expression RightParen                                # ParenthesizedExpression
    | op=(Minus | BitwiseNot) expr=expression                             # UnaryExpression
    | left=expression op=(Plus | Minus) right=expression                  # AdditiveExpression
    | id=Identifier                                                       # IdentifierExpression
    ;
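
To try this out, here is a minimal self-contained grammar sketch built around the reordered rule. The grammar name Expr, the start rule, and all of the lexer rules are my own assumptions (the question does not show them), so adjust them to match your actual token definitions.

```antlr
grammar Expr;   // hypothetical grammar name

// Hypothetical entry rule; EOF forces the whole input to be consumed.
start
    : expression EOF
    ;

// Cast is listed before the parenthesized alternative so that it wins
// when both could match the same input.
expression
    : <assoc=right> LeftParen type=Identifier RightParen expr=expression  # CastExpression
    | LeftParen expr=expression RightParen                                # ParenthesizedExpression
    | op=(Minus | BitwiseNot) expr=expression                             # UnaryExpression
    | left=expression op=(Plus | Minus) right=expression                  # AdditiveExpression
    | id=Identifier                                                       # IdentifierExpression
    ;

// Assumed token definitions (not shown in the question).
LeftParen  : '(' ;
RightParen : ')' ;
Plus       : '+' ;
Minus      : '-' ;
BitwiseNot : '~' ;
Identifier : [a-zA-Z_] [a-zA-Z_0-9]* ;
WS         : [ \t\r\n]+ -> skip ;
```

Assuming the antlr4 and grun aliases from the ANTLR getting-started guide are set up, you can generate the parser with antlr4 Expr.g4, compile the generated sources, and run grun Expr start -tree on the input (something) -x to inspect the shape of the resulting parse tree. Keep in mind that with this ordering any parenthesized identifier followed by - or ~ is read as a cast, including inputs such as (a) - b where a was meant to be a variable.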