I am trying to write a parser for a dialect of Answer Set Programming (ASP) which, in terms of grammar, looks like Prolog with some extensions.
One extension, for instance is expansion, meaning that fact(1..3).
for instance is expanded in fact(1). fact(2). fact(3).
. Notice that the language understands INT
and FLOAT
numbers and uses .
also as a terminator.
In some cases the parser fails to distinguish between integers, floats, extensions and separators because I reckon the language is clearly ambiguous. In that cases, I have to explicitly separate tokens with white spaces. Any Prolog or ASP parser, however, correctly deals with such productions. I read that ANTLR4 can disambiguate problematic productions autonomously, but probably it needs some help but I don't know how to do! ;-) I read something like here and here, but apparently they did not help me.
Could somebody please tell me what to do to overcome this ambiguity? Please notice that I cannot change the language because it is quite standard. In order to simplify the experts' work, I created a minimal working example that follows.
grammar Test;
program:
statement* ;
statement: // DOT is the statement terminator
range DOT |
intNum DOT |
floatNum DOT ;
intNum: // not needed, but helps in TestRig
INT;
floatNum: // not needed, but helps in TestRig
FLOAT;
range: // defines an expansion
INT DOTS INT ;
DOTS: '..';
DOT: '.';
FLOAT: DIGIT+ '.' DIGIT* | '.' DIGIT+ ;
INT: DIGIT+ ;
WS: [ \t\r\n]+ -> skip ;
fragment NONZERO : [1-9] ;
fragment DIGIT : [0] | NONZERO ;
I use the following input:
1 .
1. .
1.5 .
.5 .
1 .. 5 .
1.
1..
1.5.
.5.
1..5.
And I get the following errors which instead are parsed corrected by other tools:
line 8:0 extraneous input '1.' expecting '.'
line 11:2 extraneous input '.5' expecting '.'
Many thanks in advance!
Prolog does not accept 1.
as a float. This feature makes your grammar significantly more ambiguous, so maybe try removing that feature.