So, say we have the text "2.at(0)". Is there a way to tell ANTLR4 that "2." can't be a float with an omitted trailing 0, unambiguously and without consuming the following token?
Assuming you want to accept "2." in "2. + 3" as a float, but not inside "2.at(0)" (or "2. at(0)"), then no, that is not possible in ANTLR without some sort of predicate.
With a predicate, you need to add target-specific code to your grammar that determines whether a '.' is part of a float or is a DOT token. For the Java target, that might look like this:
lexer grammar DemoLexer;
@header {
import java.util.*;
}
@members {
  // Returns true if the '.' about to be matched can end a float such as "2."
  private boolean nakedDotPartOfFloat() {
    // Start looking ahead 2 steps (1 step ahead is the '.')
    for (int i = 2; ; i++) {
      char nextChar = (char)_input.LA(i);
      if (Character.isWhitespace(nextChar)) {
        // Ignore whitespace (isWhitespace also covers the tabs and line breaks that SPACE skips)
        continue;
      }
      // If the character after the '.' is a letter, a float is not possible
      return !Character.isLetter(nextChar);
    }
  }
}
ADD
 : '+'
 ;
DOT
 : '.'
 ;
INT
 : [0-9]+
 ;
FLOAT
 : [0-9]+ '.' [0-9]+
 | [0-9]+ {nakedDotPartOfFloat()}? '.'
 ;
ID
 : [a-zA-Z]+
 ;
SPACE
 : [ \t\r\n] -> skip
 ;
If you then tokenize the input "2. 2.1 2.foo 2.+3", you'd get the following tokens:
9 tokens:
1 FLOAT '2.'
2 FLOAT '2.1'
3 INT '2'
4 DOT '.'
5 ID 'foo'
6 FLOAT '2.'
7 ADD '+'
8 INT '3'
9 EOF '<EOF>'
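The lookahead rule behind the predicate can also be tried out on plain strings, without generating a lexer. The following is only a sketch under a few assumptions: the class name NakedDotCheck and the (input, dotIndex) parameters are made up for illustration and are not part of ANTLR, and it uses Character.isWhitespace so that tabs and line breaks are skipped the same way the SPACE rule skips them.

```java
public class NakedDotCheck {

    /**
     * Hypothetical stand-alone version of the lexer predicate.
     * Returns true if the '.' at dotIndex may end a float such as "2.":
     * whitespace after the dot is skipped, and the dot is rejected as part
     * of a float only when the next non-whitespace character is a letter.
     */
    public static boolean nakedDotPartOfFloat(String input, int dotIndex) {
        for (int i = dotIndex + 1; i < input.length(); i++) {
            char next = input.charAt(i);
            if (Character.isWhitespace(next)) {
                continue; // the lexer skips whitespace, so look past it
            }
            // A letter after the '.' means a member access like 2.at(0)
            return !Character.isLetter(next);
        }
        // End of input: nothing follows the '.', so "2." is a float
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"2.at(0)", "2. at(0)", "2. + 3", "2."};
        for (String s : samples) {
            System.out.println(s + " -> " + nakedDotPartOfFloat(s, s.indexOf('.')));
        }
    }
}
```

For the four samples this reports false, false, true and true, which lines up with the DOT and FLOAT decisions in the token list above.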