CHAR_LITERAL: '\'' (~['\\\r\n] | EscapeSequence) '\'';
fragment EscapeSequence
: '\\' 'u005c'? [btnfr"'\\]
| '\\' 'u005c'? ([0-3]? [0-7])? [0-7]
| '\\' 'u'+ HexDigit HexDigit HexDigit HexDigit
;
Why are '\r', '\n', '\'' and '\\' excluded in the first part, while '\b', '\t', '\f' and '"' are not?
If I change the rule to this, is it equivalent to the previous rule?
CHAR_LITERAL: '\'' (~['\\\r\n\b\f\t\"] | EscapeSequence) '\'';
Or what if I change it to this?
CHAR_LITERAL: '\'' (~['\\] | EscapeSequence) '\'';
It's not trying to exclude something like:
char x = '\r';
It's trying to exclude:
char x = '
';
That last one is illegal Java. A ' (opening a char literal) can be followed by either an EscapeSequence or a character but, as an exception, not by a newline character (as in literally pressing Enter in your editor, not \n, which isn't a newline: it's an escape sequence that represents a newline).
In other words, after the single quote any character is fine, EXCEPT a backslash, which needs to be excluded because EscapeSequence handles it, and EXCEPT the literal Unicode values 0D/0A (CR and LF; in ANTLR-speak, \r and \n).
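As a rough illustration of that difference, here is a minimal sketch that approximates the CHAR_LITERAL rule with java.util.regex. It is not how ANTLR works internally. The class name CharLiteralSketch and the method isCharLiteral are made up for this example, and the optional 'u005c' forms from the grammar are left out to keep it short. Each argument is the source text the lexer would see, written as a Java string.

```java
import java.util.regex.Pattern;

public class CharLiteralSketch {
    // Hypothetical regex approximation of the EscapeSequence fragment.
    // The optional 'u005c' forms from the grammar are omitted for brevity.
    private static final String ESCAPE =
        "\\\\[btnfr\"'\\\\]"             // a backslash, then one of b t n f r " ' or a backslash
        + "|\\\\(?:[0-3]?[0-7])?[0-7]"   // a backslash, then one to three octal digits
        + "|\\\\u+[0-9a-fA-F]{4}";       // a backslash, one or more 'u', then four hex digits

    // Quote, then (any char except quote, backslash, CR, LF | escape), then quote.
    private static final Pattern CHAR_LITERAL =
        Pattern.compile("'(?:[^'\\\\\\r\\n]|" + ESCAPE + ")'");

    static boolean isCharLiteral(String sourceText) {
        return CHAR_LITERAL.matcher(sourceText).matches();
    }

    public static void main(String[] args) {
        // Four characters of source text: quote, backslash, letter n, quote.
        System.out.println(isCharLiteral("'\\n'"));  // true
        // Three characters of source text: quote, a real line feed, quote.
        System.out.println(isCharLiteral("'\n'"));   // false
        // A lone backslash has nothing left to pair with, so it is rejected.
        System.out.println(isCharLiteral("'\\'"));   // false
    }
}
```

The first call passes through the escape branch and the second fails in the negated set, which is the distinction described above.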
It gets a little confusing perhaps - just make sure you very very carefully count the backslashes:
['\\\r\n]
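That is excluding 4 Unicode values, and only 4:
' (the single quote), because char x = '''; is not legal Java.
\\ (ANTLR notation for a single backslash), because a backslash needs to go through the EscapeSequence part to parse that. char x = '\'; is not legal.
\r (Unicode 0D, CR).
\n (Unicode 0A, LF).
In contrast, the escape sequences aren't looking for \n, they are looking for a backslash symbol and then the actual letter 'n'.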
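To tie this back to the two alternative rules in the question, the sketch below lists the raw characters each negated set excludes, under the same reading of the escapes as above. The class ExcludedSetComparison and the helper allowsRaw are hypothetical names, and the code only models the negated set, not the whole lexer. The Java string escapes happen to mirror the ANTLR ones here, so the backslash counting carries over.

```java
public class ExcludedSetComparison {
    // Each string holds the raw characters that one variant's negated set excludes.
    static final String ORIGINAL  = "'\\\r\n";           // quote, backslash, CR, LF
    static final String VARIANT_A = "'\\\r\n\b\f\t\"";   // the same four, plus BS, FF, TAB, double quote
    static final String VARIANT_B = "'\\";               // only quote and backslash

    // True if the raw character may appear unescaped between the single quotes.
    static boolean allowsRaw(String excluded, char c) {
        return excluded.indexOf(c) < 0;
    }

    public static void main(String[] args) {
        System.out.println(ORIGINAL.length());              // 4
        // A raw double quote is legal in a Java char literal; VARIANT_A rejects it.
        System.out.println(allowsRaw(ORIGINAL, '"'));       // true
        System.out.println(allowsRaw(VARIANT_A, '"'));      // false
        // A raw line feed is illegal in a Java char literal; VARIANT_B lets it through.
        System.out.println(allowsRaw(ORIGINAL, '\n'));      // false
        System.out.println(allowsRaw(VARIANT_B, '\n'));     // true
    }
}
```

So, on this reading, neither alternative is equivalent to the original: the first rejects unescaped characters such as '"' or a raw tab that Java accepts, and the second accepts a char literal that spans a line break.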