I have the following string that I want to match against the rule, stringLiteral:
"D:\\Downloads\\Java\\MyFile"
My grammar, in the file String.g4, is as follows:
grammar String;
fragment
HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
stringLiteral
: '"' ( EscapeSequence | XXXXX )* '"'
;
fragment
EscapeSequence
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UnicodeEscape
| OctalEscape
;
fragment
OctalEscape
: '\\' ('0'..'3') ('0'..'7') ('0'..'7')
| '\\' ('0'..'7') ('0'..'7')
| '\\' ('0'..'7')
;
fragment
UnicodeEscape
: '\\' 'u' HexDigit HexDigit HexDigit HexDigit
;
What should I put in the XXXXX location in order to match any character that is not \ or "?
I tried the following, and none of them work:
~['\\'"']
~['\\'\"']
~["\]
~[\"\\]
~('\"'|'\\')
~[\\\"]
I am using ANTLRWorks 2 to try this out. Errors are the following:
D:\Downloads\ANTLR\String.g4 line 26:5 mismatched character '<EOF>' expecting '"'
error(50): D:\Downloads\ANTLR\String.g4:26:5: syntax error: '<EOF>' came as a complete surprise to me while looking for rule element
Inside a character class, you only need to escape the backslash; the quote needs no escaping.
The following is illegal, because the backslash escapes the ]:
[\]
The following matches a backslash:
[\\]
The following matches a quote:
["]
And the following matches either a backslash or quote:
[\\"]
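As a rough analogy only (this is Python's `re` module, not ANTLR), the same escaping rule can be checked in a regex character class: the backslash is doubled, the quote is left alone, and `[^...]` plays the role of ANTLR's `~[...]`. The names `SPECIAL`, `ORDINARY`, `is_special` and `is_ordinary` are made up for this sketch.

```python
import re

# Regex analogues of the ANTLR sets [\\"] and ~[\\"]; this is not ANTLR syntax.
SPECIAL = re.compile(r'[\\"]')    # a backslash or a double quote
ORDINARY = re.compile(r'[^\\"]')  # any single character except those two


def is_special(ch):
    """Return True if ch is exactly one backslash or one double quote."""
    return SPECIAL.fullmatch(ch) is not None


def is_ordinary(ch):
    """Return True if ch is one character that is neither backslash nor quote."""
    return ORDINARY.fullmatch(ch) is not None
```

Under these assumptions, `is_ordinary('D')` is true while `is_ordinary('"')` is false, which is the behaviour wanted at the XXXXX position.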
In v4 style, your grammar could look like this:
grammar String;
/* other rules */
StringLiteral
: '"' ( EscapeSequence | ~[\\"] )* '"'
;
fragment
HexDigit
: [0-9a-fA-F]
;
fragment
EscapeSequence
: '\\' [btnfr"'\\]
| UnicodeEscape
| OctalEscape
;
fragment
OctalEscape
: '\\' [0-3] [0-7] [0-7]
| '\\' [0-7] [0-7]
| '\\' [0-7]
;
fragment
UnicodeEscape
: '\\' 'u' HexDigit HexDigit HexDigit HexDigit
;
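Note that you can't use fragments inside parser rules: StringLiteral must be a lexer rule! In ANTLR, a rule whose name starts with an uppercase letter is a lexer rule, which is why stringLiteral is renamed to StringLiteral here.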
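To sanity-check what the lexer rule is meant to accept without running ANTLR, here is a minimal sketch of a regex that mirrors the alternatives of StringLiteral. It is an approximation in Python's `re` module, not generated from the grammar, and the names `STRING_LITERAL` and `is_string_literal` are hypothetical.

```python
import re

# Regex analogue of the StringLiteral lexer rule; an illustration, not ANTLR.
STRING_LITERAL = re.compile(r'''
    "                        # opening quote
    (?:
        \\[btnfr"'\\]        # simple escape: backslash plus one listed char
      | \\u[0-9a-fA-F]{4}    # UnicodeEscape: backslash, u, four hex digits
      | \\[0-3][0-7][0-7]    # OctalEscape, three digits
      | \\[0-7][0-7]?        # OctalEscape, one or two digits
      | [^\\"]               # anything except a backslash or a quote
    )*
    "                        # closing quote
''', re.VERBOSE)


def is_string_literal(text):
    """Return True if the whole of text matches the sketched rule."""
    return STRING_LITERAL.fullmatch(text) is not None
```

With this sketch, the string from the question, written with its doubled backslashes, is accepted, while a variant with single backslashes such as `"D:\Downloads"` is rejected because `\D` is not one of the listed escapes.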