I have a working TOKEN rule that excludes certain characters. It must not start with + or -, but these characters are allowed after the start.
TOKEN : ~('+' | '-' | '\u0000' .. '\u001f' | ' ' | '<' | '>' | ':' | '"' | '/' | '\\' | '|' | '?' | '*' | '#' | '@') ~('\u0000' .. '\u001f' | ' ' | '<' | '>' | ':' | '"' | '/' | '\\' | '|' | '?' | '*' | '#' | '@')+ ;
I have been trying to simplify it using fragments...
fragment EXCLUDED : ('\u0000' .. '\u001f' | ' ' | '<' | '>' | ':' | '"' | '/' | '\\' | '|' | '?' | '*' | '#' | '@');
fragment RESERVED : ('+' | '-') ;
TOKEN : ~(RESERVED | EXCLUDED) ~(EXCLUDED)+ ;
However, I get the error "rule reference RESERVED is not currently supported in a set". Is there a way around this?
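ANTLR 4 does not accept rule references, including fragments, inside a negated set, which is why you get that error. If you use the shorter character set notation from ANTLR 4, you may not need the fragments at all. The rule: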
TOKEN
: ~('+' | '-' | '\u0000' .. '\u001f' | ' ' | '<' | '>' | ':' | '"' | '/' | '\\' | '|' | '?' | '*' | '#' | '@') ~('\u0000' .. '\u001f' | ' ' | '<' | '>' | ':' | '"' | '/' | '\\' | '|' | '?' | '*' | '#' | '@')+
;
is the same as this:
TOKEN
: ~[+\-\u0000-\u001f <>:"/\\|?*#@] ~[\u0000-\u001f <>:"/\\|?*#@]+
;
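If you would still like named building blocks, one option is to move the negation inside the fragments: a fragment may contain a negated set, even though a negated set may not contain a fragment reference. This is a minimal sketch only; the fragment names TOKEN_START and TOKEN_CHAR are hypothetical and do not come from the question.

```antlr
// Hypothetical fragment names; each fragment is itself a negated character set.
fragment TOKEN_START : ~[+\-\u0000-\u001f <>:"/\\|?*#@] ; // anything except +, - and the excluded characters
fragment TOKEN_CHAR  : ~[\u0000-\u001f <>:"/\\|?*#@] ;    // anything except the excluded characters

// Same shape as the original rule: one start character, then one or more further characters.
TOKEN
 : TOKEN_START TOKEN_CHAR+
 ;
```

The excluded characters still have to be written out twice, so this mostly buys readability of the TOKEN rule rather than removing the duplication.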