I'm trying to capture user-defined variables for highlighting purposes. I've been wrestling with this for a while and finally have something that appears to work (at least on various regex debugging sites). However, when I put it into Ace Editor, I get:
Uncaught SyntaxError: Invalid regular expression: /#<error>/: Unterminated group
Here is my regex, and the example I'm trying to parse.
^(?:(?:(?:(?:volatile|non_volatile|persistent)\s*(?:integer|char|long|slong)\s*([a-zA-Z0-9]*)(?:$|.*|\[)|\s*(?:integer|char|long)\s*([a-zA-Z0-9]*)(?:$|.*|\[)|\s*(?!volatile|persistent|non_volatile|define_variable)([a-zA-Z0-9]*)(?:$|.*|\[)))|(?:volatile|non_volatile|persistent)\s([a-zA-Z0-9]*))
define_variable
volatile integer loop;
volatile char someChar
volatile long someLong
volatile anotherChar[3] = 'ABC';
anInteger = 33
VOLATILE INTEGER yetAnotherInt
Volatile[32] // this is not valid
someCharArray[32]
char anotherCharArray[32]
char singleChar
persistent slong slongValue[32];
Testing my regex and code example on regex101.com yields the proper matches. Any ideas, pointers, or suggestions?
Rule regexps for Ace need to either have no capturing groups or have a row of groups spanning the whole text (like /(foo)(\s+)(bar)(\s+)(baz)/).
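As a rough illustration of the "row of groups spanning the whole text" requirement, here is a small standalone check. The helper name groupsSpanMatch is made up for this sketch and is not part of Ace; it only compares the joined capture groups against the full match.

```javascript
// Hypothetical helper (not part of Ace): returns true when the capturing
// groups of `regex`, joined in order, reproduce the entire matched text.
function groupsSpanMatch(regex, text) {
  const match = regex.exec(text);
  if (match === null) return false;
  // match[0] is the whole match; match[1..] are the capturing groups.
  const joined = match
    .slice(1)
    .map((group) => (group === undefined ? "" : group))
    .join("");
  return joined === match[0];
}

const spanning = /(foo)(\s+)(bar)(\s+)(baz)/;
const gappy = /(foo)\s+(bar)\s+(baz)/; // whitespace is matched but not captured

console.log(groupsSpanMatch(spanning, "foo bar  baz")); // true
console.log(groupsSpanMatch(gappy, "foo bar  baz"));    // false
```

The second pattern leaves the whitespace outside any group, so there is text in the match that no token could be assigned to.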
The main issue here is that your regexp is too complex. You don't have to put everything into one rule: every time you have | at the root level, create a new rule and let Ace combine them.
Something like:
{
    regex: /(volatile|non_volatile|persistent)(\s+)(integer|char|long|slong)(\s+)([a-zA-Z0-9]+)/,
    token: ["token1", "text", "token2", "text", "token3"]
},
{
    regex: /(\s*)(integer|char|long)(\s+)([a-zA-Z0-9]+)/,
    token: ["text", "token1", "text", "token2"]
},
Even better would be to use createKeywordMapper; see https://github.com/ajaxorg/ace/blob/v1.1.3/lib/ace/mode/javascript_highlight_rules.js#L40
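To show the idea behind a keyword mapper without depending on Ace itself, here is a hand-written stand-in. makeKeywordMapper and the token names ("storage.modifier", "storage.type", "identifier") are assumptions for this sketch, not Ace's implementation; it only mimics the pattern of mapping pipe-separated word lists to token names, with a default for everything else.

```javascript
// Hypothetical stand-in for a keyword mapper (not Ace's own code): takes
// { tokenName: "word1|word2" } and returns a function from a word to a token.
function makeKeywordMapper(map, defaultToken, ignoreCase) {
  const lookup = new Map();
  for (const [token, words] of Object.entries(map)) {
    for (const word of words.split("|")) {
      lookup.set(ignoreCase ? word.toLowerCase() : word, token);
    }
  }
  return (word) => {
    const key = ignoreCase ? word.toLowerCase() : word;
    return lookup.has(key) ? lookup.get(key) : defaultToken;
  };
}

// Token names are illustrative assumptions.
const mapper = makeKeywordMapper(
  {
    "storage.modifier": "volatile|non_volatile|persistent",
    "storage.type": "integer|char|long|slong",
  },
  "identifier",
  true // ignore case, so VOLATILE and Volatile are treated like volatile
);

// Classify each word of a sample line from the question.
const words = "VOLATILE INTEGER yetAnotherInt".split(/\s+/);
console.log(words.map((word) => [word, mapper(word)]));
```

In the linked JavaScript mode, the mapper is used as the token of a single rule whose regex matches any identifier, so one rule can replace several keyword alternations.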
Also note that using \s* instead of \s+ in (?:volatile|non_volatile|persistent)\s*(?:integer|char|long|slong) matches volatileinteger, which is likely wrong.
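A quick way to see the difference is to run both variants against a fused and a properly spaced sample. The variable names loose and strict are just labels for this sketch.

```javascript
// Same alternations, differing only in the whitespace quantifier.
const loose = /(?:volatile|non_volatile|persistent)\s*(?:integer|char|long|slong)/;
const strict = /(?:volatile|non_volatile|persistent)\s+(?:integer|char|long|slong)/;

for (const sample of ["volatileinteger", "volatile integer"]) {
  // Print whether each variant matches the sample.
  console.log(sample, "loose:", loose.test(sample), "strict:", strict.test(sample));
}
```

Only the \s* variant accepts volatileinteger; both accept volatile integer.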