javascript regex ace-editor

How can I properly group user-defined variables for syntax highlighting?


I'm trying to capture user-defined variables for highlighting purposes. I've been wrestling with this for a while and finally have something that appears to work (at least on various regex debugging sites). However, when I throw it into Ace Editor, I get:

Uncaught SyntaxError: Invalid regular expression: /#<error>/: Unterminated group

Here is my regex, and the example I'm trying to parse.

^(?:(?:(?:(?:volatile|non_volatile|persistent)\s*(?:integer|char|long|slong)\s*([a-zA-Z0-9]*)(?:$|.*|\[)|\s*(?:integer|char|long)\s*([a-zA-Z0-9]*)(?:$|.*|\[)|\s*(?!volatile|persistent|non_volatile|define_variable)([a-zA-Z0-9]*)(?:$|.*|\[)))|(?:volatile|non_volatile|persistent)\s([a-zA-Z0-9]*))


define_variable
volatile integer loop;
volatile char someChar
volatile long someLong
volatile anotherChar[3] = 'ABC';
anInteger = 33
VOLATILE INTEGER yetAnotherInt
Volatile[32] // this is not valid
someCharArray[32]
char anotherCharArray[32]
char singleChar
persistent slong slongValue[32];

Testing my regex and code example on regex101.com yields the proper matches. Any ideas, pointers, or suggestions?


Solution

  • Rule regexps for Ace must either contain no capturing groups or consist of a row of capturing groups that together span the whole matched text (like /(foo)(\s+)(bar)(\s+)(baz)/).

    The main issue here is that your regexp is too complex. You don't have to put everything into one rule: every time you have | at the root level, create a new rule instead and let Ace combine them.

    Something like:

       {
          regex: /(volatile|non_volatile|persistent)(\s+)(integer|char|long|slong)(\s+)([a-zA-Z0-9]+)/,
          token: ["token1", "text", "token2", "text", "token3"]
       },
       {
          regex: /(\s*)(integer|char|long)(\s+)([a-zA-Z0-9]+)/,
          token: ["text", "token1", "text", "token2"]
       },
    

    Even better would be to use createKeywordMapper; see https://github.com/ajaxorg/ace/blob/v1.1.3/lib/ace/mode/javascript_highlight_rules.js#L40

    Also note that using \s* instead of \s+ in (?:volatile|non_volatile|persistent)\s*(?:integer|char|long|slong) matches volatileinteger, which is likely wrong.
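
To sanity-check a rule before dropping it into a mode, the two points above can be tried in plain JavaScript, outside of Ace. This is only a rough sketch: `groupsCoverMatch`, `strictRule`, `looseRule`, and `gappyRule` are hypothetical names made up for illustration, and the helper is my own approximation of "the groups span the whole match", not something Ace exposes.

```javascript
// Hypothetical rule, modeled on the first rule in the answer above.
const strictRule = /(volatile|non_volatile|persistent)(\s+)(integer|char|long|slong)(\s+)([a-zA-Z0-9]+)/;

// Same keywords, but with \s* and non-capturing groups, as in the question.
const looseRule = /(?:volatile|non_volatile|persistent)\s*(?:integer|char|long|slong)/;

// A rule whose capturing groups leave a gap (the whitespace is not captured).
const gappyRule = /(volatile)\s+(integer)/;

// Returns true when the capture groups, concatenated in order,
// reproduce the entire matched text (i.e. no uncaptured gaps).
function groupsCoverMatch(regex, text) {
  const m = regex.exec(text);
  if (m === null) return false;
  return m.slice(1).join("") === m[0];
}

console.log(groupsCoverMatch(strictRule, "volatile integer loop;")); // true
console.log(groupsCoverMatch(gappyRule, "volatile integer loop;"));  // false

// \s* lets the two keywords run together; \s+ does not.
console.log(looseRule.test("volatileinteger"));    // true
console.log(strictRule.test("volatileinteger x")); // false
```

If the helper returns false for a rule that uses capturing groups, that rule probably needs either extra groups around the uncaptured parts or non-capturing groups throughout.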