I need help with the following: We are embedding the Monaco Editor (https://github.com/Microsoft/monaco-editor) in a web GUI as an editor for text files written in a Robot Framework-like (http://robotframework.org) space-separated syntax. Keywords and arguments are separated from each other by two or more consecutive spaces. A keyword or argument may itself contain spaces, as long as they are not consecutive.
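To make the separation rule concrete, here is a minimal sketch in plain JavaScript, outside of Monaco. The helper name `splitCells` and the sample row are made up for illustration and are not part of Monaco or Robot Framework.

```javascript
// Hypothetical helper for illustration only; not a Monaco or Robot Framework API.
// Splits one row into its keyword/argument cells on runs of two or more
// whitespace characters. Single spaces inside a cell are preserved.
function splitCells(row) {
  const trimmed = row.trim();
  if (trimmed === '') {
    return []; // blank row: no cells
  }
  return trimmed.split(/\s{2,}/);
}

// The keyword "Log Many" and the argument "first arg" each contain a single space.
console.log(splitCells('Log Many  first arg   second'));
// → [ 'Log Many', 'first arg', 'second' ]
```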
Using the Monarch tokenizer, we have successfully written regexes that match rows where the number of arguments is known in advance. However, a few of our keywords accept an arbitrary number of arguments:
keyword  arg1  arg2  ...  argN
^^spaces^^ ^^spaces^^ ^^spaces^^ ^^spaces^^ ^^spaces^^
We would like to give each argument the token class 'argument' and each run of spaces the class 'separator'. We already have one rule in place that matches the keyword and the spaces following it, and then hands the remaining arg1..argN string to the state 'arguments':
arguments: [
    {
        regex: /(\S.*?)(\s{2,})/,
        action: { cases: {
            '$2': [
                { token: 'argument', log: 'Matches: `$0`, `$1`, `$2`' },
                { token: 'separator', next: '@arguments' },
            ],
            '$1': [
                { token: 'argument' },
            ],
            '@default': { token: 'eos', next: '@pop' }
        } },
    }
],
We figured that the state could call itself to match an arbitrary number of argument-separator pairs. However, the console output from the tokenizer suggests that the state does not re-enter itself but skips to the next row instead.
Does anyone know what we have done wrong? Is there a better solution to our use case?
Thanks!
Edit: Found a rather complicated solution. It relies on two alternating states with some fairly complex regex matching, together with the "switchTo" and "cases" features of the editor:
/* Arguments iterators: argument -> argseparator -> argument -> ... (end of line) */
argument: [
    {
        regex: /(\S.*?)(?=\s{2,}|$)/,
        action: { cases: {
            '@eos': { token: 'argument', next: '@pop' },
            '$1': { token: 'argument', switchTo: '@argseparator' }
        } },
    }
],
argseparator: [
    {
        regex: /(\s{2,}?)(?=\S.*|$)/,
        action: { cases: {
            '@eos': { token: 'separator', next: '@pop' },
            '$1': { token: 'separator', switchTo: '@argument' }
        } },
    }
],
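To check what the two regexes above pick out of a line, the alternation can be imitated in plain JavaScript without loading Monaco. This is only a rough sketch: the function name `tokenizeArguments` is made up, the sticky flag (`y`) is added so that each match must start at the current position, and the loop only mimics the argument/separator hand-off, not Monarch's actual state stack or its `@eos` handling.

```javascript
// Hypothetical stand-in for the `argument` / `argseparator` states; not Monaco code.
function tokenizeArguments(line) {
  // Same patterns as in the two states, plus the sticky flag so that a match
  // has to begin exactly at `lastIndex`.
  const patterns = {
    argument: /(\S.*?)(?=\s{2,}|$)/y,
    separator: /(\s{2,}?)(?=\S.*|$)/y,
  };
  const tokens = [];
  let state = 'argument';
  let pos = 0;
  while (pos < line.length) {
    const regex = patterns[state];
    regex.lastIndex = pos;
    const match = regex.exec(line);
    if (match === null) {
      break; // no match at this position: this sketch simply stops
    }
    tokens.push({ token: state, text: match[0] });
    pos += match[0].length;
    // Alternate between the two states, like `switchTo` does in the rules.
    state = state === 'argument' ? 'separator' : 'argument';
  }
  return tokens;
}

const parts = tokenizeArguments('arg1  arg 2   arg3');
console.log(parts.map((t) => t.token + ':' + JSON.stringify(t.text)).join(' '));
// → argument:"arg1" separator:"  " argument:"arg 2" separator:"   " argument:"arg3"
```

Note that the lazy `\s{2,}?` still consumes the whole run of spaces, because the lookahead forces it to extend until the next non-space character or the end of the line.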
I solved it myself. I had to use two alternating states and some rather complex regex matching, together with the "cases" and "switchTo" features of the editor. See my edit above.