I want to capture the logical operators of ooRexx with a regex in a .cson file, because I want to support syntax highlighting of ooRexx in the Atom editor. These are the operators I'm trying to cover:
>= <= \> \< \= >< <> == \== // && || ** ¬> ¬< ¬= ¬== >> << >>= \<< ¬<< \>> ¬>> <<=
And this is the regex part in the cson file:
'match': '\\+ | - | [\\\\] | \\/ | % | \\* | \\| | & |=|¬|>|<|
>= | <= | ([\\\\]>) | ([\\\\]<) | ([\\\\]=) | >< | <> | == | ([\\\\]==) |
\\/\\/ | && | \\|\\| | \\*\\* | ¬> | ¬< | ¬= | ¬== | >> | << | >>= | ([\\\\]<<) | ¬<< |
([\\\\]>>) | ¬>> | <<='
I'm struggling with the slashes (forward and backward) and also with the double asterisk (**).
My knowledge about regex is very basic, to say it nicely. Is there somebody who can help me with that?
You have spaces around the pipe bars, and these spaces count as part of the regular expression. So when you write something like | \*\* |, the double asterisks get caught, but only if they are surrounded by a space on each side, and not if they're attached to a word or sit at the beginning or end of a line. The slashes have the same issue: I tested it, and it does catch them for me, but only as long as your slashes (or asterisks) are between two spaces.
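To make the effect of those spaces concrete, here is a minimal sketch in Python. Python's re module is only a stand-in here (Atom grammars use a different regex engine), and the helper name first_match and the shortened three-alternative patterns are assumptions made for illustration; the point is just that a space next to a pipe is matched literally.

```python
import re

# Written the way the question wrote it: spaces around each "|".
# Those spaces are literal, so the middle alternative is " ** "
# (space, two asterisks, space).
spaced = re.compile(r"\+ | \*\* | //")

# The same alternatives with the spaces removed.
tight = re.compile(r"\+|\*\*|//")

def first_match(pattern, text):
    """Return the first matched substring, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(first_match(spaced, "a**b"))    # None: no spaces around the operator
print(first_match(spaced, "a ** b"))  # matches, including both spaces
print(first_match(tight, "a**b"))     # **
print(first_match(tight, "x//y"))     # //
```

Removing the spaces from the alternation is the smallest change that lets the operators match wherever they appear.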
A few other things to keep in mind:
- Character classes: [<>]= will catch both >= and <=. Writing [\\] is equivalent to writing \\ directly, because \\ counts as a single character (the first backslash escapes the second). Similarly, your parentheses here are not being used; see grouping.
- Quantifiers such as + and * repeat whatever precedes them. So \\>+ will catch both \> and \>>.
- The ? quantifier makes the preceding character optional: ==? will match both = and ==.
You can group together a LOT of your statements with these three tricks combined… I'll leave that exercise to you!
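As a starting point for that exercise, the three tricks can be tried out in a small Python sketch. Python's re module is again only used for illustration, and the names ge_le, not_gt, eq, comparison and tokens are hypothetical. The combined comparison pattern is my own assumption, not part of the answer above: it covers only the comparison operators (not ><, <>, //, &&, || or **) and may accept combinations that are not valid ooRexx, so check it against the language reference. In the .cson string, each backslash would presumably have to be doubled once more, as in the question.

```python
import re

# Trick 1: a character class matches any one of the listed characters.
ge_le = re.compile(r"[<>]=")

# Trick 2: a quantifier repeats the preceding item; "+" means one or more.
# The regex \\ stands for one literal backslash.
not_gt = re.compile(r"\\>+")

# Trick 3: "?" makes the preceding item optional.
eq = re.compile(r"==?")

# Hypothetical combination: an optional negation prefix (\ or ¬),
# then >>, << or a single <, > or =, then an optional trailing "=".
comparison = re.compile(r"[\\¬]?(?:>>|<<|[<>=])=?")

def tokens(pattern, text):
    """Return every non-overlapping match of pattern in text, left to right."""
    return pattern.findall(text)

print(tokens(ge_le, "a>=b<=c"))     # ['>=', '<=']
print(tokens(eq, "a=b==c"))         # ['=', '==']
print(tokens(not_gt, r"a\>b\>>c"))  # the two negated operators \> and \>>
print(tokens(comparison, r"a>=b \== c ¬>> d <<= e"))
```

The longer alternatives (>> and <<) are listed before the single characters so that the engine tries them first; otherwise >> would be split into two separate > matches.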
One more hint for developing long regular expressions: use a tester like Regex101 (or similar) with a test file to see the effect of your changes in real time. Visualizers like Regexper will help you understand how your regular expression is parsed.