regex perl

A regex to parse a regex string


I need to parse a regex with a regex. I have this regex string:

[a-z]{1}[-][0-9]{4}[-][ab]

The regex I came up with for parsing the string above, which almost works, is:

/(?|\[(.*?)\]\{(.*?)\}|\[(.*?)\](.*?))/g

What it does can be seen in this regex101 example; the error is in Match 2, Group 1 (-][0-9, which should be just -).


The goal is to match everything inside square brackets [] followed by a number inside curly brackets {}. If the curly brackets {} after the square brackets [] are missing, Group 2 should be filled with null; this is what the second alternative of the branch reset group is for. Likewise, if square brackets are immediately followed by another pair of square brackets, the regex is expected to act as in the latter case (match what is inside the square brackets [] and fill Group 2 with null).

The problem is that my regex doesn't stop at the first [-]: it matches up to -][0-9 instead of matching just - and then moving on to parse [0-9]{4}.
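The mismatch can be reproduced in isolation. The following is a minimal sketch, not the original Perl: it uses Python's `re` module, which has no branch reset group, so only the first alternative of the pattern is tested, starting at the first [-]. The names `FIRST_BRANCH` and `first_branch_groups` are made up for this illustration.

```python
import re

# First alternative of the pattern above, taken on its own
# (Python's re has no (?|...) branch reset group).
FIRST_BRANCH = re.compile(r"\[(.*?)\]\{(.*?)\}")


def first_branch_groups(text):
    """Return (group 1, group 2) of the first match, or None if nothing matches."""
    m = FIRST_BRANCH.search(text)
    return m.groups() if m else None


# The lazy (.*?) may still cross a "]": it keeps growing until the text
# that follows is "]{", so it swallows "-][0-9" before reaching "]{4}".
print(first_branch_groups("[-][0-9]{4}"))  # ('-][0-9', '4')
```

Because the first alternative succeeds at that position, the second alternative (the one that would yield just -) is never tried there.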

The expected match should be:

[a-z]{1}
a-z
1

[-]
-
null

[0-9]{4}
0-9
4

[-]
-
null

[ab]
ab
null

The current match is incorrect and is as follows:

[a-z]{1}
a-z
1

[-][0-9]{4}
-][0-9
4

[-]
-
null

[ab]
ab
null

What am I missing?


Solution

  • This regex should work:

    \[([^]]*)](\{\d+\}|)
    

    Demo

    Explanation:

    • \[ - matches [
    • ([^]]*) - matches 0+ occurrences of any character that is not a ] and captures this submatch in group 1
    • ] - matches ]
    • (\{\d+\}|) - either matches a { followed by 1+ digits followed by }, or matches nothing. Whatever is matched (the braces included, or an empty string) is stored in Group 2
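The explanation above can be checked with a short script. This is a sketch in Python's `re` rather than Perl, run against the string from the question. `ANSWER` is the pattern exactly as given; `VARIANT` is an assumed alternative, not part of the original answer, that wraps the braces in an optional non-capturing group so that Group 2 holds only the digits and stays unset when no {n} follows. The helper name `parse` is likewise made up for this illustration.

```python
import re

SAMPLE = "[a-z]{1}[-][0-9]{4}[-][ab]"

# The pattern from the answer: group 2 keeps the braces, or is an empty string.
ANSWER = re.compile(r"\[([^]]*)](\{\d+\}|)")

# Assumed variant: the {n} part is optional and only the digits are captured,
# so group 2 is unset (None in Python) when no {n} follows.
VARIANT = re.compile(r"\[([^]]*)](?:\{(\d+)\})?")


def parse(pattern, text):
    """Return a list of (group 1, group 2) tuples, one per match."""
    return [m.groups() for m in pattern.finditer(text)]


print(parse(ANSWER, SAMPLE))
# [('a-z', '{1}'), ('-', ''), ('0-9', '{4}'), ('-', ''), ('ab', '')]
print(parse(VARIANT, SAMPLE))
# [('a-z', '1'), ('-', None), ('0-9', '4'), ('-', None), ('ab', None)]
```

The variant's output lines up with the expected matches listed in the question; in Perl, a group that did not take part in the match is undef rather than None.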