I'm trying to create custom markup. Without an attribute it looks like <mark text mark>, and the match should have one group: text. With an attribute it looks like <mark:attribute text mark>. A colon follows <mark with no space, and the attribute follows the colon with no space. That match should produce two groups: the first is the attribute value after the colon, the second is text.
<mark text mark> must match.
<mark:attribute text mark> must match.
<mark
text
mark> must match.
<mark:attribute
text
mark> must match.
<marktextmark> should not match.
<mark> should not match.
<mark:attributetextmark> should not match.
<mark:attribute textmark> should not match.
<mark: text mark> should not match.
<mark:red ...blah...blah... mark> must match. The first group is red, the second group is ...blah...blah...
<mark Lorem Ipsum mark> must match. The group is Lorem Ipsum.
I think matching may get harder when mark is capitalized, as in <MARK TEXT MARK>. If that doesn't affect the solution, it doesn't matter.
<mark
<mark:attribute
mark>
<mark:attribute text mark>
<mark text mark>
<mark text mark>
Group: text
<mark:attribute text mark>
Group[0]: attribute, Group[1]: text
I tried writing a regex, <mark:([^*].+?)mark>, but I couldn't get any result. I hope I explained this clearly. https://regex101.com/r/jNsM88/1
Thanks for your help.
Group 0 is always the entire match, so capturing groups start at 1. Your targets will be captured in groups 1 and 2 (not 0 and 1 as you wanted).
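A minimal sketch of that numbering, assuming Python's re module (the question does not name a language, and the simplified two-group pattern here is only for illustration):

```python
import re

# Hypothetical, simplified pattern with two capturing groups,
# used only to show how groups are numbered.
pattern = re.compile(r"<mark:(\S+)\s+(\S+)\s+mark>")

m = pattern.search("before <mark:red text mark> after")
assert m is not None

whole = m.group(0)       # group 0: the entire match
attribute = m.group(1)   # group 1: first capturing group
text = m.group(2)        # group 2: second capturing group

# groups() returns only the capturing groups as a tuple, so its
# index 0 holds group 1 and its index 1 holds group 2.
all_groups = m.groups()

print(whole, attribute, text, all_groups)
```

In an API like this one, the tuple returned by groups() is the closest thing to the Group[0] and Group[1] indexing used in the question.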
Use an optional (i.e. with the quantifier ?) non-capturing group, (?:...), for the attribute, and capture non-whitespace (\S):
<mark(?::(\S+))?\s+(\S+)\s+mark>
See live demo.
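To check the pattern against the samples from the question, here is a small sketch, again assuming Python's re module. ANSWER is the pattern exactly as given above. Because (\S+) captures a single run of non-whitespace, it does not match multi-word text such as <mark Lorem Ipsum mark>. VARIANT is one possible adjustment for that case; it is an assumption, not part of the answer above. The helper name groups is also just illustrative.

```python
import re

# Pattern as given in the answer: attribute and text are each
# one run of non-whitespace characters.
ANSWER = re.compile(r"<mark(?::(\S+))?\s+(\S+)\s+mark>")

# Assumed variant (not from the answer): the text must start and end
# with a non-whitespace character but may contain whitespace in between.
# re.DOTALL lets "." cross line breaks inside the text.
VARIANT = re.compile(r"<mark(?::(\S+))?\s+(\S(?:.*?\S)?)\s+mark>", re.DOTALL)


def groups(pattern, sample):
    """Return (attribute, text) if the whole sample is one tag, else None."""
    m = pattern.fullmatch(sample)
    return m.groups() if m else None


samples = [
    "<mark text mark>",
    "<mark:attribute text mark>",
    "<mark\ntext\nmark>",
    "<mark:attribute\ntext\nmark>",
    "<marktextmark>",
    "<mark>",
    "<mark:attributetextmark>",
    "<mark:attribute textmark>",
    "<mark: text mark>",
    "<mark:red ...blah...blah... mark>",
    "<mark Lorem Ipsum mark>",
]

for s in samples:
    # The attribute is None when the optional group did not take part.
    print(repr(s), groups(ANSWER, s), groups(VARIANT, s))
```

When the attribute is absent, group 1 is None and the text is still in group 2, so the text is always read from the same group. Compiling either pattern with re.IGNORECASE as well would also accept <MARK TEXT MARK>.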