I am trying to match a C++ argument type, which can contain balanced < and > characters.
With this regex:
(\<(?>[^<>]|(?R))*\>)
On this string: QMap<QgsFeatureId, QPair<QMap<Something, Complex> >>
It matches everything except the first 4 characters (QMap).
Now, if I add \w+ at the start of my regex, it only matches the end of the string (QPair<QMap<Something, Complex> >>) and not the whole string.
What is the explanation, and how can I solve this?
You can try it online here.
This is intended for use with Perl 5.10+ (5.24).
The (?R) construct recurses the entire pattern. When you add \w+ at the start, it becomes part of what is recursed, so every nested level would also have to begin with \w+. However, what you want to recurse is only the Group 1 subpattern.
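To see the difference in Perl, here is a minimal sketch that runs the question's pattern twice on the sample string: once with (?R), and once with (?1), which recurses only Group 1. The variable names are illustrative and not part of the original post.

```perl
use strict;
use warnings;

# Sample type from the question.
my $type = 'QMap<QgsFeatureId, QPair<QMap<Something, Complex> >>';

# (?R) re-runs the whole pattern, including the leading \w+, so a nested
# level cannot start directly at "<".
my $with_R = $type =~ /\w+(<(?>[^<>]|(?R))*>)/ ? $& : '';

# (?1) re-runs only Group 1, the <...> part, so nesting works again.
my $with_1 = $type =~ /\w+(<(?>[^<>]|(?1))*>)/ ? $& : '';

print "(?R): $with_R\n";
print "(?1): $with_1\n";
print "(?1) covers the whole string\n" if $with_1 eq $type;
```

Only the (?1) variant should cover the whole input; the (?R) variant is limited to a shorter substring.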
You need a subroutine call that will recurse the capturing group subpattern:
(\w+)(<(?:[^<>]++|(?2))*>)
See the regex demo
Details
(\w+) - Group 1 capturing the identifier (you may change it to [a-zA-Z]\w*)
(<(?:[^<>]++|(?2))*>) - Group 2 (that will be recursed)
    < - a literal <
    (?:[^<>]++|(?2))* - zero or more repetitions of either 1+ chars other than < and > (possessively, to make it faster) or (|) the whole Group 2 pattern ((?2))
    > - a literal >
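The breakdown above can also be kept next to the pattern itself with Perl's /x flag, which ignores whitespace and allows comments inside the regex. The sketch below is one possible layout. The \A and \z anchors and the is_balanced_type helper are additions for illustration (they are not in the original answer), used here to check that a whole string is an identifier followed by a balanced block. Anchoring is safe because (?2) recurses only Group 2, not the anchors.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the whole string is an identifier followed
# by one balanced <...> block.
sub is_balanced_type {
    my ($s) = @_;
    return $s =~ /
        \A
        (\w+)               # Group 1: the identifier
        (                   # Group 2: the part that is recursed
            <               #   a literal <
            (?:
                [^<>]++     #   1+ chars other than < and >, possessively
              |
                (?2)        #   or a nested Group 2
            )*
            >               #   a literal >
        )
        \z
    /x ? 1 : 0;
}

my @results = map { is_balanced_type($_) }
    'QMap<int, QList<int> >',   # balanced
    'QMap<int, QList<int>',     # missing a closing >
    'QMap<int>>';               # one > too many

print "@results\n";             # prints "1 0 0"
```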
Results:
Match: QMap<QgsFeatureId, QPair<QMap<Something, Complex> >>
Group 1: QMap
Group 2: <QgsFeatureId, QPair<QMap<Something, Complex> >>
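For reference, a minimal Perl sketch that runs the pattern on the string from the question and prints the match and both groups (the variable names are illustrative):

```perl
use strict;
use warnings;

my $input = 'QMap<QgsFeatureId, QPair<QMap<Something, Complex> >>';

my ($match, $name, $args) = ('', '', '');
if ($input =~ /(\w+)(<(?:[^<>]++|(?2))*>)/) {
    # $& is the whole match; $1 and $2 are the two capturing groups.
    ($match, $name, $args) = ($&, $1, $2);
}

print "Match:   $match\n";
print "Group 1: $name\n";
print "Group 2: $args\n";
```

After the match, $2 holds the value set by the outermost level of the recursion, so it contains the full <...> block rather than the innermost one.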