A common real-world use is to match a balanced set of parentheses.
\((?>[^()]|(?R))*\)
matches a single pair of parentheses with any text in between, including an unlimited number of parentheses, as long as they are all properly paired. (...)
If you want a regex that does not find any matches in a string that contains unbalanced parentheses, then you need to use a subroutine call instead of recursion. (AND THIS IS WHAT I DONT UNDERSTAND ->). If you want to find a sequence of multiple pairs of balanced parentheses as a single match, then you also need a subroutine call.
So using "(?R)" will not produce a "single match". Is it something different than a "single match", something like multiple matches in one match?
Source: https://www.regular-expressions.info/recurse.html
"Matching balanced constructs" part.
If you want to find a sequence of multiple pairs of balanced parentheses as a single match, then you also need a subroutine call.
I believe what this is saying that if you want to match this (.)(.)
- a sequence of multiple pairs of balanced parentheses - then you need to use subroutines. I admit, an example would have helped in the tutorial.
Why will recursion fail to return "as a single match"? This simply because of the way recursion works - it is applying the same regex to an ever decreasing string (if it's not infinitely recursing) and returning a result when it reaches a non-matching element, then proceeding further along the string. It does not match (.)(.)
as a single string, because it is not defined to do so (you can do that using quantifiers: (?>\((?>[^()]|(?R))*\)){2}
or more generally: (?>\((?>[^()]|(?R))*\))*
).
Here is what the pattern presented in the tutorial produces for a sequence of multiple pairs of balanced parentheses
(line 1) and for balanced constructs or nested constructs
(line 2):