.net, regex, balancing-groups

Prevent a regex balancing group from matching beyond the closing parenthesis


I'm using the following regex to match the contents of any datascript script referencing a specific UDF:

\[?shared3\]?\.\[?stringsum\]?(((?'Open'\()[^()]*)+((?'Close-Open'\))[^()]*)+)*

It matches any instance of:

Shared3.StringSum(<some contents here>)

Using the balancing groups, I'm trying to also support cases like:

Shared3.StringSum(SomeOtherMethod('input') + AnotherMethod('input'))

However, I run into trouble when the input looks like this:

Shared3.StringSum(SomeOtherMethod('input') + AnotherMethod('input')) + ThirdMethod('input')

In the last case, my regex also matches the ThirdMethod('input') part.

Is there any way I can alter my regex, so it stops matching as soon as the "parentheses count" is 0?


Solution

  • You may use

    \[?shared3]?\.\[?stringsum]?\(((?>[^()]+|(?'Open'\()|(?'Close-Open'\)))*)\)
    

    See the regex demo

    Details

    • \[?shared3]? - an optional [, shared3, and an optional ]
    • \. - a dot
    • \[?stringsum]? - an optional [, stringsum, and an optional ]
    • \( - a (
    • ((?>[^()]+|(?'Open'\()|(?'Close-Open'\)))*) - Group 1: zero or more occurrences of
      • [^()]+| - 1+ chars other than ( and ), or
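      • (?'Open'\()| - Group "Open": matches a ( and pushes it onto the Open group stack, or
      • (?'Close-Open'\)) - Groups "Close" and "Open": matches a ), pops the most recent ( capture from the Open group stack, and saves the substring between that ( and the current ) into the Close group; this alternative cannot match when the Open stack is empty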
    • \) - a ) char to finish things up
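
    To make the push/pop behaviour described above easier to follow, here is a rough sketch of the same idea in Python. Python's re module has no balancing groups, so the Open stack is emulated with a plain list. The function name find_calls, the PREFIX constant, and the use of case-insensitive matching are assumptions made for this illustration; they are not part of the original pattern. Like the regex, the sketch does not treat parentheses inside string literals specially.

```python
import re

# Prefix taken from the pattern above. IGNORECASE is an assumption: the
# lowercase pattern is meant to match text such as "Shared3.StringSum(".
PREFIX = re.compile(r"\[?shared3\]?\.\[?stringsum\]?\(", re.IGNORECASE)


def find_calls(text):
    """Return (full_match, arguments) pairs for each balanced call found."""
    results = []
    pos = 0
    while True:
        m = PREFIX.search(text, pos)
        if m is None:
            return results
        # Hand-rolled stand-in for the Open group stack; it starts with
        # the index of the outer "(" that the prefix just consumed.
        open_stack = [m.end() - 1]
        i = m.end()
        while i < len(text) and open_stack:
            ch = text[i]
            if ch == "(":
                open_stack.append(i)  # comparable to (?'Open'\()
            elif ch == ")":
                open_stack.pop()      # comparable to (?'Close-Open'\))
            i += 1
        if open_stack:
            # Ran out of text before the count returned to 0: skip this prefix.
            pos = m.end()
            continue
        # Stop right after the ")" that brought the count back to 0.
        results.append((text[m.start():i], text[m.end():i - 1]))
        pos = i


sample = ("Shared3.StringSum(SomeOtherMethod('input') + AnotherMethod('input'))"
          " + ThirdMethod('input')")
for full, args in find_calls(sample):
    print(full)  # Shared3.StringSum(SomeOtherMethod('input') + AnotherMethod('input'))
    print(args)  # SomeOtherMethod('input') + AnotherMethod('input')
```

    The scan stops as soon as the stack is empty, so the trailing + ThirdMethod('input') is left out, which is the behaviour the question asks for.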