My C# regex code:
Regex regex = new Regex(@"\((.*?)\)");
return regex.Matches(str);
...nicely matches all the "paren groups" as in the data block below:
(dirty FALSE)
(composite [txtModel])
(view [star2])
(creationIndex 0)
(creationProps )
(instanceNameSpecified FALSE)
(containsObject nil)
(sName ApplicationWindow)
(txtDynamic FALSE)
(txtSubComposites )
(txtSubObjects )
(txtSubConnections )
But the following block of data throws it off the rails:
([vog317] of ZZconstant
(dirty FALSE)
(composite [gpGame])
(view [nil])
(creationIndex 1)
(creationProps composite !/gpGame sName Constraint4)
(instanceNameSpecified TRUE)
(containsObject ZZconstant)
(sName NoGo_Track_back_Co)
(description "")
(parameters "")
(languageType Prefix)
(explanation "Some sample text here!")
(salience 1)
(condition "
(if (eq ?hoer9_Cl:sName extens)
then
(or (eq ?Starry:sName sb405)
(eq ?Starry:sName sb43)
(eq ?Starry:sName sb455)
(eq ?Starry:sName sb48)
)
)
")
)
Please note the inner-paren group:
(if (eq ?hoer9_Cl:sName extens)
then
(or (eq ?Starry:sName sb405)
(eq ?Starry:sName sb43)
(eq ?Starry:sName sb455)
(eq ?Starry:sName sb48)
)
)
That little sub-block of paren-enclosed data should merely be seen as part of the (condition paren-group, and should not be matched by the regex pattern. The way to exclude it is for the pattern to recognize either of the following 2 exceptions:
1. ( preceded by a tab or space should be excluded from the match.
2. (if followed by any kind of whitespace should be excluded from the match.
So how can I modify my regex pattern \((.*?)\) so that it complies with the above 2 rules? I tried for a while in Regex Storm, but I'm too much of a beginner with regex to work it out.
You could use the pattern that you tried, and add lookarounds for the logic in the 2 exceptions listed:
(?<![ \t])\((?!if\s)(.*?)\)
Explanation
(?<![ \t])    Negative lookbehind (1st point): assert that what is directly to the left is not a space or tab
\(            Match (
(?!if\s)      Negative lookahead (2nd point): assert that what is directly to the right is not if followed by a whitespace char
(.*?)         Capture group 1: match any char except a newline, non-greedy
\)            Match )
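As a rough illustration of how the two lookarounds behave, here is a minimal sketch in C#. The class name LookaroundDemo, the helper FindGroups, and the sample input are assumptions made up for this example; they are not taken from the question's code or data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Pattern from the answer: skip "(" that follows a space or tab,
    // and skip "(" that is followed by "if" plus a whitespace char.
    private static readonly Regex GroupPattern =
        new Regex(@"(?<![ \t])\((?!if\s)(.*?)\)");

    // Hypothetical helper: returns the text of capture group 1 for every match.
    public static List<string> FindGroups(string input)
    {
        var results = new List<string>();
        foreach (Match m in GroupPattern.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Made-up sample: two plain groups, one "(" after a space, one "(if ".
        string sample = "(dirty FALSE)\n(view [star2])\n (eq a b)\n(if x)";
        foreach (string group in FindGroups(sample))
        {
            Console.WriteLine(group);
        }
        // Prints:
        // dirty FALSE
        // view [star2]
    }
}
```

The third line is skipped because its ( follows a space, and the fourth because ( is followed by if and a space.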
If matching between opening and closing parenthesis can span multiple lines, you could also use a negated character class [^()] in place of the dot, since it matches newlines as well:
(?<![ \t])\((?!if\s)([^()]*)\)
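A small sketch of the negated character class variant, again in C#, may help show what it does and does not match. The class name NegatedClassDemo, the helper FindGroups, and the sample input are hypothetical and only serve this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Same lookarounds as before, but [^()]* replaces .*? so that the
    // captured text may contain newlines (and may not contain parentheses).
    private static readonly Regex GroupPattern =
        new Regex(@"(?<![ \t])\((?!if\s)([^()]*)\)");

    // Hypothetical helper: returns the text of capture group 1 for every match.
    public static List<string> FindGroups(string input)
    {
        var results = new List<string>();
        foreach (Match m in GroupPattern.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Made-up sample: a group spanning two lines, a one-line group,
        // and a group that contains a nested, space-indented group.
        string sample =
            "(explanation \"line one\nline two\")\n" +
            "(salience 1)\n" +
            "(condition \"\n (eq a b)\n\")";

        List<string> groups = FindGroups(sample);
        Console.WriteLine(groups.Count); // 2
        foreach (string group in groups)
        {
            // Show embedded newlines as a visible \n marker.
            Console.WriteLine(group.Replace("\n", "\\n"));
        }
    }
}
```

In this sample the two-line (explanation ...) group is matched as a whole. The (condition ...) group is not matched at all, because [^()] cannot cross the nested parenthesis, and the inner (eq a b) is skipped by the lookbehind. Whether skipping such a group is acceptable depends on what the data is needed for.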