I have this pattern (?<!')(\w*)\((\d+|\w+|.*,*)\)
that is meant to match strings like:
c(4)
hello(54, 41)
Following some answers on SO, I added a negative lookbehind so that if the input string is preceded by a ', the string shouldn't match at all. However, it still partially matches.
For example:
'c(4)
returns (4)
even though it shouldn't match anything because of the negative lookbehind.
How do I make it so that if a string is preceded by ', NOTHING matches?
Since nobody came along, I'll throw this out to get you started.
This regex will match things like
aa(a , sd,,,f,)
aa( as , " ()asdf)) " ,, df, , )
asdf()
but not
'ab(s)
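This will fix the basic problem: (?<!['\w])\w*
The lookbehind (?<!['\w]) rejects any start position that follows a quote or a word character,
so the engine can't skip over part of the name just to reach a position that isn't preceded by a quote.
The optional \w* then grabs the whole function name.
So if a quote comes right before the name, as in 'aaa(, the name and its opening parenthesis can't start a match.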
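To see the difference concretely, here is a minimal sketch in Python (my choice for illustration; the question doesn't name an engine). It runs the pattern from the question and the same pattern with the widened lookbehind against the question's inputs. The names ORIGINAL, FIXED and first_match are made up for this example.

```python
import re

# Pattern from the question: the lookbehind only rejects a preceding quote.
ORIGINAL = re.compile(r"(?<!')(\w*)\((\d+|\w+|.*,*)\)")
# Same pattern with the lookbehind widened to reject a word character too.
FIXED = re.compile(r"(?<!['\w])(\w*)\((\d+|\w+|.*,*)\)")


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


for text in ["c(4)", "hello(54, 41)", "'c(4)"]:
    print(repr(text), first_match(ORIGINAL, text), first_match(FIXED, text))
```

With the original pattern the attempt starting at c fails (a quote is behind it), so the engine retries one character later at (. The character behind that position is c, not a quote, and the empty \w* lets (4) match. The widened lookbehind rejects that position as well.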
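The regex below expands on what I think you are trying to accomplish
in the function body part of your regex.
It uses (?(DEFINE)...) and (?&variable), which are PCRE/Perl-style syntax, so it needs an engine that supports them.
It might be a little overwhelming at first.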
(?s)(?<!['\w])(\w*)\(((?:,*(?&variable)(?:,+(?&variable))*[,\s]*)?)\)(?(DEFINE)(?<variable>(?:\s*(?:"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*')\s*|[^()"',]+)))
Readable version (via: http://www.regexformat.com). The whitespace and # comments are only ignored in extended (x) mode.
(?s)                          # Dot-all modifier
(?<! ['\w] )                  # Not a quote, nor word behind
                              # <- This will force matching a complete function name
                              #    if it exists, thereby blocking a preceding quote '
( \w* )                       # (1), Function name (optional)
\(
(                             # (2 start), Function body
     (?:                           # Parameters (optional)
          ,*                            # Comma (optional)
          (?&variable)                  # Function call, get first variable (required)
          (?:                           # More variables (optional)
               ,+                            # Comma (required)
               (?&variable)                  # Variable (required)
          )*
          [,\s]*                        # Whitespace or comma (optional)
     )?                            # End parameters (optional)
)                             # (2 end)
\)
# Function definitions
(?(DEFINE)
     (?<variable>                  # (3 start), Function for a single Variable
          (?:
               \s*
               (?:                           # Double or single quoted string
                    "
                    [^"\\]*
                    (?: \\ . [^"\\]* )*
                    "
                 |
                    '
                    [^'\\]*
                    (?: \\ . [^'\\]* )*
                    '
               )
               \s*
            |                             # or,
               [^()"',]+                     # Not quote, paren, comma (can be whitespace)
          )
     )                             # (3 end)
)
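If you want to try this somewhere without (?(DEFINE)...) support, here is a rough port to Python's built-in re module, which (as far as I know) has neither (?(DEFINE)...) nor (?&name). The sketch simply pastes the variable sub-pattern in twice via string concatenation. VARIABLE, CALL and samples are names chosen for this example, not part of the regex above. Some engines treat a subroutine call as atomic, which inlining doesn't reproduce, so backtracking could differ in edge cases.

```python
import re

# One "variable": a quoted string (backslash escapes allowed) with optional
# surrounding whitespace, or a run of characters that are not parens, quotes,
# or commas. This mirrors the (?<variable>...) definition.
VARIABLE = r"""(?:\s*(?:"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*')\s*|[^()"',]+)"""

# The stdlib `re` has no (?&variable), so the sub-pattern is inlined twice.
CALL = re.compile(
    r"(?<!['\w])(\w*)\(((?:,*" + VARIABLE + r"(?:,+" + VARIABLE + r")*[,\s]*)?)\)",
    re.DOTALL,  # same effect as the leading (?s)
)

samples = [
    "aa(a , sd,,,f,)",
    'aa( as , " ()asdf)) " ,, df, , )',
    "asdf()",
    "'ab(s)",
]
for text in samples:
    m = CALL.search(text)
    # Print (function name, body) for a match, or None when nothing matches.
    print(repr(text), "->", m.groups() if m else None)
```

The first three samples match in full and the last one yields None, which is the behaviour listed at the top of this answer.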