Tags: python, regex, regex-lookarounds

Regex match with negative lookbehind, recursive pattern and negative lookahead


I need to match this:

void function{ {  {  } }}   

(a function definition with balanced braces), but not this:

static stTLookupTable RxTable[MAX]={
     
    {zero, one},{zero, one},{zero, one}};

I have tried lookarounds with (?<![[=])({((?>[^{}]+|(?R))*)})(?!;), but this matches {zero, one} in the variable declaration.

(?<![[=]){((?>[^{}]+|(?R))*)}[^;]$ doesn't work either.

In short, I need to match the function definition but not the array declaration, assuming an array initialization starts with ]=. Does anyone know how to match the function definition alone?

PS: {((?>[^{}]+|(?R))*)} matches balanced braces.
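As a quick check of the pattern in the PS, the sketch below runs it over the sample text. It assumes the third-party PyPI regex module (the standard re module has no (?R) recursion); the braces are escaped for readability and the variable names are only illustrative. Run on its own, the balanced-brace pattern appears to match the whole array initializer as well as the function body, which is why an extra condition is needed.

```python
import regex  # third-party PyPI package; the stdlib re module has no (?R) recursion

text = """void function{ {  {  } }}
static stTLookupTable RxTable[MAX]={

    {zero, one},{zero, one},{zero, one}};"""

# Balanced-brace pattern from the PS, with the braces escaped for clarity.
balanced = regex.compile(r'\{(?>[^{}]+|(?R))*\}')

# Collect every outermost balanced {...} block found in the text.
blocks = [m.group(0) for m in balanced.finditer(text)]
for block in blocks:
    print(repr(block))
```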


Solution

  • Assuming you are using the PyPI regex module, you can use:

    import regex  # third-party PyPI package (pip install regex)
    text = """void function{ {  {  } }}   
    static stTLookupTable RxTable[MAX]={
         
        {zero, one},{zero, one},{zero, one}};"""
    
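    # Alternative 1 matches "= {...}" initializers and discards them via (*SKIP)(*F);
    # alternative 2 matches any remaining balanced {...} block (Group 3 = its contents).
    pattern = r'=\s*({(?>[^{}]+|(?1))*})(*SKIP)(*F)|({((?>[^{}]+|(?2))*)})'
    print([m.group(3) for m in regex.finditer(pattern, text)])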
    # => [' {  {  } }']
    


    Details:

    • =\s*({(?>[^{}]+|(?1))*})(*SKIP)(*F):
      • = - a = char
      • \s* - zero or more whitespaces
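      • ({(?>[^{}]+|(?1))*}) - Group 1: a {...} substring with balanced curly braces, where (?1) recurses into Group 1
      • (*SKIP)(*F) - forces the match to fail and makes the regex engine resume the search at the current position, so the text consumed so far is skipped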
    • | - or
    • ({((?>[^{}]+|(?2))*)}) - Group 2 (technical, used for recursion):
      • {((?>[^{}]+|(?2))*)} - matches a {...} substring with balanced curly braces; the inner parentheses form Group 3, which captures the text between the outermost braces.
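The (*SKIP)(*F) idiom from the list above can be seen in isolation on a smaller input. This is a minimal sketch, assuming the PyPI regex module; the sample string and the choice of quoted text as the part to discard are made up for illustration.

```python
import regex  # third-party PyPI package

sample = 'a 12 "34" 56'

# Left alternative: consume a quoted string, then discard it with (*SKIP)(*F).
# Right alternative: what we actually want, a run of digits.
with_skip = regex.findall(r'"[^"]*"(*SKIP)(*F)|\d+', sample)

# Without the left alternative, the digits inside the quotes are matched too.
without_skip = regex.findall(r'\d+', sample)

print(with_skip)     # ['12', '56']
print(without_skip)  # ['12', '34', '56']
```

The answer's pattern follows the same shape: the left alternative consumes what should be ignored (an initializer after =), and the right alternative matches what should be kept.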

    You need to return Group 3 from each match, since Group 2 also includes the enclosing braces.
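For reuse, the same pattern could be wrapped in a small helper. The sketch below assumes the PyPI regex module; the names FUNCTION_BODY and function_bodies are hypothetical and not part of any library. The pattern is the one from the answer, only split across lines.

```python
import regex  # third-party PyPI package

# Same pattern as in the answer, split up with implicit string concatenation.
FUNCTION_BODY = regex.compile(
    r'=\s*({(?>[^{}]+|(?1))*})(*SKIP)(*F)'  # "= {...}" initializer: match, then discard
    r'|'
    r'({((?>[^{}]+|(?2))*)})'               # any other balanced {...}; Group 3 = contents
)

def function_bodies(source):
    """Return the text between the outermost braces of each block not preceded by '='."""
    return [m.group(3) for m in FUNCTION_BODY.finditer(source)]

text = """void function{ {  {  } }}
static stTLookupTable RxTable[MAX]={

    {zero, one},{zero, one},{zero, one}};"""

print(function_bodies(text))
```

Note that this only keys on the = before the opening brace, so any other brace block that is not an initializer (a struct body, for instance) would still be returned.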