python regex regex-group regex-greedy

Getting function content and function name in C with a regular expression in Python


I am trying to get a function's content (body) if the function's name matches a defined pattern.

What I have tried so far:

(Step 1) Get all function bodies in a given C file with a recursive pattern: {(?:[^{}]+|(?R))*+}

(Step 2) Find all matches of the wanted function's name.

(Step 3) Combine both steps. This is where I am struggling.

Input

TASK(arg1)
{
    if (cond)
    {
      /* Comment */
      function_call();
      if(condIsTrue)
      {
         DoSomethingelse();
      }
    }
    if (cond1)
    {
      /* Comment */
      function_call1();
    }
}


void FunctionIDoNotWant(void)
{
    if (cond)
    {
      /* Comment */
      function_call();
    }
    if (cond1)
    {
      /* Comment */
      function_call1();
    }
}

I am looking for the function TASK. When I add the regex to match TASK in front of "{(?:[^{}]+|(?R))*+}", nothing works.

(TASK\s*\(.*?\)\s)({((?>[^{}]+|(?R))*)})

Desired Output

Group1:
   TASK(arg1)
Group2:
    if (cond)
    {
      /* Comment */
      function_call();
      if(condIsTrue)
      {
         DoSomethingelse();
      }
    }
    if (cond1)
    {
      /* Comment */
      function_call1();
    }

Solution

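  • You are recursing the whole pattern with (?R), which is the same as (?0), whereas you want to recurse into (?2), the second group. Group one contains your (TASK...) part, so with (?R) every nested { ... } block would itself have to start with a TASK(...) header, and the nested blocks in your input never match.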

    See this demo at regex101

    (TASK\s*\(.*?\)\s)({((?>[^{}]+|(?2))*)})
                      ^ here starts the second group -> recursion with (?2)
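
Since the question is tagged python, here is a minimal sketch of how the corrected pattern could be used there. Python's built-in re module has no recursion support, so (?2) needs the third-party regex package. The sketch uses it when it is installed and otherwise falls back to a plain brace-depth scan, which is a different technique that gives the same result on this input. The function name extract_task and the shortened SAMPLE text are made up for illustration, and the literal braces are escaped in the pattern as a precaution. Neither path knows about braces inside C comments or string literals.

```python
import re

try:
    import regex  # third-party package (pip install regex); optional here
except ImportError:
    regex = None

# The pattern from above with the literal braces escaped. (?2) recurses
# into group 2 only, so nested blocks do not need a TASK(...) header.
RECURSIVE = r"(TASK\s*\(.*?\)\s)(\{((?>[^{}]+|(?2))*)\})"


def extract_task(source):
    """Return (header, body) of the first TASK(...) block, or None."""
    if regex is not None:
        m = regex.search(RECURSIVE, source)
        # Group 2 is the block with its braces, group 3 the text inside.
        return (m.group(1).strip(), m.group(3)) if m else None

    # Fallback without recursion: find the header, then count brace depth.
    m = re.search(r"(TASK\s*\(.*?\)\s)\{", source)
    if m is None:
        return None
    start = m.end()  # first character after the opening brace
    depth = 1
    for i in range(start, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return m.group(1).strip(), source[start:i]
    return None  # unbalanced braces


# Hypothetical, shortened version of the input from the question.
SAMPLE = """TASK(arg1)
{
    if (cond)
    {
      function_call();
    }
}

void FunctionIDoNotWant(void)
{
    other();
}
"""

result = extract_task(SAMPLE)
if result is not None:
    header, body = result
    print(header)
    print(body)
```

Note that in this pattern group 2 is the whole block including its outer braces. The text between the braces, which the desired output lists under Group2, ends up in group 3.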