python, regex, regex-lookarounds

Regex: match a function if and only if it contains a specific variable


I have this string that I have read from a file.

    /**********************************************************************functionheaderstuff***********************************************************************************************************************/
    void fn1(void)
    {
        b= 8;
    }

    /***********************************************************************functionheaderstuff***********************************************************************************************************************/

    void fn2(int a, intb)
    {   int c;

        var = 6;
    }

I want to match only the function in which the variable var is written to (assigned a value).

With this regex

    (?<=[*]{60}\/)(\s*\w+(?: \w+\s*)(?=(\((.*?)\)\s*{))).*(\bvar\b([^>=<!;{])*[=]{1}[^=]*?[;])

both functions are matched, because the .* is greedy. I need the match to fail if it encounters }\s*\/[*]{60}, and to succeed only if the function contains a write to the variable, preferably only when that write is not inside a comment.

A negative lookahead didn't work:

    ((?<=[*]{60}\/)\s*\w+(?: \w+\s*)(?=(\((.*?)\)\s*{)).*)(?!(\s*}\s*\/[*]{60}))(\bvar\b([^>=<!;{])*[=]{1}[^=]*?[;])

Every function starts with the kind of header comment shown above. Finding the write to the variable and identifying the function both work. The regex works fine if the function contains var; otherwise the match runs on into the next function. What am I doing wrong here?


Solution

  • I changed my approach and got what I needed. I first removed all the comments, then matched each function in the string and collected the matches into a list, and finally checked each function for a write to the variable. Python code:

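    import regex  # third-party module (pip install regex); the stdlib re has no (?1) recursion

    text = """void fn1(void){       b= 8;} void fn2(int a, intb){   int c;    var = 6;}"""

    # Matches a complete function definition; (?1) recurses into nested braces
    string1 = r'\w+(?:\s+\w+)*\w+[(][\w,\.*\s&\[\]]*[)]\s*({(?:[^{}]++|(?1))*})'

    # Intended to match a write to var while skipping ==, <=, >= and != comparisons
    varwrite = r'\bvar\b([^>=<!;{])*[=]{1}[^=]*?[;]'
    reg = regex.compile(varwrite, regex.MULTILINE)

    # One string per function found in the text
    y = [x.group() for x in regex.finditer(string1, text, regex.DOTALL)]
    print(y)

    for func in y:
        if reg.search(func):
            # The function name is the last word before the first '('
            z = func.split('(', 1)[0]
            print(z.split()[-1])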
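
The code above starts from text whose comments have already been removed; the removal step itself is not shown. A minimal sketch of that step, using only the standard re module, might look like the following. It assumes comments are either /* ... */ blocks or // line comments and that no string literal in the source contains "/*" or "//". The names COMMENT, strip_comments and sample are illustrative and not part of the original answer.

```python
import re

# Assumption: comments are /* ... */ blocks or // line comments, and no string
# literal contains "/*" or "//" (handling that would need a real C lexer).
COMMENT = re.compile(r'/\*.*?\*/|//[^\n]*', re.DOTALL)

def strip_comments(source: str) -> str:
    """Replace every comment with a single space so adjacent tokens stay separated."""
    return COMMENT.sub(' ', source)

# Hypothetical input with shortened header comments and a commented-out write
sample = """/**** functionheaderstuff ****/
void fn1(void)
{
    b = 8;   /* var = 1; */
}
/**** functionheaderstuff ****/
void fn2(int a, int b)
{   int c;
    var = 6; // write to var
}
"""

cleaned = strip_comments(sample)
print(cleaned)

# Same write-detection pattern as in the answer, compiled with the stdlib re
varwrite = re.compile(r'\bvar\b([^>=<!;{])*[=]{1}[^=]*?[;]')
writes_before = len(list(varwrite.finditer(sample)))   # includes the commented-out write
writes_after = len(list(varwrite.finditer(cleaned)))   # only real code remains
print(writes_before, writes_after)
```

Running the function matching on cleaned rather than on the raw file text keeps a commented-out assignment such as the one in fn1 from being reported as a write.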