python regex conditional-compilation

parse nested conditional statements


I need to parse a file that contains conditional statements, sometimes nested inside one another.

I have a file that stores configuration data, but the data is slightly different depending on user-defined options. I can deal with the conditions themselves (they're all plain booleans with no operators), but I don't know how to recursively evaluate the nested conditionals. For instance, a piece of the file might look like:

...
#if CELSIUS
    #if FROM_KELVIN ; this is a comment about converting kelvin to celsius.
        temp_conversion = 1, 273
    #else
        temp_conversion = 0.556, -32
    #endif
#else
    #if FROM_KELVIN
        temp_conversion = 1.8, -255.3
    #else
        temp_conversion = 1.8, 17.778
    #endif
#endif
...

Also, some conditionals don't have an #else branch: just #if CONDITION, statement(s), #endif.

I realize that this could be easy if the file were just written in XML or something else with a nice parser to begin with, but this is what I have to work with, so I'm wondering if there's any relatively simple way to parse this file. It's similar to parenthesis matching, so I imagine there would be some module for it, but I haven't found anything.

I'm working in Python, but I can switch languages for this function if it's easier to solve in another one.


Solution

  • Since all of the conditions are binary and I know the values of all of them in advance (no need to evaluate them in order like a programming language), I was able to do it with a regular expression. This works better for me. It finds the lowest-level conditionals (ones with no nested conditions), evaluates them, and replaces them with the correct contents. It then repeats for the higher-level conditionals, and so on.

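    import re
    
    # Names of the conditions that are true; any other name is treated as false.
    conditions = {'CELSIUS', 'FROM_KELVIN'}
    
    # Match only an innermost block: the lookahead runs before every character
    # of the body, so the body can contain neither another #if nor an #endif.
    # (Checking after each character instead lets an outer #if pair up with an
    # inner #endif whenever the body itself starts with '#if'.)
    pattern = re.compile(r'#if\s+(\S+)((?:(?!#if|#endif).)*)#endif', re.DOTALL)
    
    def eval_conditional(matchobj):
        name, body = matchobj.groups()
        # partition() leaves the else branch empty when there is no #else
        if_branch, _, else_branch = body.partition('#else')
        return if_branch if name in conditions else else_branch
    
    def parse(text):
        # Each pass resolves the current innermost blocks; repeat until none remain.
        while pattern.search(text):
            text = pattern.sub(eval_conditional, text)
        return text
    
    if __name__ == '__main__':
        with open('input.txt') as f:
            # drop blank lines and strip ';' comments
            lines = [x.split(';')[0].rstrip() for x in f if x.strip()]
        with open('output.txt', 'w') as f:
            f.write(parse('\n'.join(lines) + '\n'))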
    

    Given the input in the original post, it outputs:

    ...
            temp_conversion = 1, 273
    
    ...
    

    which is what I need. Thanks to everyone for their responses, I really appreciate the help!
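
Since the question compares this to parenthesis matching, here is a rough alternative sketch that walks the file once with a stack instead of rewriting it repeatedly with a regex. It is not the method above, just another way to get the same result. The names `evaluate`, `defined`, and `SAMPLE` are made up for illustration, and the sketch assumes one directive per line and `;` comments, as in the question.

```python
def evaluate(text, defined):
    """Return only the lines whose enclosing #if/#else branches are active.

    `defined` is the set of condition names that are true (an assumption of
    this sketch: every other name counts as false).
    """
    output = []
    stack = []      # one (parent_active, condition_true) pair per open #if
    active = True   # are we currently inside a branch that should be kept?
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(';', 1)[0].rstrip()   # strip ';' comments
        words = line.split()
        directive = words[0] if words else ''
        if directive == '#if':
            if len(words) != 2:
                raise ValueError(f'line {lineno}: expected "#if NAME"')
            cond = words[1] in defined
            stack.append((active, cond))
            active = active and cond
        elif directive == '#else':
            if not stack:
                raise ValueError(f'line {lineno}: #else without #if')
            parent_active, cond = stack[-1]
            active = parent_active and not cond
        elif directive == '#endif':
            if not stack:
                raise ValueError(f'line {lineno}: #endif without #if')
            active, _ = stack.pop()            # restore the enclosing state
        elif active and line.strip():
            output.append(line)
    if stack:
        raise ValueError('missing #endif')
    return '\n'.join(output)


SAMPLE = """\
#if CELSIUS
    #if FROM_KELVIN ; this is a comment about converting kelvin to celsius.
        temp_conversion = 1, 273
    #else
        temp_conversion = 0.556, -32
    #endif
#else
    #if FROM_KELVIN
        temp_conversion = 1.8, -255.3
    #else
        temp_conversion = 1.8, 17.778
    #endif
#endif
"""

if __name__ == '__main__':
    print(evaluate(SAMPLE, {'CELSIUS', 'FROM_KELVIN'}))
```

The stack is what makes the nesting work: each `#if` pushes the state of its parent, so an inner block can never become active while its outer block is skipped, and a missing `#else` needs no special handling. Unlike the regex version, it also reports unbalanced `#if`/`#endif` pairs instead of silently leaving them in the output.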