
Simplifying equations using regex in Python


I want to write a function in Python that takes any multivariable quadratic equation and simplifies it as much as possible. For example, the following equation:

    (24*x + 1 + 9) / (((8 * 6 * 2) + 1) * y) = 3 + 9 + (3 * (8 + 6)) * 18.56

should be converted to:

    (24*x + 10)/(97*y) = 791.52

I was able to identify the variables with a regex in Python, but that did not help much with simplifying the equation:

    r'[a-zA-Z]+\d*'

I also wrote code that uses a regex to pick out the numbers and mathematical symbols, separating them from the variables so that they can be simplified with the eval function. But this regex does not work properly:

    r'\(?([-+/*]?\d+\.\d+|\d+[-+/*]?)+\)?'

For example, instead of matching the whole right-hand side 3 + 9 + (3 * (8 + 6)) * 18.56 of the initial equation, it returned the parts ['9+', '3*', '6', '*18.56']. If anyone knows what is wrong with this regex, or knows a better way to simplify such equations, please tell me. I don't want to use additional libraries like sympy; I want to write this program myself.
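For reference, the reported list can be reproduced with the short sketch below. It assumes the spaces were stripped from the input first, since that is what yields exactly this result. Two things seem to be at play: re.findall returns only the last repetition of a capture group rather than the whole match, and Python's re module has no recursion, so a single pattern cannot follow nested parentheses.

```python
import re

pattern = r'\(?([-+/*]?\d+\.\d+|\d+[-+/*]?)+\)?'
# Assumption: the right-hand side with its spaces removed
rhs = "3+9+(3*(8+6))*18.56"

# findall returns the capture group, i.e. only its LAST repetition per match
groups = re.findall(pattern, rhs)
print(groups)  # ['9+', '3*', '6', '*18.56']

# The full matches show that the text is still cut at every nested parenthesis
full = [m.group(0) for m in re.finditer(pattern, rhs)]
print(full)    # ['3+9+', '(3*', '(8+6)', '*18.56']
```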


Solution

  • One approach is to simply call eval on the input string, after preparing the environment so that the evaluation gives you the desired result. This is possible because:

    • The names of the variables (x, y) can be defined as "smart" objects
    • Python offers ways to override operators so that you can define what it means to add/subtract/multiply/divide/compare such objects
    • You can use __str__ to convert the evaluated object to the desired output format
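    As a minimal sketch of these three points, the hypothetical class Sym below (not part of the code further down) builds a string instead of computing a number. It only illustrates how eval, operator overloading and __str__ fit together; plain numbers such as 1 + 9 are still folded by Python itself.

```python
class Sym:
    """Hypothetical "smart" object that records operations as text."""
    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Sym(f"({self} + {other})")

    def __radd__(self, other):
        return Sym(f"({other} + {self})")

    def __mul__(self, other):
        return Sym(f"({self} * {other})")

    def __rmul__(self, other):
        # Reached for `24 * x`, because int.__mul__ returns NotImplemented
        return Sym(f"({other} * {self})")

    def __str__(self):
        return self.text


x = Sym("x")
# The name "x" in the string resolves to the Sym object passed in the namespace
expr = eval("24 * x + (1 + 9)", {"x": x})
print(expr)  # ((24 * x) + 10)
```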

    To create the variables of the first step dynamically, it is handy to store their names as keys in a dictionary, each mapped to its "smart" object. The input string is then rewritten so that each name becomes a lookup in that dictionary.

    As = is not the equality operator in Python syntax, every occurrence of it should be replaced with "==".
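    The two string rewrites described above can be tried in isolation. The sketch below uses a shortened input of my own choosing and only prints the rewritten text, without evaluating it:

```python
import re

s = "(24*x + 1) / y = 3 + 9"

# Replace a lone "=" with "==", leaving an existing "==" untouched
s = re.sub(r"(?<!=)=(?!=)", "==", s)

# Collect the variable names, then turn each one into a dictionary lookup
names = sorted(set(re.findall(r"[a-zA-Z]\w*", s)))
rewritten = re.sub(r"[a-zA-Z]\w*", r"d['\g<0>']", s)

print(names)      # ['x', 'y']
print(rewritten)  # (24*d['x'] + 1) / d['y'] == 3 + 9
```

    The rewritten string is valid Python as long as a dictionary named d is in scope when eval runs.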

    Here is code you could start with. It would need extension if you need to support more operators, or need smarter simplification rules to be applied, but this will already deal with the example you have provided:

    class Operator:
        symbol = "?"  # Subclass should override this
        order = 100   # Subclass should override this (operator precedence) 
        
        def __init__(self, left, right=None):
            self.left = left
            self.right = right
    
        @staticmethod
        def wrap(expr, order=0):
            return str(expr)
        
        def __mul__(self, other):
            return Product(self, other)
    
        __rmul__ = __mul__
        
        def __truediv__(self, other):
            return Quotient(self, other)
    
        def __rtruediv__(self, other):
            return Quotient(other, self)
    
        def __add__(self, other):
            return Sum(self, other)
    
        __radd__ = __add__
    
        def __sub__(self, other):
            return Difference(self, other)
    
        def __rsub__(self, other):
            return Difference(other, self)
    
        def __eq__(self, other):
            return Equality(self, other)
    
        def __str__(self):
            return f"{self.wrap(self.left, self.order)} {self.symbol} {self.wrap(self.right, self.order)}"
    
    class BinaryOperator(Operator):
        @staticmethod
        def wrap(expr, order=0):
            # Wrap in parentheses if this operator-expression is used as an operand
            #    of an operator that has higher precedence (lower order)
            if isinstance(expr, BinaryOperator) and expr.order > order:
                return f"({expr})"
            return str(expr)
    
    
    class Variable(Operator):
        order = 0
        def __str__(self):
            return self.left
        
    
    class Product(BinaryOperator):
        order = 10
        symbol = "*"
        def __init__(self, a, b):
            if isinstance(a, Product):  # Rotate tree to potentially simplify
                a, b = a.left, a.right * b
            super().__init__(a, b)
            
    class Sum(BinaryOperator):
        order = 20
        symbol = "+"
        def __init__(self, a, b):
            if isinstance(a, Sum):  # Rotate tree to potentially simplify
                a, b = a.left, a.right + b
            super().__init__(a, b)
    
    class Difference(BinaryOperator):
        order = 20
        symbol = "-"
    
    class Quotient(BinaryOperator):
        order = 10
        symbol = "/"
    
    class Equality(BinaryOperator):
        order = 30
        symbol = "="
    
    
    import re
    def simplify(s):
        # Replace single "=" with double ("==")
        s = re.sub(r"(?<!=)=(?!=)", "==", s)
        # Identify all variable names
        regex = r"[a-zA-Z]\w*"
        # Map each name to a Variable instance
        d = { name: Variable(name) for name in re.findall(regex, s) }
        # Rewrite each variable name as a lookup in this dict
        s = re.sub(regex, r"d['\g<0>']", s)
        return eval(s)
    
    
    # demo
    s = "(24*x + 1 + 9) / (((8 * 6 * 2) + 1) * y) = 3 + 9 + (3 * (8 + 6)) * 18.56"
    print(simplify(s)) # (x * 24 + 10) / y * 97 = 791.52
    

    This is just a start. You'll need to add simplification rules as you tailor this solution to your needs.
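    One extension you may want early on concerns the parentheses in the output. The demo prints "/ y * 97", which under the usual precedence rules reads as a division by y followed by a multiplication, although the expression tree holds a division by the whole product. A possible rule is to also wrap the right operand of "/" and "-" when it has the same precedence. The stand-alone sketch below illustrates that rule; Node, PRECEDENCE and NON_ASSOCIATIVE are hypothetical names, not part of the code above.

```python
# Lower number binds tighter, matching the "order" values used above
PRECEDENCE = {"*": 10, "/": 10, "+": 20, "-": 20}
# For these operators the right operand needs parentheses at equal precedence
NON_ASSOCIATIVE = {"/", "-"}


class Node:
    """Hypothetical binary expression node."""
    def __init__(self, symbol, left, right):
        self.symbol, self.left, self.right = symbol, left, right

    def __str__(self):
        order = PRECEDENCE[self.symbol]
        left = wrap(self.left, order)
        right = wrap(self.right, order, strict=self.symbol in NON_ASSOCIATIVE)
        return f"{left} {self.symbol} {right}"


def wrap(expr, order, strict=False):
    # Parenthesize a looser-binding operand, and in strict mode also
    # an operand of equal precedence
    if isinstance(expr, Node):
        inner = PRECEDENCE[expr.symbol]
        if inner > order or (strict and inner == order):
            return f"({expr})"
    return str(expr)


tree = Node("/", Node("+", Node("*", "x", 24), 10), Node("*", "y", 97))
print(tree)  # (x * 24 + 10) / (y * 97)
```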