python machine-learning sympy symbolic-math gplearn

How to export the output of gplearn as a sympy expression or some other readable format?


Simple as this task may sound, I have not found a way to do it through the documentation.

After running an arbitrary routine (such as one of these examples), I get something like

>>> print(est_gp)
sqrt(div(add(1.000, sub(div(sqrt(log(0.978)), X0), mul(-0.993, X0))),add(-0.583, 0.592)))

How do I (or can I even) convert this to an expression that can be used outside gplearn, like a sympy expression?


Solution

  • You can make it into a SymPy expression with sympify. This requires providing a dictionary so that things like add, mul, sub, div are interpreted correctly by SymPy:

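    from sympy import Add, Mul, Lambda, symbols, sympify

    # Dummy arguments used to define the two-argument wrappers below
    x, y = symbols("x y")

    locals = {
        "add": Add,
        "mul": Mul,
        "sub": Lambda((x, y), x - y),
        "div": Lambda((x, y), x/y)
    }
    
    sympify('sqrt(div(add(1.000, sub(div(sqrt(log(0.978)), X0), mul(-0.993, X0))), add(-0.583, 0.592)))', locals=locals)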
    

    This returns a SymPy expression, which prints as

    sqrt(110.333333333333*X0 + 111.111111111111 + 16.5721799259414*I/X0)
    

    The symbol X0 can then be accessed as Symbol("X0"). A more robust approach is to state explicitly what the symbols are, by creating them ahead of time and adding them to the dictionary:

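    from sympy import Add, Mul, Lambda, symbols

    x, y = symbols("x y")   # dummy arguments for the Lambda wrappers
    X0 = symbols("X0")      # the variable that appears in the gplearn output
    locals = {
        "add": Add,
        "mul": Mul,
        "sub": Lambda((x, y), x - y),
        "div": Lambda((x, y), x/y),
        "X0": X0
    }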
    

    This is needed, for example, when a variable is named I: declared explicitly, it is parsed as the symbol "I" rather than as the imaginary unit, which is what SymPy would do by default.
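
To illustrate the point about I, here is a minimal sketch. The program string "add(I, 1.000)" is a hypothetical gplearn output for a feature that happens to be named I; it is not taken from the question.

```python
from sympy import Add, Symbol, sympify

# Hypothetical gplearn-style output with a feature named "I"
program = "add(I, 1.000)"

# Default parsing: "I" becomes SymPy's imaginary unit, so the
# expression contains no free symbol at all.
default_expr = sympify(program, locals={"add": Add})

# Explicit parsing: "I" is declared as an ordinary symbol up front.
feature_I = Symbol("I")
explicit_expr = sympify(program, locals={"add": Add, "I": feature_I})

print(default_expr.free_symbols)
print(explicit_expr.free_symbols)
```

Only the second expression can be evaluated for different values of the feature, since in the first one I is a constant.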

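    I'm not happy about the evaluation of sqrt(log(0.978)), which is where the imaginary unit in the result comes from, since log(0.978) is negative. Although sympify has the option evaluate=False, which prevents things like addition from being carried out, it does not prevent functions with floating-point arguments from being evaluated.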
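
One possible way around the imaginary term, sketched under an assumption: the gplearn documentation describes its built-in sqrt and log as "protected", meaning they are applied to the absolute value of their argument. If that matches the functions used in your run, the dictionary can override sqrt and log accordingly. The name `converter` is just an illustrative replacement for `locals`, and the sketch does not reproduce gplearn's special handling of arguments close to zero in div and log.

```python
from sympy import Abs, Add, Lambda, Mul, log, sqrt, symbols, sympify

x, y = symbols("x y")   # dummy arguments for the Lambda wrappers
X0 = symbols("X0")      # the variable that appears in the gplearn output

converter = {
    "add": Add,
    "mul": Mul,
    "sub": Lambda((x, y), x - y),
    "div": Lambda((x, y), x / y),
    # Assumption: gplearn's sqrt and log act on the absolute value
    # of their argument (near-zero special cases are ignored here).
    "sqrt": Lambda(x, sqrt(Abs(x))),
    "log": Lambda(x, log(Abs(x))),
    "X0": X0,
}

program = ("sqrt(div(add(1.000, sub(div(sqrt(log(0.978)), X0), "
           "mul(-0.993, X0))), add(-0.583, 0.592)))")

expr = sympify(program, locals=converter)
print(expr)

# Evaluate at a sample point; the result is an ordinary real number.
print(float(expr.subs(X0, 2.0)))
```

With these overrides the inner sqrt(log(0.978)) becomes a real number, so the parsed expression can be evaluated numerically without complex values for real X0.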