Tags: python, serialization, sympy, dill, mpmath

How to serialise an mpmath lambdified SymPy function containing Fresnel integrals?


I am using SymPy to generate some functions that require evaluating Fresnel integrals, and I would like to save them to disk. When I sm.lambdify the expressions using the default ["scipy", "numpy"] modules, serialising them with dill.dump() and dill.load() works. However, the expressions contain square roots of the inputs, which can be negative (although the output is always real), and the default modules cannot handle the resulting complex intermediate values.

When I use the "mpmath" module in sm.lambdify instead, the resulting function can be evaluated for positive and negative inputs. However, I am having trouble serialising these versions of the functions with dill or cloudpickle.

Here are examples of what I've tried with a simple function as well as something resembling what I'm actually using.

  1. Simple function (no Fresnel integral):
import sympy as sm
import dill
import cloudpickle

a,b = sm.symbols('a b')
f = sm.integrate(a*b**2, (b,0,1))
F = sm.lambdify((a,b), f, "mpmath")

# Saving with dill fails with below error:
F_dump = dill.dumps(F)
# PicklingError: Can't pickle <class 'mpmath.ctx_iv.ivmpf'>: it's not the same object as mpmath.ctx_iv.ivmpf

# Setting recurse=True results in successful save and load:
F_dump = dill.dumps(F, recurse=True)
F_load = dill.loads(F_dump)

# Cloudpickle also successfully saves and loads by default:
F_dump = cloudpickle.dumps(F)
F_load = cloudpickle.loads(F_dump)

  2. Function with Fresnel integral:
import sympy as sm
import dill
import cloudpickle

a,b = sm.symbols('a b')
f_fresnel = sm.integrate(sm.cos(a*b**2), (b,0,1))
F_fresnel = sm.lambdify((a,b), f_fresnel, "mpmath")

# Fails with same error as before:
F_fresnel_dump = dill.dumps(F_fresnel)
# PicklingError: Can't pickle <class 'mpmath.ctx_iv.ivmpf'>: it's not the same object as mpmath.ctx_iv.ivmpf

# With recurse=True, fails after a while with below error:
F_fresnel_dump = dill.dumps(F_fresnel, recurse=True)
# RecursionError: maximum recursion depth exceeded in comparison

# Cloudpickle appears to save successfully:
F_dump = cloudpickle.dumps(F_fresnel)

# But fails with below error when trying to load:
F_load = cloudpickle.loads(F_dump)
# TypeError: mpq.__new__() missing 1 required positional argument: 'p'

I don't know how relevant the specific type of function is to the problem, but the two examples above suggest it has at least some bearing. Can anyone advise how I can make this work? I'm not sure whether I should be looking at the lambdify stage, the serialisation modules, or something else entirely.


Solution

  • From the comments it seems that you have a possible solution by pickling the expression instead of the function. I just want to make clear that sympy's lambdify function is really just creating a dynamically generated Python function and you can get the code for that function:

    In [2]: import sympy as sm
       ...: 
       ...: a,b = sm.symbols('a b')
       ...: f_fresnel = sm.integrate(sm.cos(a*b**2), (b,0,1))
       ...: F_fresnel = sm.lambdify((a,b), f_fresnel, "mpmath")
    
    In [3]: import inspect
    
    In [4]: print(inspect.getsource(F_fresnel))
    def _lambdifygenerated(a, b):
        return (mpf(1)/mpf(8))*sqrt(2)*sqrt(pi)*fresnelc(sqrt(2)*sqrt(a)/sqrt(pi))*gamma(mpf(1)/mpf(4))/(sqrt(a)*gamma(mpf(5)/mpf(4)))
    

    You can just paste that function into your code directly if you want to be able to reuse it in a future script. It needs an import like from mpmath import sqrt, gamma, mpf, .... That way the future script does not need to import sympy, does not need to use pickle etc. This approach will be both faster and more robust.
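
    As a minimal sketch of what such a standalone file could look like, the block below pairs the generated body with the mpmath names it uses. The import list is read off this one expression and is an assumption; a different expression would need different names. The comparison against numerical quadrature is only a sanity check and not part of the generated code.

```python
# Standalone sketch: only mpmath is required, not sympy or pickle.
# Assumption: this import list matches the names used by this one expression.
from mpmath import fresnelc, gamma, mpf, pi, sqrt
from mpmath import cos, quad  # used only for the sanity check at the bottom


def F_fresnel(a, b):
    # Body copied from the inspect.getsource() output; b has been integrated out.
    return (mpf(1)/mpf(8))*sqrt(2)*sqrt(pi)*fresnelc(sqrt(2)*sqrt(a)/sqrt(pi))*gamma(mpf(1)/mpf(4))/(sqrt(a)*gamma(mpf(5)/mpf(4)))


# mpmath's sqrt returns a complex value for negative input, so a < 0 is accepted.
value_pos = F_fresnel(2, 0)
value_neg = F_fresnel(-2, 0)

# cos is even, so both signs of a should agree with the same integral.
reference = quad(lambda t: cos(2*t**2), [0, 1])
```

    For negative a the result comes back as an mpmath complex number whose imaginary part should vanish up to rounding, so taking its .real attribute may be convenient.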

    The downside of this approach is that copying out the code by hand becomes error-prone if you are dynamically generating many of these functions, but you can automate that:

    In [24]: code = inspect.getsource(sm.lambdify((a, b), f_fresnel, "mpmath"))
    
    In [25]: code = code.replace('_lambdifygenerated', 'F_fresnel')
    
    In [26]: with open('generated.py', 'a') as fout: # using append mode
        ...:     fout.write('\n')
        ...:     fout.write(code)
        ...: 
    
    In [27]: cat generated.py
    
    def F_fresnel(a, b):
        return (mpf(1)/mpf(8))*sqrt(2)*sqrt(pi)*fresnelc(sqrt(2)*sqrt(a)/sqrt(pi))*gamma(mpf(1)/mpf(4))/(sqrt(a)*gamma(mpf(5)/mpf(4)))
    

    Clearly having code that rewrites your code on disk is risky so make sure you are careful if following something like this approach. The other downside is that this won't work if you are trying to pickle a complicated data structure that includes this F_fresnel function as a part of that data structure.
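
    One way to reduce that risk, sketched here under a few assumptions, is to write each generated function to a fresh file together with an import header and then load the file as a module. The file name generated_fresnel.py and the use of a temporary directory are hypothetical choices, and the star import assumes that every name in the generated code is exported by mpmath.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path

import sympy as sm

a, b = sm.symbols("a b")
f_fresnel = sm.integrate(sm.cos(a*b**2), (b, 0, 1))

# Source of the function that lambdify generated for the mpmath backend.
code = inspect.getsource(sm.lambdify((a, b), f_fresnel, "mpmath"))
code = code.replace("_lambdifygenerated", "F_fresnel")

# Assumption: a star import covers the names the generated code uses.
header = "from mpmath import *\n\n\n"

# Write a fresh file instead of appending, so reruns do not pile up definitions.
path = Path(tempfile.mkdtemp()) / "generated_fresnel.py"
path.write_text(header + code)

# Load the file as a module without touching sys.path.
spec = importlib.util.spec_from_file_location("generated_fresnel", path)
generated = importlib.util.module_from_spec(spec)
spec.loader.exec_module(generated)

result = generated.F_fresnel(1, 0)
```

    Only the generation step needs sympy here; a later script could import the written file with nothing but mpmath installed.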

    As for why pickling fails, the real question is why pickling a function fails. After you have called lambdify, what you have is an ordinary function, except that it is dynamically generated. pickle serialises a function by reference: rather than storing the function's code, it stores the module and name needed to import the function again. For example:

    In [28]: import pickle
    
    In [29]: pickle.dumps(sm.simplify)
    Out[29]: b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x17sympy.simplify.simplify\x94\x8c\x08simplify\x94\x93\x94.'
    

    I don't exactly know how to "read" a pickle in general but I can see pretty clearly that this is basically just storing the equivalent of the Python code

    from sympy.simplify.simplify import simplify
    

    That works if you pickle a function that is defined in an installed module, because pickle.loads can just import the module and get the function from there. Unfortunately, this approach does not work for dynamically generated functions like F_fresnel because they are not defined in a module from which they can be imported (unless you copy out the code as I showed above). I don't know how dill or cloudpickle handle this compared to pickle, but it's quite clear to me why pickle can't handle it:

    In [30]: pickle.dumps(F_fresnel)
    ---------------------------------------------------------------------------
    PicklingError                             Traceback (most recent call last)
    Cell In[30], line 1
    ----> 1 pickle.dumps(F_fresnel)
    
    PicklingError: Can't pickle <function _lambdifygenerated at 0x7f30a5db0f40>: attribute lookup _lambdifygenerated on __main__ failed
    
    In [31]: _lambdifygenerated = F_fresnel
    
    In [32]: pickle.dumps(F_fresnel)
    Out[32]: b'\x80\x04\x95#\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x12_lambdifygenerated\x94\x93\x94.'
    

    So pickle wants to store the equivalent of from __main__ import _lambdifygenerated, because _lambdifygenerated is the name of the function in the code created by lambdify, regardless of the name (e.g. F_fresnel) you have assigned it to. That fails because the function is attributed to the __main__ module, i.e. the main script, and there is no object called _lambdifygenerated there unless we assign it to that name. Even with that assignment, the pickle would be of little use in practice: if you tried to load it in a different script (a different __main__), the object to be imported would not exist there.
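
    The by-reference behaviour can be seen without sympy at all. The following sketch uses only the standard library: pickletools.dis prints a readable listing of a pickle, and a function built with exec from a source string (roughly the way lambdify builds its functions) cannot be pickled. The name _generated is made up for the illustration.

```python
import io
import json
import pickle
import pickletools

# A module-level function is pickled by reference: module name plus function name.
data = pickle.dumps(json.dumps)
buffer = io.StringIO()
pickletools.dis(data, out=buffer)  # human-readable opcode listing
listing = buffer.getvalue()

# Build a function from a source string, roughly the way lambdify does.
namespace = {}
exec("def _generated(x):\n    return x + 1", namespace)
dynamic = namespace["_generated"]

try:
    pickle.dumps(dynamic)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    # No importable module exposes a top-level name "_generated".
    pickled_ok = False
```

    Printing listing shows the strings 'json' and 'dumps' followed by a global-lookup opcode, and nothing resembling the function's body.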

    Another solution would just be to wrap up the arguments to lambdify in something that is pickleable and then pickle that:

    class PickleLambdify:
    
        def __init__(self, args, expression, **kwargs):
            self.args = args
            self.expression = expression
            self.kwargs = kwargs
    
        def regenerate(self):
            return sm.lambdify(self.args, self.expression, **self.kwargs)
    

    Then you can do:

    In [35]: F_fresnel_pickle = PickleLambdify((a,b), f_fresnel, modules="mpmath")
    
    In [36]: pickle.dumps(F_fresnel_pickle)
    Out[36]: b'\x80\x04\x95\x0e\x02\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0ePickleLambdify\x94\x93\x94)\x81\x94}\x94(\x8c\x04args\x94\x8c\x11sympy.core.symbol\x94\x8c\x06Symbol\x94\x93\x94\x8c\x01a\x94\x85\x94\x81\x94h\x08\x8c\x01b\x94\x85\x94\x81\x94\x86\x94\x8c\nexpression\x94\x8c\x0esympy.core.mul\x94\x8c\x03Mul\x94\x93\x94(\x8c\x12sympy.core.numbers\x94\x8c\x08Rational\x94\x93\x94K\x01K\x08\x86\x94\x81\x94\x8c\x10sympy.core.power\x94\x8c\x03Pow\x94\x93\x94h\x14\x8c\x07Integer\x94\x93\x94K\x02\x85\x94\x81\x94h\x14\x8c\x04Half\x94\x93\x94)\x81\x94\x86\x94\x81\x94h\x1bh\x14\x8c\x02Pi\x94\x93\x94)\x81\x94h"\x86\x94\x81\x94h\x1bh\x0bh\x16J\xff\xff\xff\xffK\x02\x86\x94\x81\x94\x86\x94\x81\x94h\x1b\x8c\'sympy.functions.special.gamma_functions\x94\x8c\x05gamma\x94\x93\x94h\x16K\x05K\x04\x86\x94\x81\x94\x85\x94\x81\x94h\x14\x8c\x0bNegativeOne\x94\x93\x94)\x81\x94\x86\x94\x81\x94\x8c\'sympy.functions.special.error_functions\x94\x8c\x08fresnelc\x94\x93\x94h\x13h$h\x1bh\'h\x16J\xff\xff\xff\xffK\x02\x86\x94\x81\x94\x86\x94\x81\x94h\x1bh\x0bh"\x86\x94\x81\x94\x87\x94\x81\x94\x85\x94\x81\x94h0h\x16K\x01K\x04\x86\x94\x81\x94\x85\x94\x81\x94t\x94\x81\x94\x8c\x06kwargs\x94}\x94\x8c\x07modules\x94\x8c\x06mpmath\x94sub.'
    
    In [37]: pickle.loads(pickle.dumps(F_fresnel_pickle))
    Out[37]: <__main__.PickleLambdify at 0x7f30a5dbc290>
    
    In [39]: pickle.loads(pickle.dumps(F_fresnel_pickle)).regenerate()
    Out[39]: <function _lambdifygenerated(a, b)>
    

    This way you can use pickle transparently with your function as part of a larger data structure. Clearly, though, unpickling at runtime needs to import sympy, recreate the symbolic expressions and then call lambdify again, which is basically the same as what happens if you pickle the expression yourself in the first place.
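
    For comparison, pickling the expression yourself could look something like the sketch below. The helper names dump_lambdify_spec and load_lambdified are hypothetical, not part of sympy; they just bundle the arguments to lambdify into a tuple of objects that pickle already knows how to handle.

```python
import pickle

import sympy as sm


def dump_lambdify_spec(args, expression, **kwargs):
    # Hypothetical helper: pickle the inputs to lambdify, not the function.
    return pickle.dumps((args, expression, kwargs))


def load_lambdified(payload):
    # Hypothetical helper: unpickle the inputs and call lambdify again.
    args, expression, kwargs = pickle.loads(payload)
    return sm.lambdify(args, expression, **kwargs)


a, b = sm.symbols("a b")
f_fresnel = sm.integrate(sm.cos(a*b**2), (b, 0, 1))

payload = dump_lambdify_spec((a, b), f_fresnel, modules="mpmath")
F_restored = load_lambdified(payload)

# Negative a goes through mpmath's complex square root.
restored_value = F_restored(-1, 0)
direct_value = sm.lambdify((a, b), f_fresnel, modules="mpmath")(-1, 0)
```

    Since only sympy objects and built-in containers are pickled, no class of your own has to be importable when the payload is loaded, although sympy itself still does.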