python query-optimization dill

How do I pickle a "memoized" Python function?


I have the following code:

def f(input,MEM={}):
    if len(MEM) == 0:
        with open('dill.pkl', 'rb') as f:
            MEM = dill.load(f)
    if input not in MEM:
        intended_output = complex_function(input)
        MEM[input] = intended_output
    return MEM[input]

Running long batches of inputs, I have found that my code runs much slower than if I hadn't loaded MEM in the first place, i.e., with

def f2(input,MEM={}):
    if len(MEM) == -1:
        return None
    if input not in MEM:
        intended_output = complex_function(input)
        MEM[input] = intended_output
    return MEM[input]

When I run both f and f2 on four thousand inputs, f takes half an hour to complete, but f2 takes only 40 seconds. Is this because MEM, when loaded with dill, is represented by a structure that is slower to access? I have tried copying and deepcopying MEM, but this only makes the issue worse (especially deepcopy, with which even the smaller inputs take multiple seconds).


Solution

  • MEM = dill.load(...)
    

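    This rebinds the local name MEM to a newly unpickled dictionary, but it does not modify the dictionary object stored as the default argument.

    Therefore the default argument stays an empty dictionary, and the file is unpickled again on every call to the function. Results computed during a call are stored only in the local dictionary and are discarded when the call returns.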

    To actually change the default argument, you could instead update it in place:

    MEM.update(dill.load(...))
    

    See https://docs.python.org/3/library/stdtypes.html#dict.update.
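
The difference between rebinding and in-place updating can be checked with a small sketch. Everything named here is hypothetical and only for illustration: `load_cache` stands in for `dill.load(f)` and counts how often it is called, `str(x)` stands in for `complex_function`, and `f_rebind` / `f_update` mirror the two variants discussed above.

```python
load_calls = 0  # hypothetical counter: how many times the "file" was loaded


def load_cache():
    """Hypothetical stand-in for dill.load(f); returns a small cache."""
    global load_calls
    load_calls += 1
    return {1: "one", 2: "two"}


def f_rebind(x, MEM={}):
    if len(MEM) == 0:
        MEM = load_cache()        # rebinds the local name only
    if x not in MEM:
        MEM[x] = str(x)           # stand-in for complex_function(x)
    return MEM[x]


def f_update(x, MEM={}):
    if len(MEM) == 0:
        MEM.update(load_cache())  # mutates the shared default dictionary
    if x not in MEM:
        MEM[x] = str(x)           # stand-in for complex_function(x)
    return MEM[x]


for i in range(5):
    f_rebind(i)
rebind_loads = load_calls         # one load per call

load_calls = 0
for i in range(5):
    f_update(i)
update_loads = load_calls         # a single load on the first call

print(rebind_loads, update_loads)
print(f_rebind.__defaults__[0])   # default dict of the rebinding variant
print(f_update.__defaults__[0])   # default dict of the updating variant
```

Inspecting `__defaults__` shows which dictionary each function keeps between calls: the rebinding variant never touches its default, so every call starts from an empty cache and reloads.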