I have the following code:
import dill

def f(input, MEM={}):
    if len(MEM) == 0:
        with open('dill.pkl', 'rb') as f:
            MEM = dill.load(f)
    if input not in MEM:
        intended_output = complex_function(input)
        MEM[input] = intended_output
    return MEM[input]
Running long batches of inputs, I have found that my code runs much slower than if I hadn't loaded MEM in the first place, i.e. with:
def f2(input, MEM={}):
    if len(MEM) == -1:
        return None
    if input not in MEM:
        intended_output = complex_function(input)
        MEM[input] = intended_output
    return MEM[input]
When I run both f and f2 over four thousand inputs, f takes half an hour to complete, but f2 takes only 40 seconds. Is this because when I load MEM with dill, it is represented by a structure that is slower to access? I have tried copying and deepcopying MEM, but that only makes the issue worse (especially with deepcopy, which then takes multiple seconds even for the smaller inputs).
MEM = dill.load(...)
This creates a new local variable called MEM; it does not change the default argument MEM. The default argument therefore remains an empty dictionary, so the file is unpickled every time the function is called (and any results memoized during a call live only in that temporary local dictionary, so they are thrown away as well).
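Here is a small standalone demonstration of the difference between rebinding and mutating a mutable default argument (the function and variable names are only for illustration):

def rebind(cache={}):
    # Rebinding the name points it at a brand-new dict;
    # the default dict bound to the parameter is untouched.
    cache = {'loaded': True}
    return cache

def mutate(cache={}):
    # update() modifies the default dict in place,
    # so the change persists across calls.
    cache.update({'loaded': True})
    return cache

rebind()
print(len(rebind.__defaults__[0]))  # 0 -> the default dict is still empty
mutate()
print(len(mutate.__defaults__[0]))  # 1 -> the default dict was changed in place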
To actually change the default argument, you could simply use the following instead:
MEM.update(dill.load(...))
See https://docs.python.org/3/library/stdtypes.html#dict.update.
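Putting it together, here is a minimal sketch of f with that fix applied (assuming, as in the question, that the cache lives in dill.pkl and that complex_function is defined elsewhere; the file handle is renamed fh so it does not shadow the function name):

import dill

def f(input, MEM={}):
    # Populate the default dict in place on the first call, so the
    # loaded cache persists across calls instead of being rebound
    # to a throwaway local variable.
    if len(MEM) == 0:
        with open('dill.pkl', 'rb') as fh:
            MEM.update(dill.load(fh))
    if input not in MEM:
        intended_output = complex_function(input)
        MEM[input] = intended_output
    return MEM[input]

With this version the pickle file is read once per process, and every subsequent call hits the in-memory dictionary.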