Tags: python, memoization, python-decorators, joblib, klepto

Decorators for selective caching / memoization


I am looking for a way to build a decorator @memoize that I can apply to functions as follows:

@memoize
def my_function(a, b, c):
    # Do stuff
    # result may not always be the same for fixed (a, b, c)
    return result

Then, if I do:

result1 = my_function(a=1,b=2,c=3)
# my_function runs (slow). We cache the result for later

result2 = my_function(a=1, b=2, c=3)
# The decorator reads the cache and returns the result (fast)

Now say that I want to force a cache update:

result3 = my_function(a=1, b=2, c=3, force_update=True)
# The function runs *again* for values a, b, and c. 

result4 = my_function(a=1, b=2, c=3)
# We read the cache

At the end of the above, we always have result4 == result3, but not necessarily result4 == result1, which is why one needs an option to force a cache update for the same input parameters.

How can I approach this problem?

Note on joblib

As far as I know joblib supports .call, which forces a re-run, but it does not update the cache.

Follow-up on using klepto:

Is there any way to have klepto (see @Wally's answer) cache its results by default under a specific location (e.g. /some/path/) and share this location across multiple functions? For example, I would like to write

cache_path = "/some/path/"

and then @memoize several functions in a given module under the same path.


Solution

  • I would suggest looking at joblib and klepto. Both have very configurable caching algorithms, and may do what you want.

    Both definitely can do the caching for result1 and result2, and klepto provides access to the cache, so one can pop a result from the local memory cache (without removing it from a stored archive, say in a database).

    >>> import klepto
    >>> from klepto import lru_cache as memoize
    >>> from klepto.keymaps import hashmap
    >>> hasher = hashmap(algorithm='md5')
    >>> @memoize(keymap=hasher)
    ... def squared(x):
    ...   print("called")
    ...   return x**2
    ... 
    >>> squared(1)
    called
    1
    >>> squared(2)
    called
    4
    >>> squared(3)
    called
    9
    >>> squared(2)
    4
    >>> 
    >>> cache = squared.__cache__()
    >>> # delete the 'key' for x=2
    >>> cache.pop(squared.key(2))
    4
    >>> squared(2)
    called
    4
    

    This is not exactly the keyword interface you asked for, but it provides the same functionality: popping a key from the cache forces the next call with those arguments to run the function again.
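
If the keyword interface from the question matters, one option is a small hand-rolled decorator. The following is only a minimal sketch using the standard library, not how joblib or klepto work internally. The `memoize` factory, its `cache_dir` parameter, and the `force_update` keyword are all hypothetical names chosen to match the question. The sketch assumes that arguments and results are picklable, and it treats `my_function(1, 2, 3)` and `my_function(a=1, b=2, c=3)` as different cache keys.

```python
import functools
import hashlib
import pickle
import random
from pathlib import Path


def memoize(cache_dir=None):
    """Hypothetical decorator factory; not part of joblib or klepto."""
    def decorator(func):
        memory = {}  # in-process cache: key -> result
        name = f"{func.__module__}.{func.__qualname__}"

        @functools.wraps(func)
        def wrapper(*args, force_update=False, **kwargs):
            # Build the key from the function name and the pickled arguments,
            # so several functions can share one directory without clashing.
            raw = pickle.dumps((name, args, sorted(kwargs.items())))
            key = hashlib.sha256(raw).hexdigest()
            path = Path(cache_dir) / f"{key}.pkl" if cache_dir else None

            if not force_update:
                if key in memory:
                    return memory[key]
                if path is not None and path.exists():
                    memory[key] = pickle.loads(path.read_bytes())
                    return memory[key]

            # Cache miss or forced refresh: run the function and overwrite
            # whatever was stored for these arguments.
            result = func(*args, **kwargs)
            memory[key] = result
            if path is not None:
                path.parent.mkdir(parents=True, exist_ok=True)
                path.write_bytes(pickle.dumps(result))
            return result

        return wrapper
    return decorator


@memoize()  # in-memory only; pass cache_dir to persist results on disk
def my_function(a, b, c):
    # The result is deliberately not fixed for given (a, b, c).
    return a + b + c + random.random()


result1 = my_function(a=1, b=2, c=3)                     # runs the function
result2 = my_function(a=1, b=2, c=3)                     # read from the cache
result3 = my_function(a=1, b=2, c=3, force_update=True)  # runs again, overwrites
result4 = my_function(a=1, b=2, c=3)                     # read from the cache
```

To share one location across several functions, as in the follow-up question, the same sketch could be used as `@memoize(cache_dir=cache_path)` on each of them. Because the key includes the module and function name, their entries stay separate inside that directory.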