Tags: python, python-3.x, caching, mutable

How to get functools.lru_cache to return new instances?


I use Python's lru_cache on a function which returns a mutable object, like so:

import functools

@functools.lru_cache()
def f():
    x = [0, 1, 2]  # Stand-in for some long computation
    return x

If I call this function, mutate the result and call it again, I do not obtain a "fresh", unmutated object:

a = f()
a.append(3)
b = f()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2, 3]

I get why this happens, but it's not what I want. A fix would be to leave the caller in charge of using list.copy:

a = f().copy()
a.append(3)
b = f().copy()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2]

However I would like to fix this inside f. A pretty solution would be something like

@functools.lru_cache(copy=True)
def f():
    ...

though no copy argument is actually taken by functools.lru_cache.

Any suggestion as to how to best implement this behavior?

Edit

Based on the answer from holdenweb, this is my final implementation. By default it simply returns the standard functools.lru_cache decorator, and it adds the copying behavior (via deepcopy) when copy=True is supplied.

import functools
from copy import deepcopy

def lru_cache(maxsize=128, typed=False, copy=False):
    if not copy:
        return functools.lru_cache(maxsize, typed)
    def decorator(f):
        cached_func = functools.lru_cache(maxsize, typed)(f)
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            return deepcopy(cached_func(*args, **kwargs))
        return wrapper
    return decorator

# Tests below

@lru_cache()
def f():
    x = [0, 1, 2]  # Stand-in for some long computation
    return x

a = f()
a.append(3)
b = f()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2, 3]

@lru_cache(copy=True)
def f():
    x = [0, 1, 2]  # Stand-in for some long computation
    return x

a = f()
a.append(3)
b = f()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2]
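
One caveat with copy=True: the returned wrapper is a plain function, so it does not expose the cache_info() and cache_clear() helpers that a function decorated with functools.lru_cache normally has. The following is a minimal sketch of one way to forward them, assuming it is acceptable to attach them as attributes on the wrapper; the function g is only a hypothetical stand-in for testing.

```python
import functools
from copy import deepcopy


def lru_cache(maxsize=128, typed=False, copy=False):
    if not copy:
        return functools.lru_cache(maxsize, typed)

    def decorator(f):
        cached_func = functools.lru_cache(maxsize, typed)(f)

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            return deepcopy(cached_func(*args, **kwargs))

        # Forward the cache-management helpers of the underlying
        # functools.lru_cache wrapper (an addition for illustration).
        wrapper.cache_info = cached_func.cache_info
        wrapper.cache_clear = cached_func.cache_clear
        return wrapper

    return decorator


@lru_cache(copy=True)
def g():
    return [0, 1, 2]  # Stand-in for some long computation


g().append(3)  # first call: cache miss; the mutation only touches the copy
g()            # second call: cache hit
info = g.cache_info()
print(info.hits, info.misses)   # 1 1
g.cache_clear()
print(g.cache_info().currsize)  # 0
```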

Solution

  • Since lru_cache's behaviour doesn't suit you, the best you can do is build your own decorator that returns a copy of whatever it gets from lru_cache. This means that the first call with a particular set of arguments creates two objects: the one stored in the cache, which now only serves as a prototype, and the copy handed back to the caller.

    This is made more difficult because lru_cache can take arguments (maxsize and typed), so a call to lru_cache returns a decorator. Remembering that a decorator takes a function as its argument and (usually) returns a function, you will have to replace lru_cache with a function that takes those two arguments and returns a decorator, that is, a function that takes a function as an argument and returns a (wrapped) function. This is not an easy structure to wrap your head around.

    You would then write your functions using the copying_lru_cache decorator instead of the standard one, which is now applied "manually" inside the updated decorator.

    Depending on how deep the mutations go, you might get away without using deepcopy (a shallow copy is enough if callers only modify the top-level container), but you don't give enough information to determine that.

    So your code would then read

    from functools import lru_cache, wraps
    from copy import deepcopy
    
    def copying_lru_cache(maxsize=10, typed=False):
        def decorator(f):
            cached_func = lru_cache(maxsize=maxsize, typed=typed)(f)
            @wraps(f)  # keep f's name and docstring on the wrapper
            def wrapper(*args, **kwargs):
                # Return a fresh copy so callers cannot mutate the cached object
                return deepcopy(cached_func(*args, **kwargs))
            return wrapper
        return decorator
    
    @copying_lru_cache()
    def f(arg):
        print(f"Called with {arg}")
        x = [0, 1, arg]  # Stand-in for some long computation
        return x
    
    print(f(1), f(2), f(3), f(1))
    

    This prints

    Called with 1
    Called with 2
    Called with 3
    [0, 1, 1] [0, 1, 2] [0, 1, 3] [0, 1, 1]
    

    so the caching behaviour you require appears to be present. Note also that the documentation for lru_cache specifically warns that

    In general, the LRU cache should only be used when you want to reuse previously computed values. Accordingly, it doesn’t make sense to cache functions with side-effects, functions that need to create distinct mutable objects on each call, or impure functions such as time() or random().
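
To illustrate the earlier remark that deepcopy may not always be needed, here is a minimal sketch of a variant that lets the copy function be chosen. The copier parameter and the functions flat and nested are hypothetical names introduced only for this example; they are not part of functools. It suggests that a shallow copy protects the cache for a flat list, but not for nested containers.

```python
import copy
from functools import lru_cache, wraps


def copying_lru_cache(maxsize=10, typed=False, copier=copy.deepcopy):
    # `copier` is a hypothetical parameter added for illustration.
    def decorator(f):
        cached_func = lru_cache(maxsize=maxsize, typed=typed)(f)

        @wraps(f)
        def wrapper(*args, **kwargs):
            return copier(cached_func(*args, **kwargs))

        return wrapper

    return decorator


@copying_lru_cache(copier=copy.copy)
def flat():
    return [0, 1, 2]


@copying_lru_cache(copier=copy.copy)
def nested():
    return [[0, 1], [2]]


a = flat()
a.append(3)          # only the shallow copy is modified
print(flat())        # [0, 1, 2]

n = nested()
n[0].append(99)      # the inner list is shared with the cached object
print(nested())      # [[0, 1, 99], [2]]
```

With the default copier=copy.deepcopy, the second case would also return the unmodified [[0, 1], [2]], at the cost of copying the whole structure on every call.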