Storing cached values of a function as an attribute of the function in Python

I would like to have @cached decorator akin @memoized that stores cached values of a function as an attribute of the function. Something like this

def cached(fcn):
    def cached_fcn(*args,**kwargs):
        call_signature=",".join([repr(a) for a in args] +
                                [repr(kwa[0])+"="+repr(kwa[1])
                                 for kwa in sorted(kwargs.items()) ])
        if call_signature not in cached_fcn.cache:
            cached_fcn.cache[call_signature] = fcn(*args,**kwargs)
        return copy.deepcopy(cached_fcn.cache[call_signature])
    cached_fcn.__name__ = fcn.__name__
    cached_fcn.__doc__ = fcn.__doc__
    cached_fcn.__annotations__ = fcn.__annotations__    
    cached_fcn.cache = dict()
    return cached_fcn

@cached
def fib(n):
    if n in (0,1): return 1
    return fin(n-1) + fib(n-2)

Assuming that the function does not access anything global, is it safe to do that? What if threading is used?

Solution

There is one pitfall that may be relevant to your implementation. Observe

def pf(*args, **kwargs):
    print(args)
    print(kwargs)

and call this with

pf(1, k="a")
pf(1, "a")
pf(k="a", x=1)

All argument specs are valid specs for a function with signature f(x, k) (with or without defaults) - so you can't really know the order of the arguments, their names, and sorting on kwargs is definitely not enough in a general case (empty in the second example, while args is empty in the last with order reversed). Defaults make this worse as if f(x, k=3) is the definition, then f(2, 3) and f(2) and f(x=2) f(2, k=3) and f(x=2, k=3) (also reversed) are the same, with differing kwargs and args passed to the wrapper.

A more robust solution will use inspect.getargspec(your_function). This uses reflection to know the actual argument names of the function as they were defined. You then have to "fill in" the arguments your are given in *args and **kwargs, and use that to generate your call signature:

import inspect
def f(x, k=3): pass
argspec = inspect.getargspec(f) # returns ArgSpec(args=['x', 'k'], varargs=None, keywords=None, defaults=(3,))

Now you can generate a call signature (from *args and **kwargs):

signature = {}
for arg, default in zip(reversed(argspec.args), reversed(argspec.defaults)):
    signature[arg] = default

set_args = set()
for arg, val in zip(argspec.args, args):
    set_args.add(arg)
    signature[arg] = val

for arg, val in kwargs.items():
    # if arg in set_args:
    #    raise TypeError(f'{arg} set both in kwargs and in args!')
    # if arg not in argspec.args:
    #    raise TypeError(f'{arg} is not a valid argument for function!')
    signature[arg] = val

# if len(signature) == len(argspec.args):
#     raise TypeError(f'Received {len(signature)} arguments but expected {len(argspec.args)} arguments!')

Then you can use the dictionary signature itself as the call signature. I showed some "correctness" checks above though you may just want to let the call itself to the function detect and fail. I did not handle functions with **kwargs and *args (the actual used names are given in argspec). I think they may just involve having args and kwargs keys in signature. I am still not sure how robust the above is.

Even better, use the builtin functools.lru_cache which does what you want.

Regarding threading, you have the same dangers as anytime multiple threads access the same array. There is nothing special about function attributes. lru_cache should be safe (there was one bug that was resolved) with the one caveat:

To help measure the effectiveness of the cache and tune the maxsize parameter, the wrapped function is instrumented with a cache_info() function that returns a named tuple showing hits, misses, maxsize and currsize. In a multi-threaded environment, the hits and misses are approximate