Search code examples
pythonpython-multithreading

Storing cached values of a function as an attribute of the function in Python


I would like to have @cached decorator akin @memoized that stores cached values of a function as an attribute of the function. Something like this

def cached(fcn):
    def cached_fcn(*args,**kwargs):
        call_signature=",".join([repr(a) for a in args] +
                                [repr(kwa[0])+"="+repr(kwa[1])
                                 for kwa in sorted(kwargs.items()) ])
        if call_signature not in cached_fcn.cache:
            cached_fcn.cache[call_signature] = fcn(*args,**kwargs)
        return copy.deepcopy(cached_fcn.cache[call_signature])
    cached_fcn.__name__ = fcn.__name__
    cached_fcn.__doc__ = fcn.__doc__
    cached_fcn.__annotations__ = fcn.__annotations__    
    cached_fcn.cache = dict()
    return cached_fcn

@cached
def fib(n):
    if n in (0,1): return 1
    return fin(n-1) + fib(n-2)

Assuming that the function does not access anything global, is it safe to do that? What if threading is used?


Solution

  • There is one pitfall that may be relevant to your implementation. Observe

    def pf(*args, **kwargs):
        print(args)
        print(kwargs)
    

    and call this with

    pf(1, k="a")
    pf(1, "a")
    pf(k="a", x=1)
    

    All argument specs are valid specs for a function with signature f(x, k) (with or without defaults) - so you can't really know the order of the arguments, their names, and sorting on kwargs is definitely not enough in a general case (empty in the second example, while args is empty in the last with order reversed). Defaults make this worse as if f(x, k=3) is the definition, then f(2, 3) and f(2) and f(x=2) f(2, k=3) and f(x=2, k=3) (also reversed) are the same, with differing kwargs and args passed to the wrapper.

    A more robust solution will use inspect.getargspec(your_function). This uses reflection to know the actual argument names of the function as they were defined. You then have to "fill in" the arguments your are given in *args and **kwargs, and use that to generate your call signature:

    import inspect
    def f(x, k=3): pass
    argspec = inspect.getargspec(f) # returns ArgSpec(args=['x', 'k'], varargs=None, keywords=None, defaults=(3,)) 
    

    Now you can generate a call signature (from *args and **kwargs):

    signature = {}
    for arg, default in zip(reversed(argspec.args), reversed(argspec.defaults)):
        signature[arg] = default
    
    set_args = set()
    for arg, val in zip(argspec.args, args):
        set_args.add(arg)
        signature[arg] = val
    
    for arg, val in kwargs.items():
        # if arg in set_args:
        #    raise TypeError(f'{arg} set both in kwargs and in args!')
        # if arg not in argspec.args:
        #    raise TypeError(f'{arg} is not a valid argument for function!')
        signature[arg] = val
    
    # if len(signature) == len(argspec.args):
    #     raise TypeError(f'Received {len(signature)} arguments but expected {len(argspec.args)} arguments!')
    

    Then you can use the dictionary signature itself as the call signature. I showed some "correctness" checks above though you may just want to let the call itself to the function detect and fail. I did not handle functions with **kwargs and *args (the actual used names are given in argspec). I think they may just involve having args and kwargs keys in signature. I am still not sure how robust the above is.

    Even better, use the builtin functools.lru_cache which does what you want.

    Regarding threading, you have the same dangers as anytime multiple threads access the same array. There is nothing special about function attributes. lru_cache should be safe (there was one bug that was resolved) with the one caveat:

    To help measure the effectiveness of the cache and tune the maxsize parameter, the wrapped function is instrumented with a cache_info() function that returns a named tuple showing hits, misses, maxsize and currsize. In a multi-threaded environment, the hits and misses are approximate