python, python-3.x, caching, lru

How to cache result of function depending on the value returned


I am aware of functools.lru_cache and functools.cache (since Python 3.9), but I struggle to cache only those calls of a function that do not return None (or any other specific value):

from functools import lru_cache

@lru_cache
def my_fun(link):
    res = fetch_data(link)
    return res

res is None when fetch_data encounters an intermittent error, and in that case I do not want the result to be cached.


Solution

  • Cache for functions with positional arguments

    I figured that I can implement the cache on my own using a dictionary and store the result only when the return value is not None:

    from functools import wraps
    
    def my_cache(no_cache_result=tuple()):
        if no_cache_result is None:
            no_cache_result = tuple()
    
        cache = dict()
    
        def decorator(fun):
            @wraps(fun)
            def wrapper(*args, **kwargs):
                assert not kwargs, 'this version supports positional arguments only'
                if args in cache:
                    print('cache: taken from cache')
                    return cache[args]
                else:
                    res = fun(*args, **kwargs)
                    if res not in no_cache_result:
                        print('cache: stored in cache')
                        cache[args] = res
                    else:
                        print('cache: NOT stored')
                    return res
    
            return wrapper
        return decorator
    
    
    @my_cache(no_cache_result=[None])
    def my_fun(a):
        print(f'my_fun: called with {a}')
        if a <= 1:
            return a
        else:
            return None
    
    
    my_fun(0)
    my_fun(1)
    my_fun(2)
    
    my_fun(0)
    my_fun(1)
    my_fun(2)
    

    Which prints (as expected):

    my_fun: called with 0
    cache: stored in cache
    my_fun: called with 1
    cache: stored in cache
    my_fun: called with 2
    cache: NOT stored
    cache: taken from cache
    cache: taken from cache
    my_fun: called with 2
    cache: NOT stored
    

    Cache for functions with positional and keyword arguments

    The solution above only works for functions that are called with positional arguments; keyword arguments are rejected.

    At the expense of a small slowdown, it can be extended to keyword arguments in the following way:

    from functools import wraps
    
    
    def my_cache(no_cache_result=tuple()):
        if no_cache_result is None:
            no_cache_result = tuple()
    
        cache = dict()
    
        def decorator(fun):
            @wraps(fun)
            def wrapper(*args, **kwargs):
                # sort by name so that f(a=1, b=2) and f(b=2, a=1) share one key
                _kwargs = tuple(sorted(kwargs.items()))
                if (args, _kwargs) in cache:
                    print('cache: taken from cache')
                    return cache[(args, _kwargs)]
                else:
                    res = fun(*args, **kwargs)
                    if res not in no_cache_result:
                        print('cache: stored in cache')
                        cache[(args, _kwargs)] = res
                    else:
                        print('cache: NOT stored')
                    return res
    
            return wrapper
        return decorator
    

    Which works as expected:

    @my_cache(no_cache_result=[None, ])
    def my_fun2(a, b=7):
        print(f'my_fun2: called with {a}, {b}')
        if a <= 1:
            return a
        else:
            return None
    
    
    my_fun2(0, b=2)
    my_fun2(1)
    my_fun2(2)
    
    my_fun2(0, b=2)
    my_fun2(1)
    my_fun2(2)
    

    Printing:

    my_fun2: called with 0, 2
    cache: stored in cache
    my_fun2: called with 1, 7
    cache: stored in cache
    my_fun2: called with 2, 7
    cache: NOT stored
    cache: taken from cache
    cache: taken from cache
    my_fun2: called with 2, 7
    cache: NOT stored
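
    One caveat of keying on the raw arguments is that my_fun2(1), my_fun2(1, b=7) and my_fun2(a=1) produce three different keys although they are the same call. A possible way around this, sketched below under the assumption that every argument value is hashable, is to bind the call to the function's signature with inspect.signature and build the key from the bound values. The helper name make_key is hypothetical and not part of the solution above.

```python
import inspect


def make_key(fun, args, kwargs):
    """Build a cache key that is identical for equivalent calls.

    Hypothetical helper: binds the call to fun's signature, fills in
    the defaults, and returns (name, value) pairs in parameter order.
    Assumes every argument value is hashable.
    """
    bound = inspect.signature(fun).bind(*args, **kwargs)
    bound.apply_defaults()
    return tuple(bound.arguments.items())


def my_fun2(a, b=7):
    return a


# All three spellings of the same call map to one key.
key = make_key(my_fun2, (1,), {})
print(key == make_key(my_fun2, (1, 7), {}))              # True
print(key == make_key(my_fun2, (), {'b': 7, 'a': 1}))    # True
```

    Looking up the signature on every call adds overhead, so in a real decorator it would probably be computed once, when the function is decorated.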
    

    How does it work?

    Decorators

    The implementation of decorators, including decorators that take arguments, is discussed at length in the answers to Decorators with parameters?.
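
    As a rough illustration of the three nesting levels that my_cache uses, here is a minimal sketch of a decorator with a parameter. The names prefix_result and add are made up for this example.

```python
from functools import wraps


def prefix_result(prefix):
    # Level 1, the factory: receives the decorator's own arguments.
    def decorator(fun):
        # Level 2, the decorator: receives the function to wrap.
        @wraps(fun)  # keep fun's name and docstring on the wrapper
        def wrapper(*args, **kwargs):
            # Level 3, the wrapper: runs on every call of the function.
            return f'{prefix}{fun(*args, **kwargs)}'
        return wrapper
    return decorator


@prefix_result('result: ')
def add(a, b):
    return a + b


print(add(1, 2))     # result: 3
print(add.__name__)  # add
```

    In my_cache the cache dictionary is created at the first level, so it lives as long as the decorated function does.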

    Forbidden return values - performance

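    The performance of the cache depends on the type of the no_cache_result argument. If you want to exclude more than a few return values from caching, pass a set instead of the more commonly used list, because the membership test res in no_cache_result takes constant time on average for a set but linear time for a list. Note that a set only works when the return values are hashable: testing an unhashable value such as a list against a set raises TypeError.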
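
    If the decorated function may return unhashable values, the membership test could be wrapped as in the sketch below. The helper is_forbidden is hypothetical; it tries the fast test first and falls back to a linear comparison only when hashing fails.

```python
def is_forbidden(res, no_cache_result):
    """Return True if res must not be cached.

    Hypothetical helper: falls back to a linear scan when res is
    unhashable and no_cache_result is a set or frozenset.
    """
    try:
        return res in no_cache_result
    except TypeError:
        # e.g. `[1, 2] in {None}` raises because a list cannot be hashed
        return any(res == item for item in no_cache_result)


forbidden = frozenset({None, -1})
print(is_forbidden(None, forbidden))    # True
print(is_forbidden([1, 2], forbidden))  # False
print(is_forbidden(0, forbidden))       # False
```

    With such a helper, the line if res not in no_cache_result in the wrapper would become if not is_forbidden(res, no_cache_result).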