python python-3.x numpy caching decorator

Cache decorator for numpy arrays


I am trying to make a cache decorator for functions that take numpy arrays as input parameters:

from functools import lru_cache
import numpy as np
from time import sleep

a = np.array([1,2,3,4])

@lru_cache()
def square(array):
    sleep(1)
    return array * array

square(a)

But numpy arrays are not hashable:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-559f69d0dec3> in <module>()
----> 1 square(a)

TypeError: unhashable type: 'numpy.ndarray'

So they need to be converted to tuples. I have this working and caching correctly:

@lru_cache()
def square(array_hashable):
    sleep(1)
    array = np.array(array_hashable)
    return array * array

square(tuple(a))

But I wanted to wrap it all up in a decorator. So far I have tried:

def np_cache(function):
    def outter(array):
        array_hashable = tuple(array)

        @lru_cache()
        def inner(array_hashable_inner):
            array_inner = np.array(array_hashable_inner)
            return function(array_inner)

        return inner(array_hashable)

    return outter

@np_cache
def square(array):
    sleep(1)
    return array * array

But caching is not working. The computation is performed, but the result is never reused: every call still makes me wait 1 second.

What am I missing here? I'm guessing lru_cache isn't getting the context right and it's being instantiated on each call, but I don't know how to fix it.

I have tried blindly throwing the functools.wraps decorator here and there with no luck.


Solution

  • Your wrapper function creates a new inner() function each time you call it, and that new function object is decorated at that moment. As a result, every call to outter() creates a new lru_cache(), which starts out empty, and an empty cache always has to recalculate the value.
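
    To see this directly, here is a minimal sketch that reuses the structure of the decorator from the question but records the statistics of each inner cache. The names np_cache_broken, seen and double are hypothetical, added only for illustration.

```python
from functools import lru_cache

import numpy as np

seen = []  # hypothetical: collects cache statistics from every call


def np_cache_broken(function):
    def outer(array):
        # A new function, and therefore a new empty cache, on every call.
        @lru_cache()
        def inner(array_hashable):
            return function(np.array(array_hashable))

        result = inner(tuple(array))
        seen.append(inner.cache_info())  # statistics of this call's private cache
        return result

    return outer


@np_cache_broken
def double(array):
    return array * 2


a = np.array([1, 2, 3, 4])
double(a)
double(a)

# Each call reports zero hits and one miss, because each call got its own cache.
print([(info.hits, info.misses) for info in seen])  # [(0, 1), (0, 1)]
```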

    You need to create a decorator that attaches the cache to a function created just once per decorated target. If you are going to convert to a tuple before calling the cache, then you'll have to create two functions:

    from functools import lru_cache, wraps

    import numpy as np
    
    def np_cache(function):
        @lru_cache()
        def cached_wrapper(hashable_array):
            array = np.array(hashable_array)
            return function(array)
    
        @wraps(function)
        def wrapper(array):
            return cached_wrapper(tuple(array))
    
        # copy lru_cache attributes over too
        wrapper.cache_info = cached_wrapper.cache_info
        wrapper.cache_clear = cached_wrapper.cache_clear
    
        return wrapper
    

    The cached_wrapper() function is created just once per call to np_cache() and is available to the wrapper() function as a closure. So wrapper() calls cached_wrapper(), which has a @lru_cache() attached to it, caching your tuples.

    I also copied across the two function references that lru_cache puts on a decorated function, so they are accessible via the returned wrapper as well.
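
    A short usage sketch, assuming the np_cache decorator above, shows the cache being hit. The calls list is a hypothetical counter that stands in for the sleep(1) in the question so the effect can be checked without waiting.

```python
from functools import lru_cache, wraps

import numpy as np


def np_cache(function):
    @lru_cache()
    def cached_wrapper(hashable_array):
        return function(np.array(hashable_array))

    @wraps(function)
    def wrapper(array):
        return cached_wrapper(tuple(array))

    # copy lru_cache attributes over too
    wrapper.cache_info = cached_wrapper.cache_info
    wrapper.cache_clear = cached_wrapper.cache_clear
    return wrapper


calls = []  # hypothetical counter: one entry per real computation


@np_cache
def square(array):
    calls.append(1)
    return array * array


a = np.array([1, 2, 3, 4])
first = square(a)
second = square(a)                      # served from the cache
third = square(np.array([1, 2, 3, 4]))  # equal contents, different object: also a hit

print(len(calls))              # 1
info = square.cache_info()
print(info.hits, info.misses)  # 2 1
```

    Two caveats with this approach: tuple(array) only yields a hashable key for 1-D arrays (for a 2-D array it produces a tuple of arrays and lru_cache raises TypeError), and a cache hit returns the very same array object as before, so modifying a result in place also changes what later calls get back.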

    I also used the @functools.wraps() decorator to copy metadata from the original function object to the wrapper, such as the name, annotations and documentation string. This is always a good idea, because it means your decorated function is clearly identified in tracebacks and while debugging, and its documentation and annotations stay accessible. The decorator also adds a __wrapped__ attribute pointing back to the original function, which lets you unwrap the decorator again if need be.
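
    The effect of @wraps can be checked with a small sketch like the one below, again assuming the np_cache decorator from above. The docstring on square is a hypothetical addition for illustration.

```python
from functools import lru_cache, wraps

import numpy as np


def np_cache(function):
    @lru_cache()
    def cached_wrapper(hashable_array):
        return function(np.array(hashable_array))

    @wraps(function)
    def wrapper(array):
        return cached_wrapper(tuple(array))

    wrapper.cache_info = cached_wrapper.cache_info
    wrapper.cache_clear = cached_wrapper.cache_clear
    return wrapper


@np_cache
def square(array):
    """Return the element-wise square of a 1-D array."""
    return array * array


# Metadata of the original function is visible on the wrapper.
print(square.__name__)  # square
print(square.__doc__)   # Return the element-wise square of a 1-D array.

# __wrapped__ is the undecorated function, so calling it bypasses the cache.
original = square.__wrapped__
result = original(np.array([1, 2, 3]))
print(square.cache_info().currsize)  # 0
```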