python python-3.x numpy caching decorator

Cache decorator for numpy arrays

I am trying to make a cache decorator for functions with numpy array input parameters

from functools import lru_cache
import numpy as np
from time import sleep

a = np.array([1,2,3,4])

@lru_cache()
def square(array):
    sleep(1)
    return array * array

square(a)

But numpy arrays are not hashable,

TypeError                                 Traceback (most recent call last)
<ipython-input-13-559f69d0dec3> in <module>()
----> 1 square(a)

TypeError: unhashable type: 'numpy.ndarray'

So they need to be converted to tuples. I have this working and caching correctly:

@lru_cache()
def square(array_hashable):
    sleep(1)
    array = np.array(array_hashable)
    return array * array

square(tuple(a))

But I wanted to wrap it all up in a decorator, so far I have tried:

def np_cache(function):
    def outter(array):
        array_hashable = tuple(array)

        @lru_cache()
        def inner(array_hashable_inner):
            array_inner = np.array(array_hashable_inner)
            return function(array_inner)

        return inner(array_hashable)

    return outter

@np_cache
def square(array):
    sleep(1)
    return array * array

But caching is not working. Computation is performed but not cached properly, as I am always waiting 1 second.

What am I missing here? I'm guessing lru_cache isn't getting the context right and its being instantiated in each call, but I don't know how to fix it.

I have tried blindly throwing the functools.wraps decorator here and there with no luck.

Solution

Your wrapper function creates a new inner() function each time you call it. And that new function object is decorated at that time, so the end result is that each time outter() is called, a new lru_cache() is created and that'll be empty. An empty cache will always have to re-calculate the value.

You need to create a decorator that attaches the cache to a function created just once per decorated target. If you are going to convert to a tuple before calling the cache, then you'll have to create two functions:

from functools import lru_cache, wraps

def np_cache(function):
    @lru_cache()
    def cached_wrapper(hashable_array):
        array = np.array(hashable_array)
        return function(array)

    @wraps(function)
    def wrapper(array):
        return cached_wrapper(tuple(array))

    # copy lru_cache attributes over too
    wrapper.cache_info = cached_wrapper.cache_info
    wrapper.cache_clear = cached_wrapper.cache_clear

    return wrapper

The cached_wrapper() function is created just once per call to np_cache() and is available to the wrapper() function as a closure. So wrapper() calls cached_wrapper(), which has a @lru_cache() attached to it, caching your tuples.

I also copied across the two function references that lru_cache puts on a decorated function, so they are accessible via the returned wrapper as well.

In addition, I also used the @functools.wraps() decorator to copy across metadata from the original function object to the wrapper, such as the name, annotations and documentation string. This is always a good idea, because that means your decorated function will be clearly identified in tracebacks, when debugging and when you need to access documentation or annotations. The decorator also adds a __wrapped__ attribute pointing back to the original function, which would let you unwrap the decorator again if need be.