
Memoization: set the size of the consumed cache


I am using memoization to speed up calls to a complex function complexfunct(). This function takes as input a numpy.array of varying dimension (it can hold from 5 to 15 values), and each value belongs to a set of 5 possible values. The number of allowed inputs to complexfunct() is therefore very large, so it is not possible to memoize all of them, and when I run my Jupyter notebook it crashes.

The memoization function I'm using is this one:

def memoize(func):
    """Store the results of the decorated function for fast lookup
    """
    # Store results in a dict that maps arguments to results
    cache = {}
    def wrapper(*args, **kwargs):
        key = str(args) + str(kwargs)
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]
    return wrapper

My question is: can I set the size of the cache, so that when it is full and a new input has to be stored, it replaces the oldest entry - or better, the least recently used one?

Thank you all in advance.


Solution

  • if it is saturated and a new input has to be store in the cache, then it will replace the first entry - or better, the least recently used.

    Since you care about insertion order when deciding what to evict, I suggest using collections.OrderedDict in place of dict, i.e. add import collections and replace

    cache = {}
    

    with

    cache = collections.OrderedDict()
    

    then, after each insertion, check the size and, if it is beyond the limit, do:

    cache.popitem(last=False)
    

    to evict the oldest entry.
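Putting the pieces together, a bounded version of the question's decorator might look like the sketch below. The maxsize parameter is an arbitrary addition for illustration, and the move_to_end call upgrades the plain oldest-first eviction to true least-recently-used behaviour:

```python
import collections

def memoize(func, maxsize=128):
    """Memoize func, evicting the least recently used entry when full."""
    cache = collections.OrderedDict()
    def wrapper(*args, **kwargs):
        key = str(args) + str(kwargs)
        if key in cache:
            cache.move_to_end(key)        # mark this entry as most recently used
            return cache[key]
        result = cache[key] = func(*args, **kwargs)
        if len(cache) > maxsize:
            cache.popitem(last=False)     # evict the least recently used entry
        return result
    return wrapper
```

As a side note, the standard library's functools.lru_cache implements the same idea, but it requires hashable arguments, which numpy arrays are not - hence the string-based key here.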