Search code examples
pythonmemoization

What can be done to speed up this memoization decorator?


What I want is a memoization decorator that:

I've tweaked an example I saw and came up with the following:

import functools

class Memoized(object):
  """Decorator that caches a function's return value each time it is called.
  If called later with the same arguments, the cached value is returned, and
  not re-evaluated.
  """

  __cache = {}

  def __init__(self, func):
    self.func = func
    self.key = (func.__module__, func.__name__)

  def __call__(self, *args):
    try:
      return Memoized.__cache[self.key][args]
    except KeyError:
      value = self.func(*args)
      Memoized.__cache[self.key] = {args : value}
      return value
    except TypeError:
      # uncachable -- for instance, passing a list as an argument.
      # Better to not cache than to blow up entirely.
      return self.func(*args)

  def __get__(self, obj, objtype):
    """Support instance methods."""
    return functools.partial(self.__call__, obj)

  @staticmethod
  def reset():
    Memoized.__cache = {}

My problem with it is that the caching part seems to involve a lot of overhead (eg. for recursive functions). Using the following function as an example, I can call fib(30) ten times with the non-memoized version in less time than the memoized version.

def fib(n):

   if n in (0, 1):
      return n
   return fib(n-1) + fib(n-2)

Can anyone suggest a better way to write this decorator? (or point me to a better (ie. faster) decorator that does what I want). I'm not interested in preserving method signatures, or helping introspection tools "know" anything about the decorated function.

Thanks.

P.S. Using python 2.7


Solution

  • You're not actually caching any data, because each time you set a new cached value you overwrite the previous:

    Memoized.__cache[self.key] = {args : value}
    

    eg.

    import functools
    
    class Memoized(object):
        """Decorator that caches a function's return value each time it is called.
        If called later with the same arguments, the cached value is returned, and
        not re-evaluated.
        """
    
        cache = {}
    
        def __init__(self, func):
            self.func = func
            self.key = (func.__module__, func.__name__)
            self.cache[self.key] = {}
    
        def __call__(self, *args):
          try:
              return Memoized.cache[self.key][args]
          except KeyError:
              value = self.func(*args)
              Memoized.cache[self.key][args] = value
              return value
          except TypeError:
              # uncachable -- for instance, passing a list as an argument.
              # Better to not cache than to blow up entirely.
              return self.func(*args)
    
        def __get__(self, obj, objtype):
            """Support instance methods."""
            return functools.partial(self.__call__, obj)
    
        @staticmethod
        def reset():
            Memoized.cache = {}
    
    • fib(30) without caching: 2.86742401123
    • fib(30) with caching: 0.000198125839233

    Some other notes:

    • Don't use __prefix; there's no reason to here and it only uglifies the code.
    • Instead of using a single, monolithic, class-attribute dict, give each instance of Memoized its own dict, and keep a registry of Memoized objects. This improves encapsulation, and removes the oddity of depending on the module and function names.

    .

    import functools
    import weakref
    
    class Memoized(object):
        """Decorator that caches a function's return value each time it is called.
        If called later with the same arguments, the cached value is returned, and
        not re-evaluated.
    
        >>> counter = 0
        >>> @Memoized
        ... def count():
        ...     global counter
        ...     counter += 1
        ...     return counter
    
        >>> counter = 0
        >>> Memoized.reset()
        >>> count()
        1
        >>> count()
        1
        >>> Memoized.reset()
        >>> count()
        2
    
        >>> class test(object):
        ...     @Memoized
        ...     def func(self):
        ...         global counter
        ...         counter += 1
        ...         return counter
        >>> testobject = test()
        >>> counter = 0
        >>> testobject.func()
        1
        >>> testobject.func()
        1
        >>> Memoized.reset()
        >>> testobject.func()
        2
        """
    
        caches = weakref.WeakSet()
    
        def __init__(self, func):
            self.func = func
            self.cache = {}
            Memoized.caches.add(self)
    
        def __call__(self, *args):
          try:
              return self.cache[args]
          except KeyError:
              value = self.func(*args)
              self.cache[args] = value
              return value
          except TypeError:
              # uncachable -- for instance, passing a list as an argument.
              # Better to not cache than to blow up entirely.
              return self.func(*args)
    
        def __get__(self, obj, objtype):
            """Support instance methods."""
            return functools.partial(self.__call__, obj)
    
        @staticmethod
        def reset():
            for memo in Memoized.caches:
                memo.cache = {}
    
    if __name__ == '__main__':
        import doctest
        doctest.testmod()
    

    Edited: add tests, and use weakref.WeakSet. Note that WeakSet is only available in 2.7 (which the OP is using); for a version that works in 2.6, see the edit history.