So I recently asked a question about memoization and got some great answers, and now I want to take it to the next level. After quite a bit of googling, I could not find a reference implementation of a memoize decorator that was able to cache a function that took keyword arguments. In fact, most of them simply used *args
as the key for cache lookups, meaning that it would also break if you wanted to memoize a function that accepted lists or dicts as arguments.
In my case, the first argument to the function is a unique identifier by itself, suitable for use as a dict key for cache lookups, however I wanted the ability to use keyword arguments and still access the same cache. What I mean is, my_func('unique_id', 10)
and my_func(foo=10, func_id='unique_id')
should both return the same cached result.
In order to do this, what we need is a clean and pythonic way of saying 'inspect kwargs for whichever keyword it is that corresponds to the first argument)'. This is what I came up with:
class memoize(object):
def __init__(self, cls):
if type(cls) is FunctionType:
# Let's just pretend that the function you gave us is a class.
cls.instances = {}
cls.__init__ = cls
self.cls = cls
self.__dict__.update(cls.__dict__)
def __call__(self, *args, **kwargs):
"""Return a cached instance of the appropriate class if it exists."""
# This is some dark magic we're using here, but it's how we discover
# that the first argument to Photograph.__init__ is 'filename', but the
# first argument to Camera.__init__ is 'camera_id' in a general way.
delta = 2 if type(self.cls) is FunctionType else 1
first_keyword_arg = [k
for k, v in inspect.getcallargs(
self.cls.__init__,
'self',
'first argument',
*['subsequent args'] * (len(args) + len(kwargs) - delta)).items()
if v == 'first argument'][0]
key = kwargs.get(first_keyword_arg) or args[0]
print key
if key not in self.cls.instances:
self.cls.instances[key] = self.cls(*args, **kwargs)
return self.cls.instances[key]
The crazy thing is that this actually works. Eg, if you decorate like this:
@memoize
class FooBar:
instances = {}
def __init__(self, unique_id, irrelevant=None):
print id(self)
Then from your code you can call either FooBar('12345', 20)
or FooBar(irrelevant=20, unique_id='12345')
and actually get the same instance of FooBar. You can then define a different class with a different name for the first argument, because it works in a general way (ie, the decorator doesn't need to know anything specfic about the class it's decorating in order for this to work).
The problem is, it's an ungodly mess ;-)
It works because inspect.getcallargs
returns a dict mapping the defined keywords to the arguments you supply it, so I supply it some phony arguments and then inspect the dict for the first argument that got passed.
What would be much nicer, if such a thing were to even exist, is an analogue to inspect.getcallargs
that returned both kinds of arguments unified as a list of the arguments instead of as a dict of the keyword arguments. That would allow something like this:
def __call__(self, *args, **kwargs):
key = inspect.getcallargsaslist(self.cls.__init__, None, *args, **kwargs)[1]
if key not in self.cls.instances:
self.cls.instances[key] = self.cls(*args, **kwargs)
return self.cls.instances[key]
The other way I can see of tackling this would be using the dict provided by inspect.getcallargs
as the lookup cache key directly, but that would require a repeatable way to make identical strings from identical hashes, which is something I've heard can't be relied upon (I guess i'd have to construct the string myself after sorting the keys).
Does anybody have any thoughts on this? Is it Wrong to want to call a function with keyword arguments and cache the results? Or just Very Difficult?
I'd suggest something like the following:
import inspect
class key_memoized(object):
def __init__(self, func):
self.func = func
self.cache = {}
def __call__(self, *args, **kwargs):
key = self.key(args, kwargs)
if key not in self.cache:
self.cache[key] = self.func(*args, **kwargs)
return self.cache[key]
def normalize_args(self, args, kwargs):
spec = inspect.getargs(self.func.__code__).args
return dict(kwargs.items() + zip(spec, args))
def key(self, args, kwargs):
a = self.normalize_args(args, kwargs)
return tuple(sorted(a.items()))
Example:
@key_memoized
def foo(bar, baz, spam):
print 'calling foo: bar=%r baz=%r spam=%r' % (bar, baz, spam)
return bar + baz + spam
print foo(1, 2, 3)
print foo(1, 2, spam=3) #memoized
print foo(spam=3, baz=2, bar=1) #memoized
Note that you can also extend key_memoized
and override its key()
method to provide more specific memoization strategies, e.g. to ignore some of the arguments:
class memoize_by_bar(key_memoized):
def key(self, args, kwargs):
return self.normalize_args(args, kwargs)['bar']
@memoize_by_bar
def foo(bar, baz, spam):
print 'calling foo: bar=%r baz=%r spam=%r' % (bar, baz, spam)
return bar
print foo('x', 'ignore1', 'ignore2')
print foo('x', 'ignore3', 'ignore4')