I'm writing a class that parses HTML in order to provide an interface to a profile on a webpage. It looks something like this:
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
def username(self):
return self.title.split(':')[0]
Except more complex and time consuming. Since I know that the underlying profiles aren't going to be changing during the lifetime of a Profile
object, I thought this would be a good place to cache results in order to avoid recalculating values that are already known. I implemented this with a decorator, and the result looks like this:
def cached_resource(method_to_cache):
def decorator(self, *args, **kwargs):
method_name = method_to_cache.__name__
try:
return self._cache[method_name]
except KeyError:
self._cache[method_name] = method_to_cache(self, *args, **kwargs)
return self._cache[method_name]
return decorator
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
self._cache = {}
@cached_resource
def username(self):
return self.title.split(':')[0]
When I give this code to pylint, it complains about cached_resource
having access to a protected variable of a client class.
I realize that the distinction between public and private isn't a huge deal in Python, but I'm still curious -- have I done something bad here? Is it poor style to have decorators rely on implementation details of the classes they're associated with?
EDIT: I'm unclear about how the closure in Duncan's answer works, so maybe this is a little bit kludge-y, but would this be a simpler solution?
def cached_resource(method_to_cache):
def decorator(self, *args, **kwargs):
method_name = method_to_cache.__name__
try:
return self._cache[method_name]
except KeyError:
self._cache[method_name] = method_to_cache(self, *args, **kwargs)
except AttributeError:
self._cache = {}
self._cache[method_name] = method_to_cache(self, *args, **kwargs)
finally:
return self._cache[method_name]
return decorator
There is a bit of a code smell about that, I think I would agree with pylint on this one though it is quite subjective.
Your decorator looks like it is a general-purpose decorator, but it is tied into the internal implementation detail of the class. If you tried to use it from another class it won't work without the initialisation of _cache
in __init__
. The linkage I don't like is that the knowledge of an attribute called '_cache' is shared between both the class and the decorator.
You could move the initialisation of _cache
out of __init__
and into the decorator. I don't know if that would help pacify pylint and it still requires the class to know about and avoid using the attribute. A cleaner solution here (I think) would be to pass the name of the cache attribute into the decorator. That should break the linkage cleanly:
def cached_resource(cache_attribute):
def decorator_factory(method_to_cache):
def decorator(self, *args, **kwargs):
method_name = method_to_cache.__name__
cache = getattr(self, cache_attribute)
try:
return cache[method_name]
except KeyError:
result = cache[method_name] = method_to_cache(self, *args, **kwargs)
return result
return decorator
return decorator_factory
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
self._cache = {}
@cached_resource('_cache')
def username(self):
return self.title.split(':')[0]
And if you don't like a lot of decorator calls repeating the name of the attribute then:
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
self._cache = {}
with_cache = cached_resource('_cache')
@with_cache
def username(self):
return self.title.split(':')[0]
Edit:
Martineau suggests this may be overkill. It could be if you don't actually need separate access to the _cache
attribute inside the class (e.g. to have a cache reset method). In that case you could manage the cache entirely within the decorator, but if you are going to do that you don't need a cache dictionary on the instance at all, as you can store the cache in the decorator and key on the Profile
instance:
from weakref import WeakKeyDictionary
def cached_resource(method_to_cache):
cache = WeakKeyDictionary()
def decorator(self, *args, **kwargs):
try:
return cache[self]
except KeyError:
result = cache[self] = method_to_cache(self, *args, **kwargs)
return result
return decorator
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
self._cache = {}
@cached_resource
def username(self):
return self.title.split(':')[0]