Search code examples
pythonoopstylesdecoratorinformation-hiding

Is it okay for decorators to access private members of a class?


I'm writing a class that parses HTML in order to provide an interface to a profile on a webpage. It looks something like this:

class Profile(BeautifulSoup):
    def __init__(self, page_source):
        super().__init__(page_source)

    def username(self):
        return self.title.split(':')[0]

Except more complex and time consuming. Since I know that the underlying profiles aren't going to be changing during the lifetime of a Profile object, I thought this would be a good place to cache results in order to avoid recalculating values that are already known. I implemented this with a decorator, and the result looks like this:

def cached_resource(method_to_cache):
    def decorator(self, *args, **kwargs):
        method_name = method_to_cache.__name__

        try:
            return self._cache[method_name]
        except KeyError:
            self._cache[method_name] = method_to_cache(self, *args, **kwargs)
            return self._cache[method_name]

    return decorator


class Profile(BeautifulSoup):
    def __init__(self, page_source):
        super().__init__(page_source)
        self._cache = {}

    @cached_resource
    def username(self):
        return self.title.split(':')[0]

When I give this code to pylint, it complains about cached_resource having access to a protected variable of a client class.

I realize that the distinction between public and private isn't a huge deal in Python, but I'm still curious -- have I done something bad here? Is it poor style to have decorators rely on implementation details of the classes they're associated with?

EDIT: I'm unclear about how the closure in Duncan's answer works, so maybe this is a little bit kludge-y, but would this be a simpler solution?

def cached_resource(method_to_cache):
    def decorator(self, *args, **kwargs):
    method_name = method_to_cache.__name__

    try:
        return self._cache[method_name]
    except KeyError:
        self._cache[method_name] = method_to_cache(self, *args, **kwargs)
    except AttributeError:
        self._cache = {}
        self._cache[method_name] = method_to_cache(self, *args, **kwargs)
    finally:
        return self._cache[method_name]

return decorator

Solution

  • There is a bit of a code smell about that, I think I would agree with pylint on this one though it is quite subjective.

    Your decorator looks like it is a general-purpose decorator, but it is tied into the internal implementation detail of the class. If you tried to use it from another class it won't work without the initialisation of _cache in __init__. The linkage I don't like is that the knowledge of an attribute called '_cache' is shared between both the class and the decorator.

    You could move the initialisation of _cache out of __init__ and into the decorator. I don't know if that would help pacify pylint and it still requires the class to know about and avoid using the attribute. A cleaner solution here (I think) would be to pass the name of the cache attribute into the decorator. That should break the linkage cleanly:

    def cached_resource(cache_attribute):
      def decorator_factory(method_to_cache):
        def decorator(self, *args, **kwargs):
            method_name = method_to_cache.__name__
            cache = getattr(self, cache_attribute)
            try:
                return cache[method_name]
            except KeyError:
                result = cache[method_name] = method_to_cache(self, *args, **kwargs)
                return result
    
        return decorator
      return decorator_factory
    
    
    class Profile(BeautifulSoup):
        def __init__(self, page_source):
            super().__init__(page_source)
            self._cache = {}
    
        @cached_resource('_cache')
        def username(self):
            return self.title.split(':')[0]
    

    And if you don't like a lot of decorator calls repeating the name of the attribute then:

    class Profile(BeautifulSoup):
        def __init__(self, page_source):
            super().__init__(page_source)
            self._cache = {}
    
        with_cache = cached_resource('_cache')
    
        @with_cache
        def username(self):
            return self.title.split(':')[0]
    

    Edit: Martineau suggests this may be overkill. It could be if you don't actually need separate access to the _cache attribute inside the class (e.g. to have a cache reset method). In that case you could manage the cache entirely within the decorator, but if you are going to do that you don't need a cache dictionary on the instance at all, as you can store the cache in the decorator and key on the Profile instance:

    from weakref import WeakKeyDictionary
    
    def cached_resource(method_to_cache):
        cache = WeakKeyDictionary()
        def decorator(self, *args, **kwargs):
            try:
                return cache[self]
            except KeyError:
                result = cache[self] = method_to_cache(self, *args, **kwargs)
            return result
        return decorator
    
    class Profile(BeautifulSoup):
        def __init__(self, page_source):
            super().__init__(page_source)
            self._cache = {}
    
        @cached_resource
        def username(self):
            return self.title.split(':')[0]