
Python class structuring, getting and setting properties and inheritance


This is a problem I come across quite frequently, and I finally want to find out the most Pythonic solution for it. At its core, the situation can be described as follows:

  1. A class contains some attributes that must be calculated or tested based on __init__() parameters.
  2. Attributes may or may not allow alteration after the class has been instantiated.
  3. Attributes that are allowed to be altered must be recalculated/tested each time after doing so.
  4. The class should be "inheritance friendly."

Goal in example project: A geometry module that helps with 2D shape calculations (such as the minimum distance between polygons or intersection points between lines). Lines and Polygons have quite a few attributes in common, so I create a BaseGeometry class from which the Line and Polygon classes inherit. The BaseGeometry attributes:

  • points: a list of n (x, y) points that is set on instantiation and can be changed afterwards. Each time it is set, it must be asserted to be a numpy array.
  • domain: tuple of (xmin, xmax, ymin, ymax) that describes the shape's bounds. Must be recalculated each time points is set, but may not be externally altered.

Below, I have written up three (simplified) solutions for approaching this problem, all with their pros and cons.

METHOD 1: Wrongly allows alteration of domain and unclear on which class attributes exist.

import numpy as np

class BaseGeometry1:

    def __init__(self, points):
        self.set_points(points)

    def set_points(self, points):
        assert type(points) is np.ndarray
        self.points = points
        x, y = points.T
        # domain is a plain attribute here, so nothing stops outside code from overwriting it
        self.domain = (min(x), max(x), min(y), max(y))

METHOD 2: Still allows domain to be altered. Clear on which attributes exist, but feels clunky/illogical. It is arbitrary whether calculate_domain() takes points as a parameter or just uses self.points. calculate_domain() would have to be called each time points is updated.

class BaseGeometry2:
    
    def __init__(self, points):
        self.points = self.set_points(points)
        self.domain = self.calculate_domain()

    def set_points(self, points):
        assert type(points) is np.ndarray
        return points

    def calculate_domain(self):
        x, y = self.points.T
        return (min(x), max(x), min(y), max(y))

METHOD 3: I feel like this is on the right track, but I'm still unsure about the structuring. It feels weird that points.setter also sets _domain. Also, is it bad practice to just use the @property decorators for all class attributes?

class BaseGeometry3:

    def __init__(self, points):
        self.points = points

    @property
    def points(self):
        return self._points

    @points.setter
    def points(self, points):
        assert type(points) is np.ndarray
        self._points = points
        x, y = points.T
        self._domain = (min(x), max(x), min(y), max(y))

    @property
    def domain(self):
        return self._domain

My questions are as follows:

  1. Is one of these methods considered conventional? Why/why not?
  2. What are some additional conventions to keep in mind while approaching this problem?
  3. What are some other tips or sources that can help me to improve my class structuring?
  4. Would a class inheriting from BaseGeometry3 have to completely redefine the points.setter method in order to introduce some new attributes that are calculated from points?

Also, I'm new to asking questions here, so any feedback on the post is welcome as well. Thanks in advance for your time and any answers!

Kind regards,

Joost


Solution

  • There's no one right answer here, but I like parts of Method 2 and Method 3.

    How about something like this? This uses the descriptor protocol, of which property is just one specific implementation. It adheres more closely to the DRY ("don't repeat yourself") principle than writing a @property for each attribute in your class.

    class UnsettableGeometricDescriptor:
        def __set_name__(self, owner, name):
            self.name = name
            self.lookup_name = f'_{name}'
    
        def __get__(self, obj, type=None):
            return getattr(obj, self.lookup_name)
    
        def __set__(self, obj, value):
            raise AttributeError(f'Attribute {self.name} cannot be set directly, use method "update_shape" instead')
    
    
    class BaseGeometry4:
        def __init__(self, points):
            self.update_shape(points)
    
        points = UnsettableGeometricDescriptor()
        domain = UnsettableGeometricDescriptor()
    
        def update_shape(self, points):
            assert isinstance(points, np.ndarray), "points parameter has to be a numpy array!!!"
            self._points = points
            x, y = points.T
            self._domain = (min(x), max(x), min(y), max(y))
    

    The nice thing about this method is its extensibility -- if you need to add more unsettable attributes to a subclass, it's easy to do so, and you can extend update_shape() by calling super() on it in a subclass.
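
    As a rough sketch of that extensibility claim, the block below repeats the two classes from above so that it runs on its own, then adds a subclass. Polygon4 and its vertex_mean attribute are hypothetical names made up for illustration; they are not part of the question or of any library.

```python
import numpy as np


class UnsettableGeometricDescriptor:
    """Read-only view onto the private '_<name>' attribute of an instance."""

    def __set_name__(self, owner, name):
        self.name = name
        self.lookup_name = f'_{name}'

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.lookup_name)

    def __set__(self, obj, value):
        raise AttributeError(
            f'Attribute {self.name} cannot be set directly, '
            'use method "update_shape" instead'
        )


class BaseGeometry4:
    points = UnsettableGeometricDescriptor()
    domain = UnsettableGeometricDescriptor()

    def __init__(self, points):
        self.update_shape(points)

    def update_shape(self, points):
        assert isinstance(points, np.ndarray), "points has to be a numpy array"
        self._points = points
        x, y = points.T
        self._domain = (min(x), max(x), min(y), max(y))


class Polygon4(BaseGeometry4):
    # Hypothetical subclass: one extra read-only attribute derived from points.
    vertex_mean = UnsettableGeometricDescriptor()

    def update_shape(self, points):
        # The base class validates the input and sets _points and _domain.
        super().update_shape(points)
        # Average of the vertices (not the area centroid), recomputed on every update.
        self._vertex_mean = tuple(points.mean(axis=0))


square = Polygon4(np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]]))
print(square.domain, square.vertex_mean)
```

    The subclass only declares the new descriptor and extends update_shape(); it does not have to repeat the validation or the domain calculation, which is what question 4 was worried about.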

    What's going on here?

    If you create an instance of BaseGeometry4, you'll find that it has a points attribute and a domain attribute, but neither can be set directly.

    When you access the points attribute, Python invokes the __get__ method of the UnsettableGeometricDescriptor class. This __get__ method simply redirects the lookup to a private attribute of the BaseGeometry4 instance: _points for points, and _domain for domain. The descriptor's __set__ method just raises an exception, since you never want a user to update those attributes directly. The values the descriptors defer to (_points and _domain) are set within the update_shape() method of BaseGeometry4.
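
    To make that lookup chain concrete, here is a small self-contained sketch. It repeats the classes from above so that it runs on its own; the sample coordinates and the variable names g, descriptor and message are arbitrary choices for illustration.

```python
import numpy as np


class UnsettableGeometricDescriptor:
    def __set_name__(self, owner, name):
        self.name = name
        self.lookup_name = f'_{name}'

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.lookup_name)

    def __set__(self, obj, value):
        raise AttributeError(
            f'Attribute {self.name} cannot be set directly, '
            'use method "update_shape" instead'
        )


class BaseGeometry4:
    points = UnsettableGeometricDescriptor()
    domain = UnsettableGeometricDescriptor()

    def __init__(self, points):
        self.update_shape(points)

    def update_shape(self, points):
        assert isinstance(points, np.ndarray), "points has to be a numpy array"
        self._points = points
        x, y = points.T
        self._domain = (min(x), max(x), min(y), max(y))


g = BaseGeometry4(np.array([[0, 0], [3, 1], [1, 4]]))

# The descriptor object lives on the class, not on the instance.
descriptor = type(g).__dict__['points']

# g.points is shorthand for calling the descriptor's __get__ by hand,
# which in turn returns the private attribute stored on the instance.
assert descriptor.__get__(g, type(g)) is g.points
assert g.points is g._points

# Only the private names are stored in the instance dictionary.
stored_names = sorted(vars(g))

# Assignment is routed to __set__, which refuses it.
try:
    g.domain = (0, 1, 0, 1)
except AttributeError as exc:
    message = str(exc)

# Going through update_shape() replaces points and recalculates domain.
g.update_shape(np.array([[5, 5], [6, 8]]))
```

    Because the descriptor defines __set__, Python consults it before the instance dictionary on assignment, so g.domain = ... can never silently create an instance attribute that shadows the computed value.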

    There's a great RealPython tutorial on the descriptor protocol, and the official Python docs cover descriptors in the Descriptor HowTo Guide.