
Python: Dynamically generating attributes from a list


I want to dynamically generate the attributes of a class from a list or dictionary. The idea is that I define a list of attribute names and can then access those attributes as my_class.my_attribute.

For example:

class Campaign(metaclass=MetaCampaign):
    _LABELS = ['campaign_type', 'match_type', 'audience_type'] # <-- my list of attributes
    
    for label in _LABELS:
        setattr(cls, label, LabelDescriptor(label))
    
    def __init__(self, campaign_protobuf, labels_dict):
        self._proto = campaign_protobuf
        self._init_labels(labels_dict)
        
    def _init_labels(self, labels_dict):
        # magic...

This obviously won't work because cls doesn't exist, but I'd like:

my_campaign = Campaign(campaign, label_dict)
print(my_campaign.campaign_type)

to return the value campaign_type for the campaign. This is obviously a little complicated, as campaign_type is actually a Descriptor and does a bit of work to retrieve a value from a base Label object.


The Descriptor:

import re
from weakref import WeakKeyDictionary


class DescriptorProperty(object):
    def __init__(self):
        self.data = WeakKeyDictionary()

    def __set__(self, instance, value):
        self.data[instance] = value


class LabelTypeDescriptor(DescriptorProperty):
    """A descriptor that returns the relevant metadata from the label"""
    def __init__(self, pattern):
        super().__init__()
        self.cached_data = WeakKeyDictionary()
        # Regex pattern to look in the label:
        #       r'label_type:ThingToReturn'
        self.pattern = f"{pattern}:(.*)"

    def __get__(self, instance, owner, refresh=False):
        # In order to balance computational speed with memory usage, we cache label values
        # when they are first accessed.        
        if self.cached_data.get(instance, None) is None or refresh:
            ctype = re.search(self.pattern, self.data[instance].name) # <-- does a regex search on the label name (e.g. campaign_type:Primary)
            if ctype is None:
                ctype = False
            else:
                ctype = ctype.group(1)
            self.cached_data[instance] = ctype
        return self.cached_data[instance]

This enables me to easily access the value of a label, and if the label is of a type that I care about, it will return the relevant value, otherwise it will return False.


The Label Object:

class Label(Proto):
    _FIELDS = ['id', 'name']
    _PROTO_NAME = 'label'
    #  We define what labels can pull metadata directly through a property
    campaign_type = LabelTypeDescriptor('campaign_type')
    match_type = LabelTypeDescriptor('match_type')
    audience_type = LabelTypeDescriptor('audience_type')

    def __init__(self, proto, **kwargs):
        self._proto = proto
        self._set_default_property_values(self)  # <-- the 'self' is intentional here, in the campaign object a label would be passed instead.

    def _set_default_property_values(self, proto_wrapper):
        props = [key for (key, obj) in self.__class__.__dict__.items() if isinstance(obj, DescriptorProperty)]
        for prop in props:
            setattr(self, prop, proto_wrapper)

So if I have a protobuf label object stored in my Label (which is basically just a wrapper) which looks like this:

resource_name: "customers/12345/labels/67890"
id {
  value: 67890
}
name {
  value: "campaign_type:Primary"
}

Then my_label.campaign_type would return Primary, and similarly my_label.match_type would return False.


The reason is that I'm creating a number of classes that are all labelled in the same way and may have a lot of labels. Currently this all works as described, but I'd like to define the attributes more dynamically, as they all follow the same pattern. So instead of:

    campaign_type = LabelTypeDescriptor('campaign_type')
    match_type = LabelTypeDescriptor('match_type')
    audience_type = LabelTypeDescriptor('audience_type')
    ... # (many more labels)

I simply have: _LABELS = ['campaign_type', 'match_type', 'audience_type', ... many more labels] and then have some loop that creates the attributes.

In turn I can cascade a similar approach through to my other classes, so that while a Campaign object may hold a Label object, I can access the value of the label simply by my_campaign.campaign_type. If the campaign does not have a label of the appropriate type, it will simply return False.


Solution

  • While cls does not exist when the class body is run, you can set the attributes by simply setting them in the dictionary returned by locals() inside the class body:

    class Campaign(metaclass=MetaCampaign):
        _LABELS = ['campaign_type', 'match_type', 'audience_type'] # <-- my list of attributes
        
        for label in _LABELS:
            locals()[label] = LabelDescriptor(label)
        del label  # so you don't get a spurious "label" attribute in your class 
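
The locals() approach above can be sketched as a small, self-contained example. LabelDescriptor here is a hypothetical stand-in (the question's real descriptor does a regex lookup on a label name), and the metaclass is left out to keep the sketch short:

```python
class LabelDescriptor:
    """Hypothetical stand-in for the question's descriptor: it simply
    reads the value from a per-instance dict of labels."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        # Mirror the question's convention: False when the label is absent.
        return instance._labels.get(self.name, False)


class Campaign:
    _LABELS = ['campaign_type', 'match_type', 'audience_type']

    # In a class body, locals() is the namespace that becomes the class
    # __dict__, so writing to it creates class attributes.
    for label in _LABELS:
        locals()[label] = LabelDescriptor(label)
    del label  # avoid a stray "label" class attribute

    def __init__(self, labels_dict):
        self._labels = labels_dict


my_campaign = Campaign({'campaign_type': 'Primary'})
print(my_campaign.campaign_type)  # Primary
print(my_campaign.match_type)     # False
```

Note that this relies on locals() at class scope; writing to locals() inside a function does not create variables.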
    
    

    Other than that, you can use a metaclass, yes, but you can also use an __init_subclass__ method on a base class. Fewer metaclasses mean fewer "moving parts" that can behave in strange ways in your system.

    So, say your Proto class is the base for all others that need this feature:

    class Proto:
        def __init_subclass__(cls, **kwd):
            super().__init_subclass__(**kwd)
            for label in cls._LABELS:
                setattr(cls, label, LabelDescriptor(label))
        ...
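
A runnable sketch of the __init_subclass__ idea, under a few assumptions: LabelDescriptor is again a hypothetical stand-in that reads from a per-instance dict, AdGroup is an invented second subclass, and the base class gets an empty default _LABELS so that subclasses without labels do not raise AttributeError:

```python
class LabelDescriptor:
    """Hypothetical stand-in descriptor; the real one would parse label names."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance._labels.get(self.name, False)


class Proto:
    _LABELS = ()  # assumed default, so subclasses without labels still work

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once for every subclass, right after its body has executed.
        for label in cls._LABELS:
            setattr(cls, label, LabelDescriptor(label))


class Campaign(Proto):
    _LABELS = ['campaign_type', 'match_type']

    def __init__(self, labels_dict):
        self._labels = labels_dict


class AdGroup(Proto):  # hypothetical second class sharing the mechanism
    _LABELS = ['audience_type']

    def __init__(self, labels_dict):
        self._labels = labels_dict


print(Campaign({'campaign_type': 'Primary'}).campaign_type)  # Primary
print(AdGroup({}).audience_type)                             # False
```

Each subclass only gets the descriptors named in its own _LABELS, so the list is the single place to add a new label.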
    

    I had a look at your descriptors and the code there; if they are already working, I'd say they are all right.

    I can comment that it is more usual to store descriptor-related data in the instance's __dict__ itself, instead of creating the data and cached_data dictionaries in the descriptor, so one doesn't need to care about weakrefs. Both approaches work, though (just this week I implemented a descriptor the way you did, even though I usually go for the instance's __dict__).
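
For illustration, here is a minimal sketch of the instance-__dict__ variant. It is not the question's actual code: the key names (_campaign_type_source and so on) are invented for this example, and Label is reduced to an object with a name attribute.

```python
import re


class LabelTypeDescriptor:
    """Sketch: keeps the wrapped label and the cached match in the
    instance's own __dict__ instead of in WeakKeyDictionary objects."""

    def __init__(self, pattern):
        self.pattern = f"{pattern}:(.*)"
        # Hypothetical private keys used inside the instance __dict__.
        self.source_key = f"_{pattern}_source"
        self.cache_key = f"_{pattern}_cached"

    def __set__(self, instance, value):
        instance.__dict__[self.source_key] = value
        instance.__dict__.pop(self.cache_key, None)  # drop any stale cache

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.cache_key not in instance.__dict__:
            source = instance.__dict__[self.source_key]
            match = re.search(self.pattern, source.name)
            instance.__dict__[self.cache_key] = match.group(1) if match else False
        return instance.__dict__[self.cache_key]


class Label:
    campaign_type = LabelTypeDescriptor('campaign_type')
    match_type = LabelTypeDescriptor('match_type')

    def __init__(self, name):
        self.name = name
        # As in the question, each descriptor is pointed at the label itself.
        self.campaign_type = self
        self.match_type = self


my_label = Label('campaign_type:Primary')
print(my_label.campaign_type)  # Primary
print(my_label.match_type)     # False
```

Because the data lives on the instance, it is released together with the instance and no weak references are needed.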