
Build a dictionary of selected features


I have 20K objects and a set of features provided in a list. I need to extract those features from each object and save them into a dictionary. Each object has almost 100 features.

For example:

# object1
Object1.Age = '20'
Object1.Gender = 'Female'
Object1.DOB = '03/05/1997'
Object1.Weight = '130lb'
Object1.Height = '5.5'

# object2
Object2.Age = '22'
Object2.Gender = 'Male'
Object2.DOB = '03/05/1995'
Object2.Weight = '145lb'
Object2.Height = '5.8'

# object3
Object3.Age = '22'
Object3.Gender = 'Male'
Object3.DOB = '03/05/1995'
Object3.Weight = '145lb'

#object4
...

And the list of features that I need to extract from each object (this list may change, so I need the code to be flexible about it):

features = ['Gender', 
        'DOB', 
        'Height']

Currently, I'm using this function to capture all the features that I need from each object:

def get_features(obj, features):
    return {f: getattr(obj, f) for f in features}

This function works perfectly if every object has all the features that I want, but some objects do not. For example, object3 does not have a field named "Height", so getattr raises an AttributeError. How can I put NaN as the value of missing fields in my dictionary so that I avoid the error?


Solution

  • Python getattr documentation:

    getattr(object, name[, default]) Return the value of the named attribute of object. name must be a string. If the string is the name of one of the object’s attributes, the result is the value of that attribute. For example, getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise AttributeError is raised.

    You could just do this:

    def get_features(obj, features):
        return {f: getattr(obj, f, float('nan')) for f in features}
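
To see the default in action, here is a minimal sketch using the data from the question. The `SimpleNamespace` objects are only stand-ins for the real objects, and the `default` parameter and the `rows` and `missing` names are my own additions for illustration, not part of the original code:

```python
import math
from types import SimpleNamespace


def get_features(obj, features, default=float("nan")):
    """Return {feature: value}, using `default` for attributes the object lacks."""
    return {f: getattr(obj, f, default) for f in features}


# Hypothetical stand-ins for object1 and object3 from the question
obj1 = SimpleNamespace(Age="20", Gender="Female", DOB="03/05/1997",
                       Weight="130lb", Height="5.5")
obj3 = SimpleNamespace(Age="22", Gender="Male", DOB="03/05/1995",
                       Weight="145lb")  # no Height attribute

features = ["Gender", "DOB", "Height"]
rows = [get_features(o, features) for o in (obj1, obj3)]

# NaN never compares equal to itself, so detect it with math.isnan, not ==
missing = [f for f, v in rows[1].items()
           if isinstance(v, float) and math.isnan(v)]

print(rows[0])
print(missing)
```

Keep in mind that NaN is not equal to itself, so a check like `value == float('nan')` is always False. If that gets in the way later, passing `None` as the default may be easier to test for.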