Tags: python, python-3.x, pyyaml

Python: parse records nested at any level and get the key with the highest value


I have an assignment that I've been trying to solve for hours, to no avail. Paper records can be found nested at any level in the dictionary.

I'm trying to parse a YAML file and get the nested values, where the expected output is:

sales                      highest_sales
sales.yml/sales1/paper1       Dwight

So far, my code only prints the key and value pairs, and I haven't gotten anywhere beyond that. Can anyone please spare a thought?

This is the data loaded from the YAML file:

{'sales1': {'paper1': {'Jim': 8, 'Pam': 9, 'Dwight': 43}}, 'sales2': {'morning_sales': {'paper2': {'Jim': 10, 'Pam': 23, 'Dwight': 67, 'Michael': 12}}, 'evening_sales': {'paper3': {'Jim': 3, 'Pam': 43, 'Dwight': 67}}}, 'paper4': {'Jim': 23, 'Pam': 14, 'Dwight': 8}}

This is my code so far:

import yaml

with open(r"C:\Users\Jason\Assignments\Assignment8\sales.yml", 'r') as yamlfile:
    data = yaml.load(yamlfile, Loader=yaml.FullLoader)


def iterdict(d):
    # Walk the dictionary recursively and print every non-dict key/value pair
    for k, v in d.items():
        if isinstance(v, dict):
            iterdict(v)
        else:
            print(k, ":", v)


iterdict(data)


Solution

  • The key for the maximum value can be found using operator.itemgetter (see the Python documentation for the operator module).
    You can try this code; it should work:

    import yaml
    import operator
    
    def check_max(k, v, prev_key):
        try:
            # Get the key of the maximum sale; comparing two nested
            # dictionaries raises TypeError, which is handled below
            sales = max(v.items(), key=operator.itemgetter(1))[0]
            if isinstance(v[sales], dict):
                # A single nested dictionary is never compared by max(),
                # so check for it explicitly
                raise TypeError("value is a dictionary")
            print("{:38s} {:10s}".format(str(prev_key)+'/'+str(k),sales))
    
        except TypeError:
            # The value was a dictionary, so descend one level
            for k1, v1 in v.items():
                check_max(k1,v1, prev_key+'/'+str(k))
    
    
    with open(r"C:\Users\Jason\Assignments\Assignment8\sales.yml", 'r') as yamlfile:
        data = yaml.load(yamlfile, Loader=yaml.FullLoader)
    
    # Column widths used for the output; increase the
    # values if your paths are longer
    print("{:38s} {:10s}".format("sales","highest_sales"))
    for k,v in data.items():
        check_max(k,v, 'sales.yml')
    

    Output:

    sales                                  highest_sales
    sales.yml/sales1/paper1                Dwight
    sales.yml/sales2/morning_sales/paper2  Dwight
    sales.yml/sales2/evening_sales/paper3  Dwight
    sales.yml/paper4                       Jim
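
As a variation on the recursive idea above, here is a minimal sketch that tests for nested dictionaries with isinstance instead of relying on exceptions, and returns the results rather than printing them directly. The helper name find_top_sellers is hypothetical, and the sketch assumes that a "paper" record is any non-empty dictionary whose values are all plain numbers. The data is written inline (copied from the question) so the example runs without the YAML file.

```python
from operator import itemgetter

# Same structure as the dictionary printed in the question.
data = {
    'sales1': {'paper1': {'Jim': 8, 'Pam': 9, 'Dwight': 43}},
    'sales2': {
        'morning_sales': {'paper2': {'Jim': 10, 'Pam': 23, 'Dwight': 67, 'Michael': 12}},
        'evening_sales': {'paper3': {'Jim': 3, 'Pam': 43, 'Dwight': 67}},
    },
    'paper4': {'Jim': 23, 'Pam': 14, 'Dwight': 8},
}


def find_top_sellers(node, path):
    """Yield (path, name) for every leaf record found under node.

    Hypothetical helper: a leaf record is assumed to be a non-empty
    dict that contains no nested dicts (name -> number).
    """
    if not any(isinstance(value, dict) for value in node.values()):
        if node:
            # On ties, max() keeps the first item with the largest value
            name, _ = max(node.items(), key=itemgetter(1))
            yield path, name
        return
    # Otherwise descend into every nested dictionary, extending the path
    for key, value in node.items():
        if isinstance(value, dict):
            yield from find_top_sellers(value, f"{path}/{key}")


results = list(find_top_sellers(data, 'sales.yml'))

print("{:38s} {:10s}".format("sales", "highest_sales"))
for path, name in results:
    print("{:38s} {:10s}".format(path, name))
```

For the sample data this should list the same four paths as the output above. Because the function yields tuples, the rows can also be written to a file or sorted before printing.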