python, python-3.x, defaultdict

How to aggregate defaultdict data


I have a list of dictionaries (stored under the "resultset" key) as follows:

from decimal import Decimal

result = {
    "resultset": [
        {"name": "DOG", "threshold": Decimal("1.45600000"), "current_value": 124},
        {"name": "DOG", "threshold": Decimal("1.45600000"), "current_value": 14},
        {"name": "DOG", "threshold": Decimal("1.45600000"), "current_value": 1},
        {"name": "CAT", "threshold": Decimal("1.45600000"), "current_value": 24},
        {"name": "CAT", "threshold": Decimal("1.45600000"), "current_value": 4},
    ]
}

Now I want to do two things, basically an aggregation per name where I get:

  1. A list of current_values []
  2. An average of the threshold value

So in the end I want to see:

{
'DOG': {'current_values': [124,14,1], 'threshold': the average of threshold},
'CAT': {'current_values': [24,4] , 'threshold': the average of threshold}
}

I got half of it working: I can build the list of current_values with a defaultdict, as below, but I can't get the whole result:

from collections import defaultdict

all_animals = defaultdict(list)
for i in result['resultset']:
    all_animals[i['name']].append(float(i['current_value']))

Can someone please help me out?


Solution

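  • Piece of cake with defaultdict and statistics: collect the current values and the thresholds per name, then average the thresholds with statistics.mean: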

    from decimal import Decimal
    from collections import defaultdict
    import statistics
    
    result = {
        "resultset": [
            {
                "name": "DOG",
                "threshold": Decimal("1.45600000"),
                "current_value": 124,
            },
            {
                "name": "DOG",
                "threshold": Decimal("1.45600000"),
                "current_value": 14,
            },
            {
                "name": "DOG",
                "threshold": Decimal("1.45600000"),
                "current_value": 1,
            },
            {
                "name": "CAT",
                "threshold": Decimal("1.45600000"),
                "current_value": 24,
            },
            {
                "name": "CAT",
                "threshold": Decimal("1.45600000"),
                "current_value": 4,
            },
        ]
    }
    
    current_values_by_name = defaultdict(list)
    thresholds_by_name = defaultdict(list)
    for x in result["resultset"]:
        current_values_by_name[x["name"]].append(x["current_value"])
        thresholds_by_name[x["name"]].append(x["threshold"])
    
    aggregate_result = {
        name: {
            "current_values": current_values_by_name[name],
            "threshold": statistics.mean(thresholds_by_name[name]),
        }
        for name in current_values_by_name
    }
    
    print(aggregate_result)
    

    This outputs (reformatted here for readability):

    {
        "DOG": {
            "current_values": [124, 14, 1],
            "threshold": Decimal("1.456"),
        },
        "CAT": {
            "current_values": [24, 4],
            "threshold": Decimal("1.456"),
        },
    }
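
If you would rather keep everything in one structure, the same grouping can be sketched with a single defaultdict whose factory builds both lists per name. This is only an illustrative variant, not part of the answer above: the function name aggregate_by_name and the sample rows (with deliberately different thresholds, so the average is visible) are made up for the example.

```python
from collections import defaultdict
from decimal import Decimal
import statistics


def aggregate_by_name(rows):
    """Group rows by "name": collect current values, average the thresholds."""
    # One defaultdict; the factory creates both per-name lists on first access.
    grouped = defaultdict(lambda: {"current_values": [], "thresholds": []})
    for row in rows:
        bucket = grouped[row["name"]]
        bucket["current_values"].append(row["current_value"])
        bucket["thresholds"].append(row["threshold"])

    # Build a plain dict; statistics.mean returns a Decimal for Decimal inputs.
    return {
        name: {
            "current_values": bucket["current_values"],
            "threshold": statistics.mean(bucket["thresholds"]),
        }
        for name, bucket in grouped.items()
    }


# Hypothetical sample data, not the data from the question.
rows = [
    {"name": "DOG", "threshold": Decimal("1.5"), "current_value": 124},
    {"name": "DOG", "threshold": Decimal("2.5"), "current_value": 14},
    {"name": "CAT", "threshold": Decimal("1.0"), "current_value": 24},
]
summary = aggregate_by_name(rows)
print(summary)
```

Returning a plain dict instead of the defaultdict means that looking up a missing name later raises KeyError rather than silently creating an empty entry.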