python python-3.x iteration python-itertools

List of dictionaries to summarise

There is a list of dictionaries:

given_list = [
    {"cat_id": 1, "category": "red", "items": 1},
    {"cat_id": 1, "category": "red", "items": 3},
    {"cat_id": 2, "category": "yellow", "items": 2},
    {"cat_id": 2, "category": "yellow", "items": 4},
    {"cat_id": 2, "category": "yellow", "items": 6},
    {"cat_id": 3, "category": "green", "items": 99},
]

The outcome should be another list in which the "items" values are summed per category:

outcome = [
    {"cat_id": 1, "category": "red", "items": 4},
    {"cat_id": 2, "category": "yellow", "items": 12},
    {"cat_id": 3, "category": "green", "items": 99}
]

What is the best way to do this?

Details of the question:

"cat_id" always matches "category"
"cat_id" is not necessarily ordered in the given list
dicts from different categories may be mixed in any order
the outcome must have exactly this structure: a list of dicts

Solution

You can use groupby from the itertools module. The idea is to group the dictionaries by the value of "cat_id". Inside the for loop, you take the first dictionary of each group as the starting entry in the result list, then for each remaining dictionary in the group you add its "items" value to that entry.

from itertools import groupby

given_list = [
    {"cat_id": 1, "category": "red", "items": 1},
    {"cat_id": 1, "category": "red", "items": 3},
    {"cat_id": 2, "category": "yellow", "items": 2},
    {"cat_id": 2, "category": "yellow", "items": 4},
    {"cat_id": 2, "category": "yellow", "items": 6},
    {"cat_id": 3, "category": "green", "items": 99},
]

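def cat_id(d):
    return d["cat_id"]

outcome = []
# groupby only merges adjacent items, so sort by "cat_id" first.
for _, g in groupby(sorted(given_list, key=cat_id), key=cat_id):
    # Start from a copy of the group's first dict so given_list is not modified.
    outcome.append(dict(next(g)))
    # Add the "items" of the remaining dicts in the group.
    for d in g:
        outcome[-1]["items"] += d["items"]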

Note that groupby only groups consecutive elements that share a key, so this approach relies on the list being sorted by "cat_id". If your data is not sorted, sort it by that key before grouping.
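
To illustrate the sorting requirement, here is a minimal sketch that runs the same idea on interleaved input. The helper name summarise and the list shuffled are hypothetical names introduced only for this example; the data is the question's data in a different order.

```python
from itertools import groupby
from operator import itemgetter


def summarise(rows):
    """Sum "items" per "cat_id" (summarise is a hypothetical helper name)."""
    by_cat_id = itemgetter("cat_id")
    result = []
    # sorted() returns a new list, so the caller's list keeps its order.
    for _, group in groupby(sorted(rows, key=by_cat_id), key=by_cat_id):
        # Copy the first dict so the input dicts are left unchanged.
        total = dict(next(group))
        for row in group:
            total["items"] += row["items"]
        result.append(total)
    return result


# Same dicts as in the question, but with the categories interleaved.
shuffled = [
    {"cat_id": 2, "category": "yellow", "items": 2},
    {"cat_id": 1, "category": "red", "items": 1},
    {"cat_id": 3, "category": "green", "items": 99},
    {"cat_id": 2, "category": "yellow", "items": 4},
    {"cat_id": 1, "category": "red", "items": 3},
    {"cat_id": 2, "category": "yellow", "items": 6},
]

# Without sorting, every run of equal keys becomes its own group.
unsorted_groups = [key for key, _ in groupby(shuffled, key=itemgetter("cat_id"))]
print(unsorted_groups)  # [2, 1, 3, 2, 1, 2]

print(summarise(shuffled))
```

Because no two neighbouring dicts in shuffled share a "cat_id", grouping it directly yields six groups instead of three. Sorting inside the helper costs O(n log n), but it means callers do not have to remember the ordering requirement.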