
List of dictionaries to summarise


There is a list of dictionaries:

given_list = [
    {"cat_id": 1, "category": "red", "items": 1},
    {"cat_id": 1, "category": "red", "items": 3},
    {"cat_id": 2, "category": "yellow", "items": 2},
    {"cat_id": 2, "category": "yellow", "items": 4},
    {"cat_id": 2, "category": "yellow", "items": 6},
    {"cat_id": 3, "category": "green", "items": 99},
]

The outcome should be another list in which "items" is summed per category:

outcome = [
    {"cat_id": 1, "category": "red", "items": 4},
    {"cat_id": 2, "category": "yellow", "items": 12},
    {"cat_id": 3, "category": "green", "items": 99}
]

Please advise on the best way to do this.

Details of the question:

  • "cat_id" always matches "category"
  • "cat_id" is not necessarily ordered in the given list
  • dicts of different categories may be mixed in random order
  • the outcome must have exactly this structure: a list of dicts

Solution

  • You can take advantage of groupby from the itertools module: group the dictionaries by the value of "cat_id". In the for loop, start each group's entry in the result list from the group's first dictionary, then add the "items" value of every remaining dictionary in that group to it.

    from itertools import groupby
    
    given_list = [
        {"cat_id": 1, "category": "red", "items": 1},
        {"cat_id": 1, "category": "red", "items": 3},
        {"cat_id": 2, "category": "yellow", "items": 2},
        {"cat_id": 2, "category": "yellow", "items": 4},
        {"cat_id": 2, "category": "yellow", "items": 6},
        {"cat_id": 3, "category": "green", "items": 99},
    ]
    
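    # groupby only merges consecutive items, so sort by "cat_id" first.
    ordered = sorted(given_list, key=lambda x: x["cat_id"])
    
    outcome = []
    for _, g in groupby(ordered, key=lambda x: x["cat_id"]):
        # Copy the group's first dict so the dicts in given_list stay unchanged.
        outcome.append(dict(next(g)))
        for d in g:
            outcome[-1]["items"] += d["items"]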
    

    One thing to note: groupby only groups consecutive elements that share the same key, so the list must be sorted by "cat_id" before it is grouped. On unsorted input, the same category would show up in the outcome more than once.
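
    Since the question says the dicts may arrive in any order, a plain dict keyed by "cat_id" is another option that needs no sorting at all. The following is only a sketch of that alternative, not part of the groupby approach above. The function name summarise and the sample list shuffled are made up for illustration, and it assumes Python 3.7+, where dicts keep insertion order, so the categories come out in first-seen order.

```python
def summarise(rows):
    """Sum "items" per "cat_id"; rows may be in any order."""
    totals = {}  # cat_id -> summary dict, kept in first-seen order
    for row in rows:
        cat_id = row["cat_id"]
        if cat_id in totals:
            totals[cat_id]["items"] += row["items"]
        else:
            # Copy the dict so the input rows are left untouched.
            totals[cat_id] = dict(row)
    return list(totals.values())


# Hypothetical unsorted input with the same data as given_list.
shuffled = [
    {"cat_id": 2, "category": "yellow", "items": 2},
    {"cat_id": 1, "category": "red", "items": 1},
    {"cat_id": 3, "category": "green", "items": 99},
    {"cat_id": 2, "category": "yellow", "items": 4},
    {"cat_id": 1, "category": "red", "items": 3},
    {"cat_id": 2, "category": "yellow", "items": 6},
]

outcome = summarise(shuffled)
```

    If the outcome has to be ordered by "cat_id", sorting the (much shorter) result with sorted(outcome, key=lambda d: d["cat_id"]) should be enough.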