python dictionary dictionary-comprehension

Summing Dictionary Values


I have a list of dictionaries like this:

[
    {
      "account_name": "Rounded Off (Purchase)",
      "amount": 0.28,
      "doc_date": "2023-04-05",
      "doc": "P.Inv.-1",
      "date_created": "2023-04-05T15:30:42.964203"
    },
    {
      "account_name": "Discount (Purchase)",
      "amount": 100,
      "doc_date": "2023-04-05",
      "doc": "P.Inv.-1",
      "date_created": "2023-04-05T15:30:42.964203"
    },
    {
      "account_name": "Discount (Purchase)",
      "amount": 86.4,
      "doc_date": "2023-04-05",
      "doc": "P.Inv.-1",
      "date_created": "2023-04-05T15:30:42.964203"
    }
]

I would like to simplify the list by summing the "amount" values whenever two dictionaries have the same values for the "doc" and "account_name" keys, e.g. in this case "account_name": "Discount (Purchase)" and "doc": "P.Inv.-1".

I cannot think of a simple solution that avoids lots of placeholder variables and multiple loops over the list.

The expected result should look like this:

[
    {
      "account_name": "Rounded Off (Purchase)",
      "amount": 0.28,
      "doc_date": "2023-04-05",
      "doc": "P.Inv.-1",
      "date_created": "2023-04-05T15:30:42.964203"
    },
    {
      "account_name": "Discount (Purchase)",
      "amount": 186.4,
      "doc_date": "2023-04-05",
      "doc": "P.Inv.-1",
      "date_created": "2023-04-05T15:30:42.964203"
    }
]

Any help is greatly appreciated. Thanks.


Solution

  • It often helps to work through such problems step by step, perhaps with pen and paper, to see how the variables evolve over time. A little effort now can go a long way in the medium to long term.

    My approach would be to create an empty dictionary keyed by the ("account_name", "doc") pair. A new entry is added every time that pair has not been seen before; if it has been seen, the amount of the existing entry is updated. You might want to decide what happens to the "date_created" field, because the code does not examine it at all, so the first value encountered is kept (should it keep the first "date_created" value or the second, and why?). I would write something as follows:

    initial_dictionary_list = [
        {
          "account_name": "Rounded Off (Purchase)",
          "amount": 0.28,
          "doc_date": "2023-04-05",
          "doc": "P.Inv.-1",
          "date_created": "2023-04-05T15:30:42.964203"
        },
        {
          "account_name": "Discount (Purchase)",
          "amount": 100,
          "doc_date": "2023-04-05",
          "doc": "P.Inv.-1",
          "date_created": "2023-04-05T15:30:42.964203"
        },
        {
          "account_name": "Discount (Purchase)",
          "amount": 86.4,
          "doc_date": "2023-04-05",
          "doc": "P.Inv.-1",
          "date_created": "2023-04-05T13:30:42.964203"
        }
      ]
    
    # Built-in generics such as list[dict] need Python 3.9+; no typing import is required.
    
    def process_dictionary_list(dictionary_list: list[dict]) -> list[dict]:
        # Maps each (account_name, doc) pair to its merged dictionary.
        list_of_dictionaries_with_different_keys = {}
        for dictionary in dictionary_list:
            key = (dictionary['account_name'], dictionary['doc'])
            if key in list_of_dictionaries_with_different_keys:
                list_of_dictionaries_with_different_keys[key]['amount'] += dictionary['amount']
            else:
                list_of_dictionaries_with_different_keys[key] = dictionary.copy()
        return list(list_of_dictionaries_with_different_keys.values())
    
    final_dictionary_list = process_dictionary_list(initial_dictionary_list)
    
    print(final_dictionary_list)
    # [{'account_name': 'Rounded Off (Purchase)', 'amount': 0.28, 'doc_date': '2023-04-05', 'doc': 'P.Inv.-1', 'date_created': '2023-04-05T15:30:42.964203'}, {'account_name': 'Discount (Purchase)', 'amount': 186.4, 'doc_date': '2023-04-05', 'doc': 'P.Inv.-1', 'date_created': '2023-04-05T15:30:42.964203'}]
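
The answer leaves open which "date_created" value a merged entry should carry. As one possible way to settle it, here is a minimal sketch that keeps the latest timestamp. The function name `merge_entries` and the `key_fields` parameter are made up for illustration, and the sketch assumes every "date_created" string uses the same ISO 8601 format, so that plain string comparison orders them chronologically.

```python
def merge_entries(entries, key_fields=("account_name", "doc")):
    """Sum "amount" for entries sharing key_fields; keep the latest "date_created"."""
    merged = {}
    for entry in entries:
        key = tuple(entry[field] for field in key_fields)
        if key not in merged:
            # Shallow copy so the caller's dictionaries are not mutated.
            merged[key] = dict(entry)
            continue
        target = merged[key]
        target["amount"] += entry["amount"]
        # Assumption: timestamps share one ISO 8601 format, so max() on the
        # strings picks the chronologically latest one.
        target["date_created"] = max(target["date_created"], entry["date_created"])
    return list(merged.values())


sample = [
    {"account_name": "Discount (Purchase)", "amount": 100,
     "doc": "P.Inv.-1", "date_created": "2023-04-05T15:30:42.964203"},
    {"account_name": "Discount (Purchase)", "amount": 86.4,
     "doc": "P.Inv.-1", "date_created": "2023-04-05T13:30:42.964203"},
]
result = merge_entries(sample)
```

Swapping `max` for `min` would keep the earliest timestamp instead. If the timestamps could arrive in mixed formats, parsing them with `datetime.fromisoformat` before comparing would be the safer choice.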