Search code examples
pythonexcelpython-3.xlistdata-dictionary

How to combine dictionaries that have the same element inside a list?


I have a list of dictionaries. Some dictionaries share the same elements.

data = [
                 {'New_PCP_Name': 'Jack0',
                  'Member_Name': 'Jack0, 0',
                  'Member_ID': '111',
                  'DOB': '111',
                  'PCP_ID':'111'
                  },
                 {'New_PCP_Name': 'Jack0',
                  'Member_Name': 'Jack00, 00',
                  'Member_ID': '222',
                  'DOB': '222',
                  'PCP_ID':'111'
                  },
                    {'New_PCP_Name': 'Jack1',
                     'Member_Name': 'Jack1, 1',
                     'Member_ID': '333',
                     'DOB': '333',
                     'PCP_ID':'333'
                     },
                    {'New_PCP_Name': 'Jack2',
                     'Member_Name': 'Jack2, 2',
                     'Member_ID': '444',
                     'DOB': '444',
                     'PCP_ID':'444'
                     }
                 ]

I need to combine them in a specific format. The first 2 dicts share the same element 'New_PCP_Name': 'Jack0'. So I want to combine them. The final product is as below. This format has to be exactly like this because I need to use this format in a mail merge from the data imported from Excel.

data = [
                 {'New_PCP_Name': 'Jack0',
                  'PCP_ID':'111',
                  'Member_Name':[{'Member_Name':'Jack0, 0','Member_ID':'111','DOB':'111'},
                               {'Member_Name': 'Jack00, 00', 'Member_ID': '222', 'DOB': '222'}] 
                  },
                    {'New_PCP_Name': 'Jack1',
                     'Member_Name': 'Jack1, 1',
                     'Member_ID': '333',
                     'DOB': '333',
                     'PCP_ID':'333'
                     },
                    {'New_PCP_Name': 'Jack2',
                     'Member_Name': 'Jack2, 2',
                     'Member_ID': '444',
                     'DOB': '444',
                     'PCP_ID':'444'
                     }
                 ]

I'm new in python.I have tried breaking up the list, modifying the dictionaries and putting them back together as below. That didn't work out. Please help me figure out how to reformat the list of dicts. Or is there a way to import data from Excel in that specific format for the rows that share the same cell values?

data2=[]
for x in range (0,len(data),1):
    print(x)
    print(data[x])
    a = data[x]
    print(a['New_PCP_Name'])
    if x+1<=len(data):
        if data[x]['New_PCP_Name'] == data[x+1]['New_PCP_Name']:
            print('yes')
            data2.append(data[x])
        else:
            print('no')

print('data2=', data2)

Solution

  • One possibility is using itertools.groupby (doc):

    data = [
                     {'New_PCP_Name': 'Jack0',
                      'Member_Name': 'Jack0, 0',
                      'Member_ID': '111',
                      'DOB': '111',
                      'PCP_ID':'111'
                      },
                     {'New_PCP_Name': 'Jack0',
                      'Member_Name': 'Jack00, 00',
                      'Member_ID': '222',
                      'DOB': '222',
                      'PCP_ID':'111'
                      },
                        {'New_PCP_Name': 'Jack1',
                         'Member_Name': 'Jack1, 1',
                         'Member_ID': '333',
                         'DOB': '333',
                         'PCP_ID':'333'
                         },
                        {'New_PCP_Name': 'Jack2',
                         'Member_Name': 'Jack2, 2',
                         'Member_ID': '444',
                         'DOB': '444',
                         'PCP_ID':'444'
                         }
                     ]
    
    from itertools import groupby
    
    out = []
    for v, g in groupby(sorted(data, key=lambda k: k['New_PCP_Name']), lambda k: (k['New_PCP_Name'], k['PCP_ID'])):
        l = [*g]
        if len(l) == 1:
            out.append(l[0])
        else:
            pcp_id = None
            for i in l:
                del i['New_PCP_Name']
                del i['PCP_ID']
            out.append({'New_PCP_Name': v[0],
                        'PCP_ID': v[1],
                      'Member_Name':l
                      })
    
    from pprint import pprint
    pprint(out)
    

    Prints:

    [{'Member_Name': [{'DOB': '111', 'Member_ID': '111', 'Member_Name': 'Jack0, 0'},
                      {'DOB': '222',
                       'Member_ID': '222',
                       'Member_Name': 'Jack00, 00'}],
      'New_PCP_Name': 'Jack0',
      'PCP_ID': '111'},
     {'DOB': '333',
      'Member_ID': '333',
      'Member_Name': 'Jack1, 1',
      'New_PCP_Name': 'Jack1',
      'PCP_ID': '333'},
     {'DOB': '444',
      'Member_ID': '444',
      'Member_Name': 'Jack2, 2',
      'New_PCP_Name': 'Jack2',
      'PCP_ID': '444'}]