Tags: python, data-structures, union, defaultdict

Merging nested defaultdicts


I have this:

from collections import defaultdict

dict1 = defaultdict(lambda:defaultdict(list))
dict1['rl1']['sh1'] = ['a','b']
dict1['rl1']['sh2'] = ['c','d']
dict1['rl2']['sh1'] = ['c','d']

dict2 = defaultdict(lambda:defaultdict(list))
dict2['rl1']['sh1'] = ['p','q']
dict2['rl1']['sh3'] = ['r','s']
dict2['rl3']['sh1'] = ['r','s']

I want the union of the two defaultdicts. This would be the result:

uniondict = defaultdict(lambda:defaultdict(list))
uniondict['rl1']['sh1'] = ['a','b','p','q']
uniondict['rl1']['sh2'] = ['c','d']
uniondict['rl1']['sh3'] = ['r','s']
uniondict['rl2']['sh1'] = ['c','d']
uniondict['rl3']['sh1'] = ['r','s']

I'm not sure how to obtain this result. I've tried using dict1.items() and dict2.items(), and the update function, but I must be missing something, because I can't manage to get the "union" of the defaultdicts.


Solution

  • slightly more 'elegant':

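    uniondict = defaultdict(lambda:defaultdict(list))
    # list(...) is needed on Python 3, where items() returns a view that does not support +
    for k1, v1 in list(dict1.items()) + list(dict2.items()):
        for k2, v2 in v1.items():
            uniondict[k1][k2] += v2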
    

    for a more memory-efficient solution:

    from itertools import chain
    uniondict = defaultdict(lambda:defaultdict(list))
    # Python 3: items() returns a lazy view; on Python 2 use iteritems() instead
    for k1, v1 in chain(dict1.items(), dict2.items()):
        for k2, v2 in v1.items():
            uniondict[k1][k2] += v2
    

    This iterates over both dictionaries lazily instead of building a temporary concatenated list in memory.

    See the itertools.chain documentation in the Python standard library reference.
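
To check the chained loop against the question's data, here is a self-contained sketch written for Python 3, where dictionaries are iterated with `items()`. The helper names `nested` and `union_two` are made up for this illustration. It also shows why `update` on its own falls short: it replaces the whole inner dictionary under a shared outer key instead of merging it.

```python
from collections import defaultdict
from itertools import chain


def nested():
    """Build the two-level defaultdict used in the question."""
    return defaultdict(lambda: defaultdict(list))


def union_two(a, b):
    """Hypothetical helper: merge two nested dicts, concatenating inner lists."""
    result = nested()
    for k1, inner in chain(a.items(), b.items()):
        for k2, values in inner.items():
            # result[k1][k2] starts as a fresh empty list, so += extends that
            # new list and leaves the lists inside a and b untouched.
            result[k1][k2] += values
    return result


dict1 = nested()
dict1['rl1']['sh1'] = ['a', 'b']
dict1['rl1']['sh2'] = ['c', 'd']
dict1['rl2']['sh1'] = ['c', 'd']

dict2 = nested()
dict2['rl1']['sh1'] = ['p', 'q']
dict2['rl1']['sh3'] = ['r', 's']
dict2['rl3']['sh1'] = ['r', 's']

uniondict = union_two(dict1, dict2)
print(uniondict['rl1']['sh1'])   # ['a', 'b', 'p', 'q']
print(sorted(uniondict['rl1']))  # ['sh1', 'sh2', 'sh3']

# For contrast: update() swaps in dict2's inner dict for 'rl1' wholesale,
# so 'sh2' disappears and 'sh1' is not concatenated.
overwritten = dict(dict1)
overwritten.update(dict2)
print(sorted(overwritten['rl1']))  # ['sh1', 'sh3']
```

Because every inner list in the result is created fresh by the defaultdict, the merge does not modify dict1 or dict2.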


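    For posterity, here is a generic function that merges any number of two-level nested dictionaries (defaultdict or not). Because each inner value starts out as an empty list, the values being merged must be lists: any other iterable is appended element by element into a list, and a non-iterable such as an int raises a TypeError.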

    from collections import defaultdict
    from itertools import chain
    
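    def merge_nested_dicts(dict_list):
        """Merge two-level dicts, concatenating the list values under each inner key."""
        uniondict = defaultdict(lambda:defaultdict(list))
        # Python 3: items() replaces iteritems(); chain.from_iterable avoids unpacking a temporary list
        for k1, v1 in chain.from_iterable(d.items() for d in dict_list):
            for k2, v2 in v1.items():
                uniondict[k1][k2] += v2
        return uniondict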
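
The function above always starts each inner value as an empty list, so it only suits list values. One possible generalisation, offered as a sketch and not as part of the original answer, takes the starting value's factory as an argument. The name `merge_nested` and the `factory` parameter are assumptions made for this example, and the code targets Python 3.

```python
from collections import defaultdict
from itertools import chain


def merge_nested(dicts, factory=list):
    """Merge two-level dicts; inner values are combined with +=.

    factory (hypothetical parameter) builds the starting value for each
    inner key, e.g. list for concatenation or int for summing.
    """
    merged = defaultdict(lambda: defaultdict(factory))
    for k1, inner in chain.from_iterable(d.items() for d in dicts):
        for k2, value in inner.items():
            merged[k1][k2] += value
    return merged


# Summing integer counts instead of concatenating lists.
counts_a = {'rl1': {'sh1': 2, 'sh2': 1}}
counts_b = {'rl1': {'sh1': 5}, 'rl2': {'sh1': 3}}
totals = merge_nested([counts_a, counts_b], factory=int)
print(totals['rl1']['sh1'])  # 7
print(totals['rl2']['sh1'])  # 3
```

Sets do not support `+`, so merging set values would need `|=` in place of `+=` in the inner loop.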