python sorting defaultdict

Sorting nested defaultdicts in Python


I'm nesting two levels of defaultdicts. Each inner dict contains a number of fields, and I would like to sort the inner dicts by one of those fields and delete the entries with the lowest values.

Here's a simplified code example:

from collections import defaultdict

sampleDict = defaultdict(lambda: defaultdict(lambda: defaultdict(str)))

sampleDict['keyA']['keyB']['size'] = 1000
sampleDict['keyA']['keyC']['size'] = 500
sampleDict['keyA']['keyD']['size'] = 750
sampleDict['keyA']['keyE']['size'] = 250
sampleDict['keyA']['keyB']['desc'] = 'some data'
sampleDict['keyA']['keyC']['desc'] = 'some more data'
sampleDict['keyA']['keyD']['desc'] = 'different data'
sampleDict['keyA']['keyE']['desc'] = 'other data'

In this case, I want to sort the entries, identify that ['keyA']['keyB'] has the highest size and ['keyA']['keyD'] the second highest, and then remove ['keyA']['keyC'] and ['keyA']['keyE'].

The dict is nested because I will then loop through the other entries in the outer dict.


Solution

  • Try this:

    >>> import operator
    >>> from functools import reduce  # reduce is not a builtin in Python 3
    >>> sorted(
    ...     reduce(operator.add, 
    ...     [[(k, k1, sampleDict[k][k1]['size']) for k1 in v.keys()]
    ...              for k,v in sampleDict.items()]
    ...     ),
    ...     key=lambda x: x[2], reverse=True)
    [('keyA', 'keyB', 1000), ('keyA', 'keyD', 750), ('keyA', 'keyC', 500), ('keyA', 'keyE', 250)]
    

    The reduce call concatenates the inner lists, turning a nested list such as [[a],[b,c],[d]] into the flat list [a,b,c,d].

    The key argument tells sorted to order the tuples by the element at index 2 (counting from zero) of each (k, k1, val) tuple, i.e. val.

    The reverse argument orders the list in descending order.
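
The sorted list above only ranks the entries; it does not remove anything from the dict. The removal step the question asks for could be sketched as below. This is one possible approach, not part of the original answer: the helper name prune_to_largest and its keep and field parameters are made up for illustration, and the sample dict is built with plain inner dicts for brevity.

```python
from collections import defaultdict


def prune_to_largest(nested, keep=2, field='size'):
    """Keep only the `keep` inner entries with the largest `field` under each outer key.

    Hypothetical helper for illustration; it modifies `nested` in place.
    """
    for inner in nested.values():
        # Rank the inner keys by the chosen field, largest first.
        # sorted() returns a new list, so deleting from `inner` below is safe.
        ranked = sorted(inner, key=lambda k: inner[k][field], reverse=True)
        # Delete every entry that falls outside the top `keep`.
        for k in ranked[keep:]:
            del inner[k]
    return nested


# Same data as in the question, written more compactly.
sampleDict = defaultdict(lambda: defaultdict(dict))
sampleDict['keyA']['keyB'] = {'size': 1000, 'desc': 'some data'}
sampleDict['keyA']['keyC'] = {'size': 500, 'desc': 'some more data'}
sampleDict['keyA']['keyD'] = {'size': 750, 'desc': 'different data'}
sampleDict['keyA']['keyE'] = {'size': 250, 'desc': 'other data'}

prune_to_largest(sampleDict, keep=2)
print(sorted(sampleDict['keyA']))  # ['keyB', 'keyD']
```

Because the ranking is done per outer key, the same call handles any further outer entries ('keyF', 'keyG', and so on) without flattening the whole structure first.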