python python-3.x recursion ordereddictionary

Recursive method changing dictionary during iteration


I'm trying to use a recursive method inside a class to flatten nested OrderedDicts, but it raises RuntimeError: dictionary changed size during iteration

I'm given a list of OrderedDicts. Most hold simple key: string-value pairs, but some values are themselves OrderedDicts, and that nesting can go several levels deep. Here's a very simplified sample:

records = [
    OrderedDict([
        ('rec-1_field-1', 'r1f1_value'),
        ('rec-1_field-2', 'r1f2_value'),
        ('rec-1_nest-1', OrderedDict([
            ('n1_field-1', 'n1f1_value'),
            ('n1_field-2', 'n1f2_value')
            ])
         )
        ]),
    OrderedDict([
        ...
        ])
]

My aim is to un-nest these OrderedDicts so that the sample above becomes the following (note the "higherKey.lowerKey" naming I am trying to get):

flatRecords = [
    {'rec-1_field-1':'r1f1_value',
    'rec-1_field-2':'r1f2_value',
    'rec-1_nest-1.n1_field-1':'n1f1_value',
    'rec-1_nest-1.n1_field-2':'n1f2_value'},
    ...
    ]

Here is a simplified version of my code. I feed each OrderedDict to a method that recurses whenever it finds a nested OrderedDict. I think I am overwriting my flatRecord dict inside the recursion, but I cannot work out how to correct it.

class unNested():
    def __init__(self):
        pass
    def flatResults(self, OD):
        self.OD = OD
        self.flattenedRecords = []
        for eachRecord in self.OD:
            self.flattenedRecords.append(self.flatten(eachRecord))
        return self.flattenedRecords
    def flatten(self, record):
        self.record = record
        self.flatRecord = {}
        for eachKey in self.record:
            if isinstance(self.record[eachKey], dict):
                self.subRecord = self.flatten(self.record[eachKey])
                for eachSub in self.subRecord:
                    self.key = eachKey + '.' + eachSub
                    self.flatRecord[self.key] = self.record[eachSub]
            else:
                self.flatRecord[eachKey] = self.record[eachKey]
        return self.flatRecord

So the following snippet results in "RuntimeError: dictionary changed size during iteration"

records = [
    OrderedDict([
        ('rec-1_field-1', 'r1f1_value'),
        ('rec-1_field-2', 'r1f2_value'),
        ('rec-1_nest-1', OrderedDict([
            ('rec-1_nest-1_field-1', 'r1n1f1_value'),
            ('rec-1_nest-1_field-2', 'r1n1f2_value')
            ])
         )
        ]),
    OrderedDict([
        ('rec-2_field-1', 'r2f1_value'),
        ('rec-2_field-2', 'r2f2_value'),
        ('rec-2_nest-1', OrderedDict([
            ('rec-2_nest-1_field-1', 'r2n1f1_value'),
            ('rec-2_nest-1_field-2', 'r2n1f2_value')
            ])
         )
        ])
    ]
crush = unNested()
crush.flatResults(records)

I'm sure it's an amateur mistake, but I'd love to hear any thoughts or guidance. Thanks!


Solution

  • You can use a recursive generator function that yields (dotted key, value) pairs:

    from collections import OrderedDict

    records = [OrderedDict([('rec-1_field-1', 'r1f1_value'), ('rec-1_field-2', 'r1f2_value'), ('rec-1_nest-1', OrderedDict([('n1_field-1', 'n1f1_value'), ('n1_field-2', 'n1f2_value')]))])]

    def flatten(d, last=''):
        # Yield (dotted_key, value) pairs, descending into nested dicts.
        for a, b in d.items():
            key = f'{last}.{a}' if last else a
            if isinstance(b, dict):
                # Pass the full path down so deeper levels keep every parent key.
                yield from flatten(b, last=key)
            else:
                yield (key, b)

    final_result = dict(flatten(records[0]))
    

    Output:

    {'rec-1_field-1': 'r1f1_value', 'rec-1_field-2': 'r1f2_value', 'rec-1_nest-1.n1_field-1': 'n1f1_value', 'rec-1_nest-1.n1_field-2': 'n1f2_value'}
    

    To flatten every record in the list:

    final_result = [dict(flatten(i)) for i in records]
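
As for why the class in the question fails: flatten stores record and flatRecord on self, so every recursive call rebinds the same two attributes. When the inner call returns, self.subRecord and self.flatRecord refer to the same dict, and the loop over self.subRecord then adds keys to that very dict, which raises the RuntimeError. Below is a minimal sketch of the same class-based approach with local variables instead, so each call owns its own dict. The names UnNested and flat_results, and the third nesting level in the sample data, are illustrative assumptions rather than part of the original code.

```python
from collections import OrderedDict


class UnNested:
    """Flattens nested dicts without storing per-call state on the instance."""

    def flat_results(self, records):
        # One flat dict per input record.
        return [self.flatten(record) for record in records]

    def flatten(self, record):
        # flat_record is local, so each recursive call gets its own dict.
        flat_record = {}
        for key, value in record.items():
            if isinstance(value, dict):
                # sub_record is a separate, finished dict, so iterating it
                # is safe while keys are added to flat_record.
                sub_record = self.flatten(value)
                for sub_key, sub_value in sub_record.items():
                    flat_record[key + '.' + sub_key] = sub_value
            else:
                flat_record[key] = value
        return flat_record


records = [
    OrderedDict([
        ('rec-1_field-1', 'r1f1_value'),
        ('rec-1_nest-1', OrderedDict([
            ('n1_field-1', 'n1f1_value'),
            # Hypothetical third level, added only to exercise deeper nesting.
            ('n1_nest-2', OrderedDict([('n2_field-1', 'n2f1_value')])),
        ])),
    ]),
]

flat_records = UnNested().flat_results(records)
print(flat_records)
```

Because each level prefixes the keys returned from the level below it, a value nested three levels deep ends up under a key such as 'rec-1_nest-1.n1_nest-2.n2_field-1'.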