Recursive method inside a class to flatten nested OrderedDicts raises RuntimeError: dictionary changed size during iteration
I'm given a list of OrderedDicts. Most of them map keys to plain string values, but some values are themselves OrderedDicts, and that nesting can go several levels deep. Here's a very simplified sample:
records = [
    OrderedDict([
        ('rec-1_field-1', 'r1f1_value'),
        ('rec-1_field-2', 'r1f2_value'),
        ('rec-1_nest-1', OrderedDict([
            ('n1_field-1', 'n1f1_value'),
            ('n1_field-2', 'n1f2_value')
            ])
        )
    ]),
    OrderedDict([
        ...
    ])
]
My aim is to un-nest these OrderedDicts so that the above is transformed into this (notice the "higherKey.lowerKey" naming I am trying to get):
flatRecords = [
    {'rec-1_field-1': 'r1f1_value',
     'rec-1_field-2': 'r1f2_value',
     'rec-1_nest-1.n1_field-1': 'n1f1_value',
     'rec-1_nest-1.n1_field-2': 'n1f2_value'},
    ...
]
Here is a simplified version of my code. I am feeding each OrderedDict to a method that recurses when it finds a nested OrderedDict. I think I am overwriting my flatRecord dict inside the recursion, but I cannot work out how to correct it.
class unNested():
    def __init__(self):
        pass

    def flatResults(self, OD):
        self.OD = OD
        self.flattenedRecords = []
        for eachRecord in self.OD:
            self.flattenedRecords.append(self.flatten(eachRecord))
        return self.flattenedRecords

    def flatten(self, record):
        self.record = record
        self.flatRecord = {}
        for eachKey in self.record:
            if isinstance(self.record[eachKey], dict):
                self.subRecord = self.flatten(self.record[eachKey])
                for eachSub in self.subRecord:
                    self.key = eachKey + '.' + eachSub
                    self.flatRecord[self.key] = self.record[eachSub]
            else:
                self.flatRecord[eachKey] = self.record[eachKey]
        return self.flatRecord
So the following snippet results in "RuntimeError: dictionary changed size during iteration"
from collections import OrderedDict

records = [
    OrderedDict([
        ('rec-1_field-1', 'r1f1_value'),
        ('rec-1_field-2', 'r1f2_value'),
        ('rec-1_nest-1', OrderedDict([
            ('rec-1_nest-1_field-1', 'r1n1f1_value'),
            ('rec-1_nest-1_field-2', 'r1n1f2_value')
            ])
        )
    ]),
    OrderedDict([
        ('rec-2_field-1', 'r2f1_value'),
        ('rec-2_field-2', 'r2f2_value'),
        ('rec-2_nest-1', OrderedDict([
            ('rec-2_nest-1_field-1', 'r2n1f1_value'),
            ('rec-2_nest-1_field-2', 'r2n1f2_value')
            ])
        )
    ])
]
crush = unNested()
crush.flatResults(records)
I'm sure it's an amateur mistake, but I'd love to hear any thoughts or guidance. Thanks!
You can use a recursive generator that yields the flattened (key, value) pairs:
from collections import OrderedDict
records = [OrderedDict([('rec-1_field-1', 'r1f1_value'), ('rec-1_field-2', 'r1f2_value'), ('rec-1_nest-1', OrderedDict([('n1_field-1', 'n1f1_value'), ('n1_field-2', 'n1f2_value')]))])]
def flatten(d, last=''):
    for a, b in d.items():
        # Build the full dotted path so keys nested several levels deep keep every ancestor.
        key = f'{last}.{a}' if last else a
        if not isinstance(b, OrderedDict):
            yield (key, b)
        else:
            yield from flatten(b, last=key)
final_result = dict(flatten(records[0]))
Output:
{'rec-1_field-1': 'r1f1_value', 'rec-1_field-2': 'r1f2_value', 'rec-1_nest-1.n1_field-1': 'n1f1_value', 'rec-1_nest-1.n1_field-2': 'n1f2_value'}
To create a flattened structure for each element in a list:
final_result = [dict(flatten(i)) for i in records]
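As for the error in the question's class: flatten stores its working state in instance attributes (self.record, self.flatRecord), so every recursive call rebinds them. When the inner call returns, self.subRecord and self.flatRecord are the same dict, and the loop over self.subRecord inserts new keys into that very dict while iterating over it, which raises the RuntimeError. Below is a minimal sketch of the same class using local variables instead. The names UnNested, flat_results and flat_records, and the extra nesting level in the sample data, are my own assumptions for illustration and are not from the question.

```python
from collections import OrderedDict


class UnNested:
    """Hypothetical rework of the question's class; names are illustrative."""

    def flat_results(self, records):
        # One flat dict per input record.
        return [self.flatten(record) for record in records]

    def flatten(self, record):
        # Local variable: each recursive call gets its own dict, so an
        # inner call cannot overwrite the outer call's state.
        flat_record = {}
        for key, value in record.items():
            if isinstance(value, dict):
                # Recurse, then prefix every returned key with the parent key.
                for sub_key, sub_value in self.flatten(value).items():
                    flat_record[key + '.' + sub_key] = sub_value
            else:
                flat_record[key] = value
        return flat_record


# Sample data with two levels of nesting (made up for this sketch).
records = [
    OrderedDict([
        ('rec-1_field-1', 'r1f1_value'),
        ('rec-1_nest-1', OrderedDict([
            ('n1_field-1', 'n1f1_value'),
            ('n1_nest-2', OrderedDict([('n2_field-1', 'n2f1_value')])),
        ])),
    ]),
]

flat_records = UnNested().flat_results(records)
print(flat_records)
```

Because the prefix is applied as each recursive call returns, keys at any depth end up with their full dotted path, for example 'rec-1_nest-1.n1_nest-2.n2_field-1'.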