I am trying to normalize a dictionary containing some lists. As an MVCE (Minimal, Verifiable, Complete Example), consider the following dictionary:
test_dict = {
    'name' : 'john',
    'age' : 20,
    'addresses' : [
        {
            'street': 'XXX',
            'number': 123,
            'complement' : [
                'HOUSE',
                'NEAR MARKET'
            ]
        },
        {
            'street': 'YYY',
            'number': 456,
            'complement' : [
                'AP',
                'NEAR PARK'
            ]
        },
    ],
    'phones' : [
        '123456'
    ],
    'gender' : 'MASC'
}
I want each list found in the dictionary to generate a line, so my desired output is:
{'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'HOUSE', 'phones': '123456', 'gender' : 'MASC'}
{'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'NEAR MARKET', 'phones': '123456', 'gender' : 'MASC'}
{'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'AP', 'phones': '123456', 'gender' : 'MASC'}
{'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'NEAR PARK', 'phones': '123456', 'gender' : 'MASC'}
However, when I run my code, I am not able to iterate over more than one list. My intention was to develop a recursive function, so that I wouldn't have to worry about more complex structures (a dictionary with more lists inside dictionaries, etc.). The output I actually get is:
{'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'HOUSE'}
{'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'NEAR MARKET'}
{'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'AP'}
{'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'NEAR PARK'}
{'name': 'john', 'age': 20, 'phones': '123456'}
My Python code (MVCE):
def get_list_values(lista, dicionario, key_name, results):
    if len(lista) > 0:
        for l in lista:
            if isinstance(l, dict):
                search_values(l, dicionario.copy(), results)
            else:
                dicionario_metodo = dicionario.copy()
                dicionario_metodo[key_name] = l
                results.append(dicionario_metodo)

def search_values(dicionario, test, results):
    for k, v in dicionario.items():
        if isinstance(v, list):
            get_list_values(v, test, k, results)
        else:
            test[k] = v
    if not any(isinstance(v, list) for v in dicionario.values()):
        results.append(test.copy())
    return results

test = {}
results = []
for r in search_values(test_dict, test, results):
    print(r)
In which part of my recursion am I going wrong, so it doesn't generate my desired output?
Edit 1: a second input, which also has a nested dictionary ('type') containing a list:
test_dict = {
    'name' : 'john',
    'age' : 20,
    'addresses' : [
        {
            'street': 'XXX',
            'number': 123,
            'complement' : [
                'HOUSE',
                'NEAR MARKET'
            ]
        },
        {
            'street': 'YYY',
            'number': 456,
            'complement' : [
                'AP',
                'NEAR PARK'
            ]
        },
    ],
    'type' : {
        'category': 'G123',
        'products': [
            'test1',
            'test2'
        ]
    },
    'phones' : [
        '123456'
    ],
    'gender' : 'MASC'
}
It took me some time to get this right, but check this out.
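def flat(out, *kvs):
    match kvs:
        # No pairs left: the row is complete.
        case []: yield out
        # Empty list: skip this key and carry on.
        case (k, []), *kvs: yield from flat(out, *kvs)
        # List: branch once per element, keeping the remaining pairs.
        case (k, list(l)), *kvs:
            for v in l: yield from flat(out, (k, v), *kvs)
        # Dict: splice its items in front of the remaining pairs.
        case (_, dict(d)), *kvs: yield from flat(out, *d.items(), *kvs)
        # Scalar: append the pair to the row being built.
        case (k, v), *kvs: yield from flat([*out, (k, v)], *kvs)
        case _: raise ValueError("Invalid")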
That is all you need! This implementation makes extensive use of recursion, pattern matching (the match statement, so Python 3.10 or later is required) and generators.
You can try it out like this:
x = map(dict, flat([], (..., test_dict)))
print(*x, sep='\n')
# {'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'HOUSE', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'NEAR MARKET', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'AP', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'NEAR PARK', 'phones': '123456', 'gender': 'MASC'}
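As for where the recursion in the question goes wrong: get_list_values appends a row to results as soon as it reaches a scalar list element, before the keys that come after that list ('phones', 'gender') have been seen, and each list is expanded on its own, so 'phones' ends up in a separate row. flat avoids this by passing the not-yet-processed pairs (*kvs) down every branch. Below is a minimal sketch of the same idea without match, for interpreters older than 3.10. flat_rows and the sample record are hypothetical, made up for illustration, and not part of the code above:

```python
def flat_rows(pairs, row=()):
    """Yield one list of (key, value) pairs per output row.

    `pairs` holds the pairs still to be processed; `row` holds the
    scalar pairs collected so far on the current branch.
    """
    if not pairs:
        # Nothing left to process: the row is complete.
        yield list(row)
        return
    (key, value), rest = pairs[0], pairs[1:]
    if isinstance(value, dict):
        # Splice the nested dict's items in front of the remaining pairs.
        yield from flat_rows([*value.items(), *rest], row)
    elif isinstance(value, list):
        if not value:
            # Empty list: drop the key and carry on with the rest.
            yield from flat_rows(rest, row)
        for item in value:
            # One branch per element; `rest` travels with every branch,
            # so later keys still reach every row.
            yield from flat_rows([(key, item), *rest], row)
    else:
        # Scalar: add the pair to the row and move on.
        yield from flat_rows(rest, [*row, (key, value)])


# Hypothetical sample data, smaller than the dictionary in the question.
record = {'name': 'john', 'tags': ['a', 'b'], 'phones': ['1'], 'gender': 'MASC'}
rows = [dict(r) for r in flat_rows(list(record.items()))]
for r in rows:
    print(r)
# {'name': 'john', 'tags': 'a', 'phones': '1', 'gender': 'MASC'}
# {'name': 'john', 'tags': 'b', 'phones': '1', 'gender': 'MASC'}
```

Note that 'phones' and 'gender' appear in both rows, because they were still in rest when the 'tags' list was expanded.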
With your second input data, the result would be as below:
# {'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'HOUSE', 'category': 'G123', 'products': 'test1', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'HOUSE', 'category': 'G123', 'products': 'test2', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'NEAR MARKET', 'category': 'G123', 'products': 'test1', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'XXX', 'number': 123, 'complement': 'NEAR MARKET', 'category': 'G123', 'products': 'test2', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'AP', 'category': 'G123', 'products': 'test1', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'AP', 'category': 'G123', 'products': 'test2', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'NEAR PARK', 'category': 'G123', 'products': 'test1', 'phones': '123456', 'gender': 'MASC'}
# {'name': 'john', 'age': 20, 'street': 'YYY', 'number': 456, 'complement': 'NEAR PARK', 'category': 'G123', 'products': 'test2', 'phones': '123456', 'gender': 'MASC'}
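The eight rows follow from how the lists multiply: the two addresses contribute two complements each (four address/complement combinations), and each of those is repeated for the two products and the single phone, so 4 x 2 x 1 = 8. For a record with only one level of lists, the same multiplication can be sketched with itertools.product. The record below is made-up illustration data, and this shortcut does not handle nested dictionaries or lists of dictionaries:

```python
from itertools import product

# Made-up, single-level record for illustration (not the data from the question).
record = {'name': 'john', 'products': ['test1', 'test2'], 'phones': ['111', '222']}

# Split the record into list-valued keys and plain scalar values.
list_keys = [k for k, v in record.items() if isinstance(v, list)]
scalars = {k: v for k, v in record.items() if not isinstance(v, list)}

# One row per combination of list elements; scalars are copied into every row.
rows = [
    {**scalars, **dict(zip(list_keys, combo))}
    for combo in product(*(record[k] for k in list_keys))
]
for row in rows:
    print(row)
# {'name': 'john', 'products': 'test1', 'phones': '111'}
# {'name': 'john', 'products': 'test1', 'phones': '222'}
# {'name': 'john', 'products': 'test2', 'phones': '111'}
# {'name': 'john', 'products': 'test2', 'phones': '222'}
```

Unlike flat, this puts the scalar keys first instead of preserving the original key order.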
Edit: Mapped the key-value pairs into dicts as per the requirements and made the code neater.