python python-3.x dictionary key dictionary-comprehension

Why does assigning values to keys in identical, but differently created, nested dictionaries lead to different results?

Code snippets are worth a thousand words, so here they are:

# Creation of data_dict1
data_types = ('abnormal', 'normal')
data_dict1 = dict.fromkeys(data_types, {0: {}, 1: {}})
# Creation of data_dict2
data_dict2 = {'abnormal': {0: {}, 1: {}}, 'normal': {0: {}, 1: {}}}

# Check
print(data_dict1)
print(data_dict2)
print(data_dict1 == data_dict2)

>>> {'abnormal': {0: {}, 1: {}}, 'normal': {0: {}, 1: {}}}
>>> {'abnormal': {0: {}, 1: {}}, 'normal': {0: {}, 1: {}}}
>>> True

As you can see, the nested dictionaries data_dict1 and data_dict2 are identical regardless of their methods of creation. But when I assign values in the same way, I get different results:

data_dict1['abnormal'][0]['result'] = 'abnormal_0'
data_dict1['abnormal'][1]['result'] = 'abnormal_1'
data_dict1['normal'][0]['result'] = 'normal_0'
data_dict1['normal'][1]['result'] = 'normal_1'

data_dict2['abnormal'][0]['result'] = 'abnormal_0'
data_dict2['abnormal'][1]['result'] = 'abnormal_1'
data_dict2['normal'][0]['result'] = 'normal_0'
data_dict2['normal'][1]['result'] = 'normal_1'

# Check
print(data_dict1)
print(data_dict2)
print(data_dict1 == data_dict2)

>>> {'abnormal': {0: {'result': 'normal_0'}, 1: {'result': 'normal_1'}}, 'normal': {0: {'result': 'normal_0'}, 1: {'result': 'normal_1'}}}
>>> {'abnormal': {0: {'result': 'abnormal_0'}, 1: {'result': 'abnormal_1'}}, 'normal': {0: {'result': 'normal_0'}, 1: {'result': 'normal_1'}}}
>>> False

The values for data_dict1['abnormal'][0]['result'] and data_dict1['abnormal'][1]['result'] are 'normal_0' and 'normal_1', respectively, and not 'abnormal_0' and 'abnormal_1' as they should be. Why is this the case?

Solution

This happens because of the way you initialize data_dict1:

data_dict1 = dict.fromkeys(data_types, {0: {}, 1: {}})

When you do this, both of the keys in data_dict1 are set to the same nested dictionary. This means that after these lines:

data_dict1['abnormal'][0]['result'] = 'abnormal_0'
data_dict1['abnormal'][1]['result'] = 'abnormal_1'

The values under both keys, 'abnormal' and 'normal', have changed, because they are one and the same dictionary. If we check data_dict1 at this point, we see:

>>> data_dict1
{'abnormal': {0: {'result': 'abnormal_0'}, 1: {'result': 'abnormal_1'}},
 'normal': {0: {'result': 'abnormal_0'}, 1: {'result': 'abnormal_1'}}}

When we then assign through the 'normal' key, the same thing happens: the shared dictionary is overwritten again, which gives the result you found.

We can check that both nested dictionaries are actually the same dictionary in memory by using is:

>>> data_dict1['abnormal'] is data_dict1['normal']
True
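
The same sharing can be reproduced with a smaller case. This is only a minimal sketch (the name shared and the keys 'a' and 'b' are made up for illustration), but it shows that dict.fromkeys stores one value object under every key, so mutating it through one key is visible through the other:

```python
# dict.fromkeys evaluates the value argument once and reuses
# that single object for every key.
shared = dict.fromkeys(('a', 'b'), [])

# Mutate the list through key 'a' only.
shared['a'].append(1)

print(shared)                      # {'a': [1], 'b': [1]}
print(shared['a'] is shared['b'])  # True
```

With an immutable value such as None or 0 this sharing is harmless, which is why dict.fromkeys is usually only a problem with mutable defaults like lists and dictionaries.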

You initialized data_dict2 by assigning a separate nested dictionary to each key, and we can confirm this in the same way as above:

>>> data_dict2['abnormal'] is data_dict2['normal']
False

We can avoid this behaviour without having to type everything out as you did for data_dict2. One way to do this is to use a dictionary comprehension, which evaluates the nested dictionary literal once per key:

>>> data_types = ('abnormal', 'normal')
>>> data_dict1 = {k: {0: {}, 1: {}} for k in data_types}
>>> data_dict1['abnormal'] is data_dict1['normal']
False
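
If the nested structure already exists as an object rather than a literal, another option is to copy it for each key. The following is a sketch under that assumption; template and data_dict3 are hypothetical names, not part of your code. Note that a shallow copy (template.copy()) would probably not be enough here, because the inner {} dictionaries would still be shared between keys, so copy.deepcopy is used instead:

```python
import copy

data_types = ('abnormal', 'normal')
template = {0: {}, 1: {}}  # hypothetical pre-built nested structure

# deepcopy gives every key its own independent copy, including the inner dicts.
data_dict3 = {k: copy.deepcopy(template) for k in data_types}

data_dict3['abnormal'][0]['result'] = 'abnormal_0'
data_dict3['normal'][0]['result'] = 'normal_0'

print(data_dict3['abnormal'] is data_dict3['normal'])  # False
print(data_dict3['abnormal'][0])                       # {'result': 'abnormal_0'}
print(data_dict3['normal'][0])                         # {'result': 'normal_0'}
```

The template itself is left untouched, so it can be reused to build further dictionaries.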