Tags: python, deep-copy, shallow-copy

Shallow and deep copies for lists and dictionaries


I am trying to better understand shallow and deep copies in Python, especially when dictionaries and lists are involved.

Let's say I have a list of dictionaries and I want to copy the value of a specific key ('a') from each one into a different list of dictionaries under a different key name ('x'):

list_dict = [{'a': 1}, {'a': 2}]

dict1 = {}
dict1['x'] = {}
list1 = []

for entry in list_dict:
    dict1['x'] = entry['a']
    list1.append(dict1)

>>> print(list1)
[{'x': 2}, {'x': 2}]

Obviously, not what I wanted. However, I know that I can specify .copy() for a dictionary to create a shallow copy:

for entry in list_dict:
    dict1['x'] = entry['a']
    list1.append(dict1.copy())

>>> print(list1)
[{'x': 1}, {'x': 2}]

Even though it's shallow, it works. Now, let's make it a bit different - I want to copy it not to dict1['x'], but to dict1['x']['y']:

for entry in list_dict:
    dict1['x']['y'] = entry['a']
    list1.append(dict1.copy())

>>> print(list1)
[{'x': {'y': 2}}, {'x': {'y': 2}}]

Back to square one - it doesn't work! So, this is the first question - why did it stop working?

And the second question is, why does adding the last line make it work?

for entry in list_dict:
    dict1['x']['y'] = entry['a']
    list1.append(dict1.copy())
    dict1['x'] = {}

>>> print(list1)
[{'x': {'y': 1}}, {'x': {'y': 2}}]

Thank you very much in advance!

P.S. I know that I can do import copy and then use copy.deepcopy(). However, I'm interested in learning why the shallow copy stops working when I add one more level of dictionary, and why the workaround of "resetting" the dictionary works.
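
For reference, the copy.deepcopy() route mentioned in the P.S. would look roughly like this. It is only a sketch that reuses the names from the snippets above; deepcopy duplicates the nested dictionary as well, so every appended entry is fully independent:

```python
import copy

list_dict = [{'a': 1}, {'a': 2}]
dict1 = {'x': {}}
list1 = []

for entry in list_dict:
    dict1['x']['y'] = entry['a']
    # deepcopy copies the outer dict AND the inner dict it refers to
    list1.append(copy.deepcopy(dict1))

print(list1)  # [{'x': {'y': 1}}, {'x': {'y': 2}}]
```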


Solution

  • for entry in list_dict:
        dict1['x']['y'] = entry['a']
        list1.append(dict1.copy())
    
    >>> print(list1)
    [{'x': {'y': 2}}, {'x': {'y': 2}}]
    

    Back to square one - it doesn't work! So, this is the first question - why did it stop working?

    Because you now have three dictionaries in play:

    1. {'x': {'y': 2}}

    2. {'x': {'y': 2}}

    3. {'y': 2}

      Only two of them (1 and 2) are actually copies, because .copy() duplicates only the outer dictionary. Both still reference the same inner dictionary (3), so any change to 3 shows up in both 1 and 2.
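
A minimal sketch of that sharing, reusing dict1 from the question. The names first and second are introduced here purely for illustration:

```python
dict1 = {'x': {}}

dict1['x']['y'] = 1
first = dict1.copy()    # new outer dict, same inner dict

dict1['x']['y'] = 2
second = dict1.copy()   # another outer dict, still the same inner dict

# The outer dictionaries are distinct objects...
print(first is second)            # False
# ...but their 'x' values are one and the same object.
print(first['x'] is second['x'])  # True
print(first)                      # {'x': {'y': 2}}
```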


    And the second question is, why does adding the last line make it work?

    for entry in list_dict:
        dict1['x']['y'] = entry['a']
        list1.append(dict1.copy())
        dict1['x'] = {}
    
    >>> print(list1)
    [{'x': {'y': 1}}, {'x': {'y': 2}}]
    

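    Because you are now creating a new inner dictionary on each iteration (using {}), and the next iteration modifies that new one. The first inner dictionary was created in your initialization code. Assigning dict1['x'] = {} only rebinds the key in dict1; it does not change the references held by the dictionaries already in the list, so they keep their old inner dictionaries. Only the copy about to be appended gets the new one.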
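
The same loop, sketched with an identity check to show that each appended copy ends up with its own inner dictionary. The list2 variant is an assumption about what the loop is meant to achieve, not part of the original code: it builds a fresh dictionary per entry, so no copying is needed at all.

```python
list_dict = [{'a': 1}, {'a': 2}]
dict1 = {'x': {}}
list1 = []

for entry in list_dict:
    dict1['x']['y'] = entry['a']
    list1.append(dict1.copy())
    # Rebind 'x' to a brand-new dict; the copy just appended keeps the old one.
    dict1['x'] = {}

print(list1)                           # [{'x': {'y': 1}}, {'x': {'y': 2}}]
print(list1[0]['x'] is list1[1]['x'])  # False

# Hypothetical alternative: create a new nested dict for every entry.
list2 = [{'x': {'y': entry['a']}} for entry in list_dict]
print(list2 == list1)                  # True
```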

    Future tip: The best way to follow code like this is either to run it in a debugger and step through the program, or to do what lazy people do and simply print the objects during the iterations to see what their values are.
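
One possible way to do the "print during the iterations" approach is to print the built-in id() of each inner dictionary next to the list. This is only a sketch on the failing loop from the question; equal ids mean the entries share one inner dictionary. The actual id values differ from run to run, so none are shown here.

```python
list_dict = [{'a': 1}, {'a': 2}]
dict1 = {'x': {}}
list1 = []

for entry in list_dict:
    dict1['x']['y'] = entry['a']
    list1.append(dict1.copy())
    # id() identifies an object; identical ids here reveal the shared inner dict.
    print(list1, [id(d['x']) for d in list1])
```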