Search code examples
pythondeep-copyshallow-copy

Not able to understand shallow copy


>>> a = [1,2,3]
>>> b = a[:]
>>> id(a[0]) == id(b[0])
True
>>> b[0] = 99
>>> id(a[0]) == id(b[0])
False

I understand that to make a shallow copy we can use slicing and also that there is a copy module. But why does writing to the index of "b" change the id.


Solution

  • Part 1: Why did calling b[0] = 99 cause a[0] == b[0] >> False?

    The answer to your question is that when you do b[0] = 99, you do not "modify the memory address that one of B's fields point to", you actually change where one of B's fields point.

    a = [1,2,3]  
    

    Now a contains a reference to a list object, which in turn contains three references to three int objects.

    b = a[:]
    

    now b refers to a list object (not the same one as a refers to), and contains three references to three int objects, the same ones that a refers to.

    id(a) == id(b)
    #False
    

    False, because a and b are references to different list objects.

    id(a[0]) == id(b[0])
    #True
    

    True, because a[0] is a reference to the 1 object, and so is b[0]. There is only one 1 object, and it is referred to by a[0] and b[0]

    b[0] = 99  
    # ^^^^^^ THIS IS NOT WHAT THE WIKIPEDIA ARTICLE IS DESCRIBING
    

    b still refers to the same old list object, but b[0] now refers to the 99 object. You have simply replaced one of the references in the list referred to by b with a new reference to a different object. You have not modified any objects that a points to!

    id(a[0]) == id(b[0]) 
    #False
    

    False, because the 1 object is not the same object as the 99 object.


    Part 2: What the heck is that wikipedia article talking about, then?

    Here's an example of a copy operation where you are actually "modifying the memory address that one of B's fields point to", so you can see the subsequent change in the copied object.

    a = [[1],[2],[3]]
    

    a is a reference to a list object, but now the list object contains three references to list objects, which each contain a reference to an int object.

    b = a[:]
    

    As before, you've made b refer to a new, different list object, which refers to the SAME three list objects as are referred to in a. Here's proof:

    id(a) == id(b)
    # False
    

    False, because as before a and b are references to different list objects.

    id(a[0]) == id(b[0]) 
    #True
    

    True, because as before, both a[0] and b[0] refer to the same object. This time it's a list object, which is not immutable, unlike an int object - we can actually change the contents of the list object! Here's where it gets different:

    b[0][0] = 99  
    

    We did it - we changed the contents of the list object referred to by b[0]

    a[0][0]
    # 99 !!!!!!!!!!!!!!  Wikipedia doesn't need to be edited!
    

    See? The 'list' referred to by the list referred to by a now refers to the 99 object, rather than the 1 object, because it's the same list that you accessed with b[0][0] = 99. Simple, huh ;)

    id(a[0]) == id(b[0])
    #True !!!
    

    True, because although we changed the contents of what a[0] referenced, we did not change the reference itself in a[0] and b[0] - this is closer to what the Wikipedia article is describing in the 'Shallow Copy' section.