python, list, nested-lists, mutable

Generating sublists using multiplication ( * ) unexpected behavior


I'm sure this has been answered somewhere but I wasn't sure how to describe it.

Let's say I want to create a list containing 3 empty lists, like so:

lst = [[], [], []]

I thought I was being all clever by doing this:

lst = [[]] * 3

But I discovered, after debugging some weird behavior, that an append to one sublist, say lst[0].append(3), showed up in every sublist, making the list [[3], [3], [3]] rather than [[3], [], []].

However, if I initialize the list with

lst = [[] for i in range(3)]

then doing lst[1].append(5) gives the expected [[], [5], []].
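
The two initializations described above can be compared side by side in a short script. This is only a minimal sketch of the behavior in question; the names `shared` and `independent` are illustrative and do not appear in the original post.

```python
# Illustrative names: `shared` and `independent` are not from the original post.
shared = [[]] * 3                      # one inner list, referenced three times
independent = [[] for _ in range(3)]   # three separate inner lists

shared[0].append(3)        # visible through every slot of `shared`
independent[1].append(5)   # visible only in slot 1 of `independent`

print(shared)       # [[3], [3], [3]]
print(independent)  # [[], [5], []]
```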

My question is: why does this happen? It is interesting to note that if I do

lst = [[]]*3
lst[0] = [5]
lst[0].append(3)

then the 'linkage' of cell 0 is broken and I get [[5,3],[],[]], but lst[1].append(0) still causes [[5,3],[0],[0]].

My best guess is that using multiplication in the form [[]]*x causes Python to store a reference to a single cell...?


Solution

  • My best guess is that using multiplication in the form [[]] * x causes Python to store a reference to a single cell...?

    Yes. And you can test this yourself

    >>> lst = [[]] * 3
    >>> print([id(x) for x in lst])
    [11124864, 11124864, 11124864]
    

    This shows that all three references refer to the same object. And note that it really makes perfect sense that this happens [1]. The * operator just copies the values in the list, and in this case the values are references. That's why you see the same reference repeated three times.
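
The same check can be written without reading raw id() values, by comparing identity with `is`. This is a small sketch, not part of the original answer; `all_same` and `distinct_ids` are names chosen here for illustration.

```python
lst = [[]] * 3

# `is` compares object identity, which is what equal id() values indicate.
all_same = all(item is lst[0] for item in lst)
print(all_same)                      # True

# Three slots in the outer list, but only one distinct inner list object.
distinct_ids = {id(item) for item in lst}
print(len(lst), len(distinct_ids))   # 3 1
```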

    It is interesting to note that if I do

    lst = [[]]*3
    lst[0] = [5]
    lst[0].append(3)
    

    then the 'linkage' of cell 0 is broken and I get [[5,3],[],[]], but lst[1].append(0) still causes [[5,3],[0],[0]].

    You changed the reference that occupies lst[0]; that is, you assigned a new value to lst[0]. But you didn't change the other elements; they still refer to the same object they referred to before. And lst[1] and lst[2] still refer to exactly the same instance, so of course appending an item to lst[1] causes lst[2] to also see that change.

    This is a classic mistake people make with pointers and references. Here's a simple analogy. You have a piece of paper. On it, you write the address of someone's house. You now take that piece of paper and photocopy it twice, so you end up with three pieces of paper with the same address written on them. Now take the first piece of paper, scribble out the address written on it, and write the address of someone else's house. Did the address written on the other two pieces of paper change? No. That's exactly what your code did, though. That's why the other two items don't change.

    Further, imagine that the owner of the house whose address is still on the second piece of paper builds an add-on garage. Now I ask you, does the house whose address is on the third piece of paper have an add-on garage? Yes, it does, because it's exactly the same house as the one whose address is written on the second piece of paper. This explains everything about your second code example.
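
The analogy can be lined up with the second code example roughly as follows. This is a sketch for illustration; the variable `original` is added here only to keep a handle on the shared list and is not in the original code.

```python
lst = [[]] * 3
original = lst[1]          # hypothetical helper name: the one shared "house"

lst[0] = [5]               # rebinding: paper 0 now holds a different address
lst[0].append(3)           # changes only the new house

lst[1].append(0)           # mutation: the shared house gets its "garage"

print(lst)                 # [[5, 3], [0], [0]]
print(lst[0] is original)  # False
print(lst[1] is lst[2])    # True
```

Assignment to lst[0] replaces what the slot refers to, while append changes the object the slot already refers to; only the second kind of change is visible through the other slots.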

    [1] You didn't expect Python to invoke a "copy constructor", did you? Puke.
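
To make the footnote concrete: repetition with * never copies the inner object, so independent copies have to be requested explicitly. A minimal sketch, assuming the standard-library `copy.deepcopy` is an acceptable way to do that; the names `template`, `repeated`, and `copied` are illustrative only.

```python
import copy

template = [[]]   # illustrative name; holds the single inner list

repeated = template * 3                                  # repeats the reference, copies nothing
copied = [copy.deepcopy(template[0]) for _ in range(3)]  # three explicit, independent copies

repeated[0].append(1)
copied[0].append(1)

print(repeated)  # [[1], [1], [1]]
print(copied)    # [[1], [], []]
print(template)  # [[1]]  (the original inner list was mutated through `repeated`)
```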