python arrays list pass-by-reference pass-by-value

Python - array copying/assign, unexpected '=array[:]' behaviour for numpy


I was reading up on copying arrays (and lists) by reference or by value, but I ran into an issue. To show my problem, I made three examples, each with an assignment and a change.

First Example: By default, it is copied by reference.
Thus, the change affects both a and ArrayA, which have the same address. OK

Second Example: Since the right-hand side is evaluated first, the *1 doesn't change the values but leads to a copy by value. (I think this could be done a few other ways as well, like using copy().)
Thus, the change only affects c, which has a different address than ArrayC. OK

Third Example: Here I add [:] to the array, thus copying the array (by value), as far as I understand. This seems to be confirmed by the different addresses of e and ArrayE. However, the change affects not only e but also ArrayE. For me, that's pretty much unexpected, since it even showed me different addresses before. WHY?

Thanks ahead =)

import numpy as np
# Example 1, by reference
ArrayA = np.array([5,2,3,5,4])
ArrayB = np.array(  [1,2,3,4])

a = ArrayA
a[1:] += ArrayB

print("{}:\t{},\tid: {}".format("ArrayA",ArrayA, id(ArrayA) ))
print("{}:\t  {},\tid: {}".format("ArrayB",ArrayB, id(ArrayB) ))
print("{}:\t{},\tid: {}".format("a",a, id(a) ))

# Example 2, by value
ArrayC = np.array([5,2,3,5,4])
ArrayD = np.array(  [1,2,3,4])

c = ArrayC*1
c[1:] += ArrayD

print()
print("{}:\t{},\tid: {}".format("ArrayC",ArrayC, id(ArrayC) ))
print("{}:\t  {},\tid: {}".format("ArrayD",ArrayD, id(ArrayD) ))
print("{}:\t{},\tid: {}".format("c",c, id(c) ))

# Example 3, by reference/value?!?!
ArrayE = np.array([5,2,3,5,4])
ArrayF = np.array(  [1,2,3,4])

e = ArrayE[:]
e[1:] += ArrayF

print()
print("{}:\t{},\tid: {}".format("ArrayE",ArrayE, id(ArrayE) ))
print("{}:\t  {},\tid: {}".format("ArrayF",ArrayF, id(ArrayF) ))
print("{}:\t{},\tid: {}".format("e",e, id(e) ))

Output:

ArrayA: [5 3 5 8 8],    id: 2450575020480
ArrayB:   [1 2 3 4],    id: 2450575021680
a:      [5 3 5 8 8],    id: 2450575020480

ArrayC: [5 2 3 5 4],    id: 2450575021280
ArrayD:   [1 2 3 4],    id: 2450575022080
c:      [5 3 5 8 8],    id: 2450575022240

ArrayE: [5 3 5 8 8],    id: 2450575022640
ArrayF:   [1 2 3 4],    id: 2450575022000
e:      [5 3 5 8 8],    id: 2450575022880

Solution

  • EDITED - see @juanpa.arrivillaga's comments below.

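    In your 3rd example, the slice ArrayE[:] does not copy the data.
    Basic slicing of a NumPy ndarray returns a view: a new ndarray object (hence the different id) that shares the same underlying data buffer as the original.
    So e and ArrayE are distinct objects backed by the same memory, and the in-place += on e writes into that shared buffer.
    That's why the changes are reflected on both.
    Comparing the ids of the elements does not verify this: the elements are not mutable objects, and indexing such as e[0] creates a fresh NumPy scalar each time, so id(e[0]) == id(ArrayE[0]) proves nothing. Check for shared memory instead.

    print(np.shares_memory(e, ArrayE))  # True: e is a view of ArrayE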
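
To make the view/copy distinction concrete, here is a minimal sketch that rebuilds the third example. The names view, copy, ListE and list_copy are made up for illustration and are not from the question. It uses np.shares_memory and the .base attribute to tell a view from a copy, and contrasts this with a plain Python list, where [:] does create a new (shallow) copy. That difference is probably where the surprise comes from.

```python
import numpy as np

ArrayE = np.array([5, 2, 3, 5, 4])
ArrayF = np.array([1, 2, 3, 4])

view = ArrayE[:]       # basic slicing: new ndarray object, same data buffer
copy = ArrayE.copy()   # explicit copy: new object with its own buffer

# Different Python objects, so their ids differ ...
print(view is ArrayE)                    # False
# ... but only the view shares memory with the original.
print(np.shares_memory(view, ArrayE))    # True
print(np.shares_memory(copy, ArrayE))    # False
print(view.base is ArrayE)               # True

view[1:] += ArrayF     # in-place update writes through to ArrayE
copy[0] = 99           # touches only the independent copy

print(ArrayE.tolist())  # [5, 3, 5, 8, 8]
print(copy.tolist())    # [99, 2, 3, 5, 4]

# Each indexing operation builds a new scalar object, so element
# identity says nothing about whether two arrays share data.
print(ArrayE[0] is ArrayE[0])            # False

# Plain Python lists behave differently: [:] makes a shallow copy.
ListE = [5, 2, 3, 5, 4]
list_copy = ListE[:]
list_copy[0] = 99
print(ListE)            # [5, 2, 3, 5, 4]
```

If an independent array is what you want, ArrayE.copy() (or np.copy(ArrayE)) states that intent more directly than ArrayE*1.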