I'm currently learning PyTorch, and encountered some unexpected behaviour when using the torch.from_numpy() function.
import torch as t
import numpy as np
array = np.arange(1, 10)
tensor = t.from_numpy(array)
print(array, tensor)
array[:] += 1
print(array, tensor)
This outputs:
[1 2 3 4 5 6 7 8 9] tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
[ 2 3 4 5 6 7 8 9 10] tensor([ 2, 3, 4, 5, 6, 7, 8, 9, 10])
When I run the above code, the PyTorch tensor changes when the NumPy array is changed, and vice versa. This is the expected behaviour according to the PyTorch documentation for from_numpy.
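For example, a minimal check of the "vice versa" direction (using tensor.add_, PyTorch's in-place add) shows that mutating the tensor updates the array as well:

```python
import numpy as np
import torch as t

array = np.arange(1, 10)
tensor = t.from_numpy(array)

# In-place mutation on the tensor side is visible through the array too,
# since both objects share the same underlying buffer.
tensor.add_(1)
print(array)  # [ 2  3  4  5  6  7  8  9 10]
```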
However, when I change the code a bit to:
import torch as t
import numpy as np
array = np.arange(1, 10)
tensor = t.from_numpy(array)
print(array, tensor)
array = array +1
print(array, tensor)
the output becomes:
[1 2 3 4 5 6 7 8 9] tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
[ 2 3 4 5 6 7 8 9 10] tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
Weirdly, if I change the array line to array += 1, the tensor and NumPy array behave as expected. Can anyone explain why? I'm using Google Colab to run this, on CPU.
The difference is that array[:] += 1 is an in-place operation, while array = array + 1 is not.
To start, let's create the arrays and look at their object IDs:
array = np.arange(1, 10)
tensor = torch.from_numpy(array)
print(id(array), id(tensor))
> 140232345845456 140232355106384
In the above, array and tensor are objects with specific IDs. They have different IDs, as they are different objects. Under the hood, however, tensor points to the same piece of memory as array. This is the point of using torch.from_numpy: it creates a tensor referencing the same memory, which avoids copying the data from the NumPy array.
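A quick way to confirm the sharing (not shown in the original post) is to compare the underlying data pointers directly:

```python
import numpy as np
import torch

array = np.arange(1, 10)
tensor = torch.from_numpy(array)

# The ndarray's data pointer equals the tensor's data pointer,
# so both objects view the very same memory.
print(array.__array_interface__['data'][0] == tensor.data_ptr())  # True

# np.shares_memory gives the same answer at a higher level.
print(np.shares_memory(array, tensor.numpy()))  # True
```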
Now we update with array[:] += 1. This is an in-place operation, meaning we mutate the underlying data of array. When we print the IDs of array and tensor, note that they are the same as above: we are still looking at the same two objects. Then we print array and tensor themselves. Since we added 1 to array in place, its values are updated, and we see the values of tensor are updated as well. This is because tensor references the same memory as array, and the in-place update changed that piece of memory.
array[:] += 1
print(id(array), id(tensor))
> 140232345845456 140232355106384
print(array, tensor)
> [ 2 3 4 5 6 7 8 9 10] tensor([ 2, 3, 4, 5, 6, 7, 8, 9, 10])
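The same propagation happens with any operation that writes into the existing buffer, not just the slice-assignment form. A small sketch with two other in-place variants:

```python
import numpy as np
import torch

array = np.arange(1, 10)
tensor = torch.from_numpy(array)

array += 1                   # ndarray __iadd__: in-place, same effect as array[:] += 1
np.add(array, 1, out=array)  # a ufunc writing into the same buffer via out=

# Both updates wrote into the shared memory, so the tensor sees them:
print(tensor)  # tensor now holds the values 3 through 11
```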
Now we update with array = array + 1. This is not an in-place operation; it creates a new array and rebinds the name array to it. When we look at the ID values, we see that array has a different ID, while tensor has the same ID. The variable array now references a new object, while tensor still references the old one. This is why array = array + 1 updates array but not tensor.
array = array + 1
print(id(array), id(tensor))
> 139781912744080 140232355106384
print(array, tensor)
> [ 3 4 5 6 7 8 9 10 11] tensor([ 2, 3, 4, 5, 6, 7, 8, 9, 10])
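Finally, if you want a tensor that does not track the array at all, copy the data first instead of sharing it; a minimal sketch:

```python
import numpy as np
import torch

array = np.arange(1, 10)
tensor = torch.from_numpy(array.copy())  # copy first, so nothing is shared
# or: tensor = torch.tensor(array)       # torch.tensor always copies its input

array[:] += 1

# The in-place update touched only the array's own buffer:
print(tensor)  # tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]) - unchanged
```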