I'm getting the following error when calling .backward():
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Here's the code:
for i, j, k in zip(X, Y, Z):
    A[:, i, j] = A[:, i, j] + k
I've tried .clone(), torch.add(), and so on.
Please help!
After the comments I'm a bit confused about what you want to accomplish. The code you gave raises an error when I use the dimensions you provided in the comments:
Traceback (most recent call last):
    A[:, i, j] = A[:, i, j] + k
RuntimeError: The size of tensor a (32) must match the size of tensor b (200) at non-singleton dimension 0
But here's what I think you want to do; please correct me in the comments if this is wrong.
Given tensors X, Y, and Z, each entry of X, Y, and Z corresponds to a coordinate (x, y) and a value z. What you want is to add z to A at coordinate (x, y). In most cases the batch dimension is kept independent, although it's not clear that's the case in the code you posted. For now that's what I'll assume you want to do.
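To make that assumed interpretation concrete, here is a minimal loop-based reference sketch. The function name scatter_add_reference and the toy values are hypothetical, made up only for illustration, and the loop is meant to pin down the semantics rather than to be used in training code.

```python
import torch

def scatter_add_reference(A, X, Y, Z):
    """Hypothetical reference: out[b, X[b, n], Y[b, n]] += Z[b, n, 0]."""
    out = A.clone()  # work on a copy so the input tensor is left untouched
    for b in range(X.shape[0]):          # batch dimension, kept independent
        for n in range(X.shape[1]):      # each (x, y, z) triple in the batch
            x, y = int(X[b, n]), int(Y[b, n])
            out[b, x, y] += Z[b, n, 0]
    return out

# Toy values (assumptions for illustration only)
A = torch.zeros(2, 3, 3)
X = torch.tensor([[0, 1], [2, 2]])
Y = torch.tensor([[0, 1], [1, 1]])
Z = torch.tensor([[[1.0], [2.0]], [[3.0], [4.0]]])

result = scatter_add_reference(A, X, Y, Z)
print(result)
```

Note that when the same coordinate appears more than once within a batch, as in the second batch here, the values accumulate.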
For example, let's say A contains all zeros and has shape 3x4x5, X and Y have shape 3x3, and Z has shape 3x3x1, with X, Y, and Z holding the following values:
X = tensor([[1, 2, 3],
            [1, 2, 3],
            [2, 2, 2]])
Y = tensor([[1, 2, 3],
            [1, 2, 3],
            [1, 1, 1]])
Z = tensor([[[0.1], [0.2], [0.3]],
            [[0.4], [0.5], [0.6]],
            [[0.7], [0.8], [0.9]]])
Then we would expect A to have the following values after the operation:
A = tensor([[[0, 0, 0, 0, 0],
             [0, 0.1, 0, 0, 0],
             [0, 0, 0.2, 0, 0],
             [0, 0, 0, 0.3, 0]],
            [[0, 0, 0, 0, 0],
             [0, 0.4, 0, 0, 0],
             [0, 0, 0.5, 0, 0],
             [0, 0, 0, 0.6, 0]],
            [[0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0],
             [0, 2.4, 0, 0, 0],
             [0, 0, 0, 0, 0]]])
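To accomplish this we can make use of the index_add function, which lets us add values at a list of indices. index_add (no trailing underscore) is the out-of-place variant, so it returns a new tensor instead of modifying A in place. Since it operates along a single dimension with a 1-dimensional index, we first need to convert X, Y to linear indices into the flattened tensor A. Afterwards we can un-flatten the result to the original shape.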
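# Number of elements in one batch slice of A (rows * cols)
layer_size = A.shape[1] * A.shape[2]
# Start of each batch slice in the flattened tensor, shape (batch, 1)
index_offset = torch.arange(0, A.shape[0] * layer_size, layer_size).unsqueeze(1)
# Row-major linear index: batch offset + row * num_cols + col
indices = (X * A.shape[2] + Y) + index_offset
# Out-of-place add into the flattened A, then restore the original shape
A = A.view(-1).index_add(0, indices.view(-1), Z.view(-1)).view(A.shape)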
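As a quick sanity check, here is the snippet above run end to end on the example values. Marking Z with requires_grad=True and reducing with sum() are my own assumptions, chosen only to show that .backward() goes through; your real loss will differ. The result is stored in a new name, out, so the original A stays available for comparison.

```python
import torch

A = torch.zeros(3, 4, 5)
X = torch.tensor([[1, 2, 3], [1, 2, 3], [2, 2, 2]])
Y = torch.tensor([[1, 2, 3], [1, 2, 3], [1, 1, 1]])
# requires_grad=True is an assumption, used to exercise autograd
Z = torch.tensor([[[0.1], [0.2], [0.3]],
                  [[0.4], [0.5], [0.6]],
                  [[0.7], [0.8], [0.9]]], requires_grad=True)

layer_size = A.shape[1] * A.shape[2]                      # 4 * 5 = 20
index_offset = torch.arange(0, A.shape[0] * layer_size, layer_size).unsqueeze(1)
indices = (X * A.shape[2] + Y) + index_offset             # linear indices, shape (3, 3)
out = A.view(-1).index_add(0, indices.view(-1), Z.view(-1)).view(A.shape)

# Placeholder loss; backward() runs because nothing was modified in place
out.sum().backward()

print(indices)
print(out[2, 2, 1])   # the three values of the last batch accumulate here
print(Z.grad)         # each entry of Z contributes once to the sum
```

A.view(-1) assumes A is contiguous; if it is not, A.reshape(-1) should work in its place. The index tensor also needs an integer dtype, which torch.tensor gives by default for Python ints.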