I simply want to add the first three values along the third dimension of tensor2 to tensor1, without affecting the graph during backpropagation. Tensor2 is only needed for its values; it should not be part of the graph.
Does this work? That's how I would have done it in numpy.
tensor1[:, :, :3] += tensor2[:, :, :3]
Or should I use torch.add() or .data instead? I am confused about when to use what. Thank you.
You should be able to use detach() to get a version of the tensor (tensor2) with requires_grad = False that is cut off from the computation graph. (Note that detach() returns a new tensor sharing the same underlying data, not a copy.) Using the in-place += operator can cause errors during backpropagation: at various points during the forward pass, the same variable stores two different values with two different associated gradients, but only one value/gradient pair is available during the backward pass. I'm a bit fuzzy on whether in-place operations are allowed when the variable is part of the computation graph but the operation itself is not. You can test this to see, but to be safe I recommend:
tensor1[:, :, :3] = torch.add(tensor1[:, :, :3], tensor2[:, :, :3].detach())
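Here is a minimal sketch of the whole pattern. The shapes, and tensor1 being a non-leaf intermediate result, are assumptions for illustration; an in-place write into a leaf tensor that requires grad raises a RuntimeError immediately, which is one more reason a bare += on such a tensor fails.

import torch

x = torch.randn(2, 3, 5, requires_grad=True)
tensor1 = x * 2                                 # non-leaf: in-place writes are allowed here
tensor2 = torch.randn(2, 3, 5, requires_grad=True)

# detach() cuts tensor2 off from the graph, so no gradient flows into it
tensor1[:, :, :3] = torch.add(tensor1[:, :, :3], tensor2[:, :, :3].detach())

tensor1.sum().backward()
print(x.grad.shape)    # torch.Size([2, 3, 5]) -- gradients reach x as usual
print(tensor2.grad)    # None -- tensor2 stayed out of the graph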
Later, if you want to do another operation using tensor2 where the gradient IS tracked in the computation graph, you can still do so: detach() does not modify tensor2 itself.
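For example (again a small sketch with made-up shapes):

import torch

tensor2 = torch.randn(2, 3, 5, requires_grad=True)

frozen = tensor2.detach() * 3   # detached use: no gradient flows back here

loss = (tensor2 ** 2).sum()     # direct use: this IS recorded in the graph
loss.backward()
print(tensor2.grad.shape)       # torch.Size([2, 3, 5]) -- gradient populated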