I'm trying to use a custom loss function and I get the error ‘RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn’. The error occurs during loss.backward().
I’m aware that all computations must be done on tensors with ‘requires_grad = True’. I’m having trouble implementing that because my code requires a nested for loop, and I believe the loop could be the cause. Is there a way to create an empty tensor and append to it? Below is my code.
def Gaussian_Kernal(x, mu, sigma):
    p = (1./(math.sqrt(2. * math.pi * (sigma**2)))) * torch.exp((-1.) * (((Variable(x)**2) - mu)/(2. * (sigma**2))))
    return p

class MEE(torch.nn.Module):
    def __init__(self):
        super(MEE,self).__init__()

    def forward(self,output, target, mu, variance):
        error = torch.subtract(Variable(output),Variable(target))
        error_diff = []
        for i in range(0, error.size(0)):
            for j in range(0, error.size(0)):
                error_diff.append(error[i] - error[j])
        error_diff = torch.cat(error_diff)
        torch.tensor(error_diff,requires_grad=True)
        loss = (1./(target.size(0)**2)) * torch.sum(Gaussian_Kernal(Variable(error_diff), mu, variance*(2**0.5)))
        loss = Variable(loss)
        return loss
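As long as you operate on tensors and apply PyTorch functions and basic operators, it should work. There is therefore no need to wrap your variables with torch.tensor or Variable; re-wrapping an intermediate result can cut it off from the computation graph, which is the likely cause of the error. Variable has been deprecated (since v0.4, I believe).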
From the PyTorch docs: "The Variable API has been deprecated: Variables are no longer necessary to use autograd with tensors. Autograd automatically supports Tensors with requires_grad set to True."
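To make the failure mode concrete, here is a minimal sketch of how a result that has been cut off from the graph reproduces the error. It uses .detach() as a stand-in for the re-wrapping in the question, on the assumption that re-wrapping has a similar effect; the tensor names (w, y, y_cut, y_leaf) are made up for illustration.

```python
import torch

w = torch.randn(3, requires_grad=True)

# Built only from tensor operations: the result carries a grad_fn
y = (w * 2).sum()
graph_intact = y.requires_grad and y.grad_fn is not None

# A detached copy has no grad_fn and does not require grad,
# so calling backward() on it raises a RuntimeError
y_cut = y.detach()
try:
    y_cut.backward()
    error_message = ""
except RuntimeError as exc:
    error_message = str(exc)

# Turning the detached value into a fresh leaf with requires_grad=True
# makes backward() run, but no gradient reaches w
y_leaf = y.detach().clone().requires_grad_(True)
y_leaf.backward()
w_got_gradient = w.grad is not None

print(graph_intact)
print(error_message)
print(w_got_gradient)
```

The last case is worth noting: setting requires_grad=True on a re-created tensor silences the error without reconnecting it to the model parameters.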
I'm assuming output and target are tensors, and mu and variance are real numbers rather than tensors? In that case, the first dimension of output and target would be the batch dimension.
import math

import torch


def Gaussian_Kernel(x, mu, sigma):
    # Formula kept as in the question; only the Variable wrapper is removed
    p = (1./(math.sqrt(2. * math.pi * (sigma**2)))) * torch.exp((-1.) * (((x**2) - mu)/(2. * (sigma**2))))
    return p


class MEE(torch.nn.Module):
    def __init__(self):
        super(MEE, self).__init__()

    def forward(self, output, target, mu, variance):
        # Plain tensor arithmetic keeps the autograd graph intact
        error = output - target
        error_diff = []
        for i in range(0, error.size(0)):
            for j in range(0, error.size(0)):
                error_diff.append(error[i] - error[j])  # Assuming that's the desired operation
        # torch.cat needs each entry to be at least 1-D, e.g. output of shape (batch, 1)
        error_diff = torch.cat(error_diff)
        kernel = Gaussian_Kernel(error_diff, mu, variance*(2**0.5))
        loss = (1./(target.size(0)**2))*torch.sum(kernel)
        return loss
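On the question of building a tensor by appending inside a loop: one possible alternative, sketched here under the assumption that output and target have shape (batch, 1), is to let broadcasting compute all pairwise differences at once. The function names gaussian_kernel, mee_loss_loop, and mee_loss_vectorized are hypothetical, and the kernel expression is copied from the thread unchanged.

```python
import math

import torch


def gaussian_kernel(x, mu, sigma):
    # Same expression as in the thread, applied elementwise
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma**2)
    return coeff * torch.exp(-((x**2) - mu) / (2.0 * sigma**2))


def mee_loss_loop(output, target, mu, variance):
    # Reference version: collect pairwise differences in a Python list
    error = output - target
    diffs = [error[i] - error[j]
             for i in range(error.size(0))
             for j in range(error.size(0))]
    kernel = gaussian_kernel(torch.cat(diffs), mu, variance * 2**0.5)
    return kernel.sum() / target.size(0) ** 2


def mee_loss_vectorized(output, target, mu, variance):
    error = output - target
    # diff[i, j] = error[i] - error[j], computed by broadcasting
    diff = error.unsqueeze(1) - error.unsqueeze(0)
    kernel = gaussian_kernel(diff, mu, variance * 2**0.5)
    return kernel.sum() / target.size(0) ** 2


torch.manual_seed(0)
output = torch.randn(5, 1, requires_grad=True)
target = torch.randn(5, 1)

loss_loop = mee_loss_loop(output, target, mu=0.0, variance=1.0)
loss_vec = mee_loss_vectorized(output, target, mu=0.0, variance=1.0)

# backward() succeeds because the loss is built only from tensor operations
loss_vec.backward()
print(torch.allclose(loss_loop, loss_vec), output.grad.shape)
```

Both versions stay differentiable; the broadcast version simply avoids the Python-level loop, which tends to matter as the batch grows.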