```python
# a part of my code, it works independently
import torch

class wmodel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Linear(8, 10)
        self.lossfunc = torch.nn.BCELoss()  # unused in this snippet
        # 'optimizer' was undefined in the original; torch.optim.SGD is an assumption
        self.optimizer = torch.optim.SGD(self.parameters(), lr=0.2)

rrr = wmodel()
datai = torch.tensor([1., 2, 3, 4, 5, 6, 7, 8])
target = torch.tensor([1., 2, 3, 4, 5, 6, 7, 8, 9, 10])
qan = rrr.model(datai)
loss = target - qan
```
I'm trying to calculate the partial derivative of `loss[0]` with respect to `rrr.model.weight[0][0]`. One of the problems I'm facing is that `rrr.model.weight[0][0]` does not participate in the calculation as its own variable, which means I can't call grad on it. I'm searching for a variable that actually participated in the process.
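For reference, one way to get exactly this number is to differentiate `loss[0]` with respect to the full weight matrix (which is a graph leaf) and then index the result. A minimal sketch against the snippet above, not code from the question itself:

```python
# d(loss[0]) / d(weight[0][0]): take the gradient w.r.t. the whole
# weight matrix, then pick out the [0][0] entry.
(grad_w,) = torch.autograd.grad(loss[0], rrr.model.weight, retain_graph=True)
print(grad_w[0][0])  # here: -datai[0] == -1, since loss[0] = target[0] - (W @ datai + b)[0]
```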
It may be helpful in pinpointing the exact problem if you post a minimal reproducible example. However, I suspect that you're not calling `loss.backward()` after calculating your loss. The gradients won't propagate until you call `backward()`; this is to account for things like RNNs or taking the average gradient over multiple batches. Each time you call `backward()`, it adds (as in arithmetic addition) the new gradient to the gradient already stored in each parameter's `.grad` attribute, until you call `your_model.zero_grad()`, which sets the gradients back to zero.
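To make the accumulation concrete, here is a minimal sketch reusing the names from the question (the `.sum()` is my addition, since `backward()` needs a scalar output or an explicit gradient argument):

```python
rrr.zero_grad()                               # clear any stale gradients
(target - rrr.model(datai)).sum().backward()  # populates rrr.model.weight.grad
first = rrr.model.weight.grad.clone()

(target - rrr.model(datai)).sum().backward()  # a second call *adds* to .grad
print(torch.allclose(rrr.model.weight.grad, 2 * first))  # True: gradients accumulated

rrr.zero_grad()                               # clears the accumulated gradients
```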