I am writing code for a PINN model. While computing the gradients for the PDE loss, I used torch.autograd.grad(), but it raises
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
for the line
dphidx = torch.autograd.grad(train_output[:, 0], X_train_tensor[:,0], torch.ones_like(train_output[:, 0]), create_graph=True)[0]
I checked that both train_output[:, 0] and X_train_tensor[:, 0] have requires_grad=True, so I am confused about what is wrong here.
I am attaching the model's code for clarity:
import torch
import torch.nn as nn

class PINNFP(nn.Module):
    def __init__(self):
        super().__init__()
        # Fully connected network: 3 input features -> 2 outputs
        self.manual_layers = nn.Sequential(
            nn.Linear(in_features=3, out_features=5),
            nn.Linear(in_features=5, out_features=5),
            nn.Linear(in_features=5, out_features=5),
            nn.Linear(in_features=5, out_features=5),
            nn.Linear(in_features=5, out_features=2))

    def forward(self, x):
        return self.manual_layers(x)

model_1 = PINNFP()
train_output = model_1(X_train_tensor)
dphidx = torch.autograd.grad(train_output[:, 0], X_train_tensor[:, 0],
                             torch.ones_like(train_output[:, 0]),
                             create_graph=True)[0]
How can I fix this error? I tried allow_unused=True, but in that case the gradient comes back as None, which I do not want.
You have to change X_train_tensor[:,0] to X_train_tensor:
import torch
import torch.nn as nn

class PINNFP(nn.Module):
    def __init__(self):
        super().__init__()
        self.manual_layers = nn.Sequential(
            nn.Linear(in_features=3, out_features=5),
            nn.Linear(in_features=5, out_features=5),
            nn.Linear(in_features=5, out_features=5),
            nn.Linear(in_features=5, out_features=5),
            nn.Linear(in_features=5, out_features=2))

    def forward(self, x):
        return self.manual_layers(x)

model_1 = PINNFP()
X_train_tensor = torch.randn(8, 3, requires_grad=True)
train_output = model_1(X_train_tensor)
dphidx = torch.autograd.grad(train_output[:, 0], X_train_tensor,
                             torch.ones_like(train_output[:, 0]))[0]
Slicing creates a new tensor with its own node in the computational graph: X_train_tensor[:,0] stems directly from X_train_tensor, and it is not the tensor that was fed into the model. This means there is no path in the graph from train_output[:,0] back to X_train_tensor[:,0], so autograd correctly reports it as unused. What you can do instead is backpropagate from train_output[:,0] to the full X_train_tensor and take a slice of the resulting gradient, as in the sketch below.
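For example, here is a minimal sketch of that slicing step, keeping create_graph=True from the original question so the result can be differentiated again for the PDE loss (the assumption that column 0 of X_train_tensor is the x coordinate is mine):

# Gradient of the first output w.r.t. the whole input tensor;
# create_graph=True keeps the graph alive for higher-order derivatives.
grads = torch.autograd.grad(train_output[:, 0], X_train_tensor,
                            torch.ones_like(train_output[:, 0]),
                            create_graph=True)[0]  # same shape as X_train_tensor
dphidx = grads[:, 0]  # assumed: derivative w.r.t. the x coordinate (column 0)

Slicing the gradient after the call is safe, because grads is an ordinary tensor of shape (N, 3) whose columns are the partial derivatives with respect to each input feature.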