I have the following code:
import torch
import torch.nn as nn
from torchviz import make_dot, make_dot_from_trace

class Net(nn.Module):
    def __init__(self, input, output):
        super(Net, self).__init__()
        self.fc = nn.Linear(input, output)

    def forward(self, x):
        x = self.fc(x)  # first use of fc
        x = self.fc(x)  # second use of the same fc layer
        return x

model = Net(12, 12)
print(model)
x = torch.rand(1, 12)
y = model(x)
make_dot(y, params=dict(model.named_parameters()))
Here I reuse self.fc twice in forward().
The computational graph looks like this: [torchviz graph image]
I am confused about the computational graph, and I am curious how to train this model with backpropagation. It seems to me that the gradient will loop forever. Thanks a lot.
There are no issues with your graph. You can train it the same way as any other feed-forward model.
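For example, a minimal training loop might look like the sketch below; the MSE loss, SGD optimizer, and random target data are illustrative assumptions, not something from your post:

import torch
import torch.nn as nn

model = Net(12, 12)                     # the Net class defined in the question
criterion = nn.MSELoss()                # assumed loss, for illustration only
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.rand(8, 12)                   # dummy batch of inputs
target = torch.rand(8, 12)              # dummy targets

for step in range(100):
    optimizer.zero_grad()               # reset accumulated gradients
    loss = criterion(model(x), target)  # forward pass traverses fc twice
    loss.backward()                     # backward pass visits fc twice; grads add up
    optimizer.step()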
Now take a look at the fc.bias parameter in your graph. Since you are reusing the same layer two times, the bias has two outgoing arrows (it is used in two places of your net). During the backpropagation stage the direction is reversed: the bias will get gradients from two places, and these gradients will add up.
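A quick way to see this accumulation directly (a sketch using a standalone nn.Linear rather than your Net, so it is self-contained):

import torch
import torch.nn as nn

torch.manual_seed(0)
fc = nn.Linear(12, 12)
x = torch.rand(1, 12)

# Same layer applied twice, as in the question's forward().
y = fc(fc(x))
y.sum().backward()
shared_grad = fc.bias.grad.clone()

# For comparison: two independent copies of the layer with identical weights.
fc1, fc2 = nn.Linear(12, 12), nn.Linear(12, 12)
fc1.load_state_dict(fc.state_dict())
fc2.load_state_dict(fc.state_dict())
y2 = fc2(fc1(x))
y2.sum().backward()

# The shared layer's bias gradient equals the sum of the two copies' gradients.
print(torch.allclose(shared_grad, fc1.bias.grad + fc2.bias.grad))  # True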
As for reading the graph itself: each application of the linear layer shows up as Addmm(bias, x, T(weight)), where T is transposition and Addmm is matrix multiplication plus adding a vector. So you can see how data (weight, bias) is passed into functions (Addmm, T).
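If you want to check that correspondence numerically, torch.addmm exposes the same operation as the Addmm node; this is a sketch with a fresh nn.Linear rather than your model:

import torch
import torch.nn as nn

fc = nn.Linear(12, 12)
x = torch.rand(1, 12)

# Addmm(bias, x, T(weight)) computes bias + x @ weight.T,
# which is exactly what the linear layer does in the forward pass.
manual = torch.addmm(fc.bias, x, fc.weight.t())
print(torch.allclose(fc(x), manual))  # True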