I am working with PyTorch to learn it, and I have a question: how can I check the output gradient of each layer in my code?
My code is below:
# import the necessary libs
import numpy as np
import torch
import time
# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms
# Get GPU Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)
# Examine a sample
dataiter = iter(trainloader)
images, labels = next(dataiter)  # the iterator's .next() method was removed; use the built-in next()
# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim = 1))
model.to(device)
# Define the loss (NLLLoss pairs with the LogSoftmax output above;
# CrossEntropyLoss would apply log-softmax a second time)
criterion = nn.NLLLoss()
# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)
# Define the epochs
epochs = 5
train_losses, test_losses = [], []
# start = time.time()
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten Fashion-MNIST images into a 784-long vector
        images = images.to(device)
        labels = labels.to(device)
        images = images.view(images.shape[0], -1)
        # Training pass
        optimizer.zero_grad()
        output = model(images)  # calling the model invokes forward()
        loss = criterion(output, labels)
        loss.backward()
        # print(loss.grad)
        optimizer.step()
        running_loss += loss.item()
    else:
        print(model[0].grad)
If I print model[0].grad after back-propagation, is it going to be the output gradient of each layer for every epoch?
Or, if I want to know the output gradient of each layer, where and what should I print?
Thank you for reading!
Well, this is a good question if you want to understand the inner computation within your model. Let me explain!
Firstly, when you print the model variable you'll get this output:
Sequential(
(0): Linear(in_features=784, out_features=128, bias=True)
(1): ReLU()
(2): Linear(in_features=128, out_features=10, bias=True)
(3): LogSoftmax(dim=1)
)
And if you choose model[0], that means you have selected the first layer of the model, which is Linear(in_features=784, out_features=128, bias=True). If you look at the documentation of torch.nn.Linear here, you will find that there are two attributes of this class that you can access: Linear.weight and Linear.bias, which give you the weights and biases of the corresponding layer, respectively.
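For example, on the model defined above you can inspect those attributes directly; the shapes follow from the 784 -> 128 layer definition:

first_layer = model[0]            # Linear(in_features=784, out_features=128)
print(first_layer.weight.shape)   # torch.Size([128, 784])
print(first_layer.bias.shape)     # torch.Size([128])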
Remember, you cannot use model.weight to look at the weights of the model, as your linear layers are kept inside a container called nn.Sequential, which doesn't have a weight attribute.
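That said, if you want to list every weight and bias inside the nn.Sequential container at once, you can iterate model.named_parameters(), which every nn.Module provides. A quick sketch (nn.Sequential names each parameter by the index of its child module):

for name, param in model.named_parameters():
    print(name, param.shape)
# 0.weight torch.Size([128, 784])
# 0.bias   torch.Size([128])
# 2.weight torch.Size([10, 128])
# 2.bias   torch.Size([10])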
So, coming back to looking at weights and biases, you can access them per layer: model[0].weight and model[0].bias are the weights and biases of the first layer. Similarly, model[0].weight.grad and model[0].bias.grad are the gradients of the first layer. Note that model[0].grad itself won't give you what you want, because gradients live on the parameters (weight and bias), not on the Linear module, so the print(model[0].grad) in your loop won't work as intended.
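So in your training loop, the place to print per-layer gradients is right after loss.backward(), once the .grad fields have been populated. A minimal sketch; printing the gradient norm instead of the full tensors is just my choice to keep the output readable:

# inside the inner loop
loss.backward()
for name, param in model.named_parameters():
    print(name, param.grad.norm().item())  # one number per weight/bias tensor
optimizer.step()

Printing after optimizer.step() would also work, since step() does not clear the gradients; they are only reset by optimizer.zero_grad() at the start of the next iteration.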