machine-learning, deep-learning, pytorch, torch

PyTorch questions: how to add a bias term and extract its value? Class vs. Sequential model? And softmax


I have a basic neural network model in PyTorch, like this:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.sigmoid = nn.Sigmoid()
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out = self.fc1(x)
        out = self.sigmoid(out)
        out = self.fc2(out)
        return out
net = Net(400, 512, 10)

How can I extract the bias/intercept term from net.parameters()? And is this model equivalent to using nn.Sequential()?

net = nn.Sequential(nn.Linear(input_dim, hidden_dim[0]),
                      nn.Sigmoid(),
                      nn.Linear(hidden_dim[0], hidden_dim[1]),
                      nn.Sigmoid(),
                      nn.Linear(hidden_dim[1], output_dim))

Is nn.Softmax() optional at the end of either model for multi-class classification? If I understand correctly, with softmax the model outputs the probability of each class, but without it the model returns the raw predicted outputs?

Thanks in advance for answering my newbie questions.


Solution

  • Let's answer the questions one by one. Is this model equivalent to using Sequential()? Short answer: No. Your Sequential model has three Linear layers and two Sigmoid activations, while the Net class has two Linear layers and one Sigmoid. You can print both networks and compare:

    input_dim = 400
    hidden_dim = 512
    output_dim = 10
    
    model = Net(input_dim, hidden_dim, output_dim)
    print(model)
    
    net = nn.Sequential(nn.Linear(input_dim, hidden_dim),
                        nn.Sigmoid(),
                        nn.Linear(hidden_dim, hidden_dim),
                        nn.Sigmoid(),
                        nn.Linear(hidden_dim, output_dim))
    print(net)
    

    The output is:

    Net(
      (fc1): Linear(in_features=400, out_features=512, bias=True)
      (sigmoid): Sigmoid()
      (fc2): Linear(in_features=512, out_features=10, bias=True)
    )
    
    Sequential(
      (0): Linear(in_features=400, out_features=512, bias=True)
      (1): Sigmoid()
      (2): Linear(in_features=512, out_features=512, bias=True)
      (3): Sigmoid()
      (4): Linear(in_features=512, out_features=10, bias=True)
    )
    
    

    I hope you can see where they differ.
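    For completeness, a Sequential model that does match the Net class (one hidden layer followed by a single Sigmoid) would look like this; a minimal sketch using the same dimensions:

```python
import torch
import torch.nn as nn

# A Sequential stack equivalent to the Net class above:
# one hidden Linear layer, a Sigmoid, and an output Linear layer.
equivalent = nn.Sequential(nn.Linear(400, 512),
                           nn.Sigmoid(),
                           nn.Linear(512, 10))

x = torch.randn(1, 400)     # dummy batch with one sample
print(equivalent(x).shape)  # torch.Size([1, 10])
```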

    Your first question: How can I extract the bias/intercept term from net.parameters()?

    The answer:

    model = Net(400, 512, 10)
    
    bias = model.fc1.bias
    
    print(bias)
    

    The output is:

    tensor([ 3.4078e-02,  3.1537e-02,  3.0819e-02,  2.6163e-03,  2.1002e-03,
             4.6842e-05, -1.6454e-02, -2.9456e-02,  2.0646e-02, -3.7626e-02,
             3.5531e-02,  4.7748e-02, -4.6566e-02, -1.3317e-02, -4.6593e-02,
            -8.9996e-03, -2.6568e-02, -2.8191e-02, -1.9806e-02,  4.9720e-02,
             ...,
            -4.6214e-02, -3.2799e-02, -3.3605e-02, -4.9720e-02, -1.0293e-02,
             3.2559e-03, -6.6590e-03, -1.2456e-02, -4.4547e-02,  4.2101e-02,
            -2.4981e-02, -3.6840e-03], requires_grad=True)
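    Since the question asks about net.parameters() specifically: nn.Linear creates its bias automatically (bias=True is the default), and you can pull out every bias by iterating over named_parameters() and filtering by name. A minimal sketch:

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(400, 512),
                      nn.Sigmoid(),
                      nn.Linear(512, 10))

# named_parameters() yields (name, tensor) pairs; every Linear layer
# registers a "weight" and a "bias" entry.
for name, param in model.named_parameters():
    if name.endswith("bias"):
        print(name, param.shape)
# 0.bias torch.Size([512])
# 2.bias torch.Size([10])
```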
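    As for the softmax question: if you train with nn.CrossEntropyLoss (the usual choice for multi-class classification), leave nn.Softmax() out of the model, because CrossEntropyLoss applies log-softmax internally and expects raw logits. Apply softmax only at inference time, when you want probabilities. A minimal sketch:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 1.0, 0.1]])  # raw model outputs (logits)

# Training: cross_entropy takes logits directly -- no softmax in the model.
loss = F.cross_entropy(logits, torch.tensor([0]))

# Inference: softmax turns logits into class probabilities.
probs = F.softmax(logits, dim=1)
pred = probs.argmax(dim=1)  # predicted class index
print(pred.item())          # 0 -- the class with the largest logit
```

    Note that argmax over the logits and argmax over the softmax probabilities give the same predicted class, since softmax is monotonic; the softmax is only needed when you want calibrated probabilities.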