machine-learning, deep-learning, pytorch, torch

PyTorch questions: how to add a bias term and extract its value? Class vs. Sequential model? And softmax


I have a basic neural network model in PyTorch, like this:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.sigmoid = nn.Sigmoid()
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out = self.fc1(x)
        out = self.sigmoid(out)
        out = self.fc2(out)
        return out
net = Net(400, 512, 10)

How can I extract the bias/intercept term from net.parameters()? And is this model equivalent to using nn.Sequential()?

net = nn.Sequential(nn.Linear(input_dim, hidden_dim[0]),
                      nn.Sigmoid(),
                      nn.Linear(hidden_dim[0], hidden_dim[1]),
                      nn.Sigmoid(),
                      nn.Linear(hidden_dim[1], output_dim))

Is nn.Softmax() optional at the end of either model for multi-class classification? If I understand correctly, with softmax the model outputs the probability of each class, but without it the model returns the raw predicted outputs?

Thanks in advance for answering my newbie questions.


Solution

  • Let's answer the questions one by one. Is this model equivalent to using Sequential()? Short answer: No. Your Sequential model has three Linear layers and two Sigmoid activations, while the Net class has two Linear layers and one Sigmoid. You can print both networks and compare:

    input_dim = 400
    hidden_dim = 512
    output_dim = 10
    
    model = Net(input_dim, hidden_dim, output_dim)
    print(model)
    
    net = nn.Sequential(nn.Linear(input_dim, hidden_dim),
                        nn.Sigmoid(),
                        nn.Linear(hidden_dim, hidden_dim),
                        nn.Sigmoid(),
                        nn.Linear(hidden_dim, output_dim))
    print(net)
    

    The output is:

    Net(
      (fc1): Linear(in_features=400, out_features=512, bias=True)
      (sigmoid): Sigmoid()
      (fc2): Linear(in_features=512, out_features=10, bias=True)
    )
    
    Sequential(
      (0): Linear(in_features=400, out_features=512, bias=True)
      (1): Sigmoid()
      (2): Linear(in_features=512, out_features=512, bias=True)
      (3): Sigmoid()
      (4): Linear(in_features=512, out_features=10, bias=True)
    )
    
    

    I hope you can see where they differ.
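    For completeness, a Sequential model that does match the Net class (one hidden layer followed by a single Sigmoid) would look like this; a minimal sketch using the same dimensions:

```python
import torch
import torch.nn as nn

# A Sequential stack equivalent to the Net class above:
# one hidden Linear layer, a Sigmoid, and an output Linear layer.
equivalent = nn.Sequential(nn.Linear(400, 512),
                           nn.Sigmoid(),
                           nn.Linear(512, 10))

x = torch.randn(1, 400)     # dummy batch with one sample
print(equivalent(x).shape)  # torch.Size([1, 10])
```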

    Your first question: How can I extract the bias/intercept term from net.parameters()?

    The answer:

    model = Net(400, 512, 10)
    
    bias = model.fc1.bias
    
    print(bias)
    

    The output is:

    tensor([ 3.4078e-02,  3.1537e-02,  3.0819e-02,  2.6163e-03,  2.1002e-03,
             4.6842e-05, -1.6454e-02, -2.9456e-02,  2.0646e-02, -3.7626e-02,
             3.5531e-02,  4.7748e-02, -4.6566e-02, -1.3317e-02, -4.6593e-02,
            -8.9996e-03, -2.6568e-02, -2.8191e-02, -1.9806e-02,  4.9720e-02,
             ...,
            -4.6214e-02, -3.2799e-02, -3.3605e-02, -4.9720e-02, -1.0293e-02,
             3.2559e-03, -6.6590e-03, -1.2456e-02, -4.4547e-02,  4.2101e-02,
            -2.4981e-02, -3.6840e-03], requires_grad=True)
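    Since the question asks about net.parameters() specifically: nn.Linear creates its bias automatically (bias=True is the default), and you can pull out every bias by iterating over named_parameters() and filtering by name. A minimal sketch:

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(400, 512),
                      nn.Sigmoid(),
                      nn.Linear(512, 10))

# named_parameters() yields (name, tensor) pairs; every Linear layer
# registers a "weight" and a "bias" entry.
for name, param in model.named_parameters():
    if name.endswith("bias"):
        print(name, param.shape)
# 0.bias torch.Size([512])
# 2.bias torch.Size([10])
```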
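    As for the softmax question: if you train with nn.CrossEntropyLoss (the usual choice for multi-class classification), leave nn.Softmax() out of the model, because CrossEntropyLoss applies log-softmax internally and expects raw logits. Apply softmax only at inference time, when you want probabilities. A minimal sketch:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 1.0, 0.1]])  # raw model outputs (logits)

# Training: cross_entropy takes logits directly -- no softmax in the model.
loss = F.cross_entropy(logits, torch.tensor([0]))

# Inference: softmax turns logits into class probabilities.
probs = F.softmax(logits, dim=1)
pred = probs.argmax(dim=1)  # predicted class index
print(pred.item())          # 0 -- the class with the largest logit
```

    Note that argmax over the logits and argmax over the softmax probabilities give the same predicted class, since softmax is monotonic; the softmax is only needed when you want calibrated probabilities.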