python-3.x · pytorch · skorch

optimizer got an empty parameter list (skorch)


So, I am used to PyTorch and now decided to give skorch a shot.

Here they define the network as


import torch.nn as nn
import torch.nn.functional as F

class ClassifierModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
            dropout=0.5,
    ):
        super(ClassifierModule, self).__init__()
        self.num_units = num_units

        self.dense0 = nn.Linear(20, num_units)
        self.nonlin = nonlin
        self.dropout = nn.Dropout(dropout)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 2)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.dropout(X)
        X = F.relu(self.dense1(X))
        X = F.softmax(self.output(X), dim=-1)
        return X

I prefer passing a list with the number of neurons per layer, e.g. num_units=[30,15,5,2] would give two hidden layers with 15 and 5 neurons. Furthermore, we have 30 features and 2 classes, so I rewrote it to something like this


class Net(nn.Module):
    def __init__(
            self,
            layer_units=[30,15,5,2],
            nonlin=[F.relu,F.relu,F.relu],
            dropout=[0.5,0.5,0.5],
            ):
        super(Net, self).__init__()

        self.layer_units = layer_units
        self.nonlin = nonlin  # Activation functions
        self.dropout = [nn.Dropout(p) for p in dropout]  # Drop-out modules for each layer
        self.layers = [nn.Linear(i,p) for i,p in zip(layer_units,layer_units[1:])]  # Dense layers

    def forward(self, X, **kwargs):
        print("Forwards")
        for layer,func,drop in zip(self.layers[:-1],self.nonlin,self.dropout):
            print(layer,func,drop)
            X = drop(func(layer(X)))

        X = F.softmax(self.layers[-1](X), dim=-1)
        return X


should do the trick. The problem is that when calling

net = NeuralNetClassifier(Net, max_epochs=20, lr=0.1, device="cuda")
net.fit(X, y)

I get the error "ValueError: optimizer got an empty parameter list". I have narrowed it down to the removal of self.output = nn.Linear(10, 2): without it, the net never even enters forward, i.e. output seems to act as some kind of "trigger" variable. Is it really the case that the network needs a variable called output (being a layer) at the end, and that we are not free to choose the variable names ourselves?
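The symptom can actually be reproduced without skorch at all: when the layers are kept in a plain Python list, the module reports no parameters. A minimal sketch of the situation above (ListNet is just an illustrative name):

```python
import torch.nn as nn

class ListNet(nn.Module):
    def __init__(self, layer_units=[30, 15, 5, 2]):
        super(ListNet, self).__init__()
        # Plain Python list: these Linear layers are NOT registered as submodules
        self.layers = [nn.Linear(i, p) for i, p in zip(layer_units, layer_units[1:])]

net = ListNet()
# The optimizer is handed exactly this empty list, hence the ValueError
print(len(list(net.parameters())))  # 0
```

So the error is about parameter registration, not about any particular attribute name.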


Solution

  • PyTorch only registers parameters of attributes that are themselves nn.Module instances (including nn.ModuleList); modules stored in a plain Python list are invisible to .parameters(). So changing

    self.layers = [nn.Linear(i,p) for i,p in zip(layer_units,layer_units[1:])]
    

    to

    self.layers = nn.ModuleList([nn.Linear(i,p) for i,p in zip(layer_units,layer_units[1:])])
    

    should work fine. There is nothing special about the name output; in the original module self.output simply happened to be a properly registered layer supplying parameters, which is why removing it left the parameter list empty.
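Putting the pieces together, a corrected version of the Net above might look like the following sketch (it also wraps the drop-out rates in nn.Dropout modules so they are callable, and applies the final Linear layer before the softmax, which the original forward skipped):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self,
                 layer_units=[30, 15, 5, 2],
                 nonlin=[F.relu, F.relu, F.relu],
                 dropout=[0.5, 0.5, 0.5]):
        super(Net, self).__init__()
        self.nonlin = nonlin
        # nn.ModuleList registers every Linear layer, so .parameters() finds them
        self.layers = nn.ModuleList(
            nn.Linear(i, p) for i, p in zip(layer_units, layer_units[1:]))
        # nn.Dropout modules are callable, unlike the raw float rates
        self.dropouts = nn.ModuleList(nn.Dropout(p) for p in dropout)

    def forward(self, X, **kwargs):
        for layer, func, drop in zip(self.layers[:-1], self.nonlin, self.dropouts):
            X = drop(func(layer(X)))
        return F.softmax(self.layers[-1](X), dim=-1)

net = Net()
print(len(list(net.parameters())))  # 6: weight + bias for each of the 3 Linear layers
out = net(torch.randn(4, 30))
print(out.shape)                    # torch.Size([4, 2])
```

With the parameters registered this way, NeuralNetClassifier(Net, ...) can build its optimizer as usual.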