
Skorch training object from scratch


I'm trying to use a skorch class to run GridSearch on a classifier. I tried the vanilla NeuralNetClassifier object, but I haven't found a way to pass only the trainable weights to the Adam optimizer (I'm using pre-trained embeddings that I would like to keep frozen). It would be doable if the module were initialized first, so that the trainable weights could then be passed with the optimizer__params option, but module needs an uninitialized model. Is there a way around this?

net = NeuralNetClassifier(module=RNN, module__vocab_size=vocab_size, module__hidden_size=hidden_size,
                          module__embedding_dim=embedding_dim, module__pad_id=pad_id,
                          module__dataset=ClaimsDataset, lr=lr, criterion=nn.CrossEntropyLoss,
                          optimizer=torch.optim.Adam, optimizer__weight_decay=35e-3, device='cuda',
                          max_epochs=nb_epochs, warm_start=True)

The code above works. However, with batch_size set to 64, I have to run the model for the specified number of epochs on every batch, which is not the behavior I'm after. I'd be grateful if someone could suggest a nicer way to do this.

My other problem is with subclassing skorch.NeuralNet. There I run into a similar issue: figuring out how to pass only the trainable weights to the Adam optimizer. The code below is what I've got so far.

class Train(skorch.NeuralNet):
    def __init__(self, module, lr, norm, *args, **kwargs):
        self.module = module
        self.lr = lr
        self.norm = norm
        self.params = [p for p in self.module.parameters(self) if p.requires_grad]
        super(Train, self).__init__(*args, **kwargs)

    def initialize_optimizer(self):
        self.optimizer = torch.optim.Adam(params=self.params, lr=self.lr, weight_decay=35e-3, amsgrad=True)

    def train_step(self, Xi, yi, **fit_params):
        self.module.train()

        self.optimizer.zero_grad()
        yi = variable(yi)

        output = self.module(Xi)

        loss = self.criterion(output, yi)
        loss.backward()

        nn.utils.clip_grad_norm_(self.params, max_norm=self.norm)
        self.optimizer.step()

    def score(self, y_t, y_p):
        return accuracy_score(y_t, y_p)

Initializing the class gives the error:

Traceback (most recent call last):
  File "/snap/pycharm-community/74/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/snap/pycharm-community/74/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/snap/pycharm-community/74/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/snap/pycharm-community/74/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/l/Documents/Bsrc/cv.py", line 115, in <module>
    main()
  File "/home/l/B/src/cv.py", line 86, in main
    trainer = Train(module=RNN, criterion=nn.CrossEntropyLoss, lr=lr, norm=max_norm)
  File "/home/l/B/src/cv.py", line 22, in __init__
    self.params = [p for p in self.module.parameters(self) if p.requires_grad]
  File "/home/l/B/src/cv.py", line 22, in <listcomp>
    self.params = [p for p in self.module.parameters(self) if p.requires_grad]
  File "/home/l/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 739, in parameters
    for name, param in self.named_parameters():
AttributeError: 'Train' object has no attribute 'named_parameters'

Solution

  • but module needs an uninitialized model

    That is not correct; you can pass an initialized module as well. The documentation of the module parameter states:

    It is, however, also possible to pass an instantiated module, e.g. a PyTorch Sequential instance.

    The problem is that when passing an initialized module you cannot pass any module__ parameters to the NeuralNet, since this would require the module to be re-initialized. That is of course problematic if you want to do a grid search over module parameters (see the sketch after the code below).

    A solution for this is to override initialize_module and, after creating the new module instance, load and freeze the embedding parameters (by setting their requires_grad attribute to False):

    def _load_embedding_weights(self):
        # placeholder -- return the pre-trained embedding matrix here
        return torch.randn(1, 100)

    def initialize_module(self):
        kwargs = self._get_params_for('module')
        self.module_ = self.module(**kwargs)

        # load the pre-trained weights (wrap in nn.Parameter so they can be
        # assigned to an existing parameter attribute)
        self.module_.embedding0.weight = nn.Parameter(self._load_embedding_weights())
        # freeze the layer
        self.module_.embedding0.weight.requires_grad = False

        return self
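
    To connect this back to the original goal, here is a minimal sketch of how such an override might be wired into a NeuralNetClassifier subclass and handed to scikit-learn's GridSearchCV. The subclass name MyNet, the searched values, and the data X, y are made up for illustration; RNN, nb_epochs and embedding0 are the names used above:

    import torch
    import torch.nn as nn
    from skorch import NeuralNetClassifier
    from sklearn.model_selection import GridSearchCV

    class MyNet(NeuralNetClassifier):
        def _load_embedding_weights(self):
            # placeholder -- return the pre-trained embedding matrix here
            return torch.randn(1, 100)

        def initialize_module(self):
            kwargs = self._get_params_for('module')
            self.module_ = self.module(**kwargs)
            # re-load and re-freeze the embeddings on every (re-)initialization
            self.module_.embedding0.weight = nn.Parameter(self._load_embedding_weights())
            self.module_.embedding0.weight.requires_grad = False
            return self

    net = MyNet(module=RNN, criterion=nn.CrossEntropyLoss,
                optimizer=torch.optim.Adam, max_epochs=nb_epochs)

    params = {'lr': [1e-3, 1e-2], 'module__hidden_size': [64, 128]}
    gs = GridSearchCV(net, params, scoring='accuracy', cv=3)
    gs.fit(X, y)

    Because module is still passed as a class, module__ parameters such as hidden_size stay grid-searchable, while the overridden initialize_module reloads and freezes the embeddings each time the net is re-initialized.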