
AllenNLP Multi-Task Model: Keep encoder weights for new heads


I have trained an AllenNLP multi-task model. I would like to keep the encoder/backbone weights and continue training with new heads on new datasets. How can I do that with AllenNLP?

I have two basic ideas for how to do that:

  1. I followed this AllenNLP tutorial to load the trained model, and then, instead of just making predictions, I wanted to change the configuration and the model heads to continue training on the new datasets... but I am a bit lost as to how to do that.

  2. I guess it should be possible to (a) save the state dict of the previously trained encoder to a file (see the sketch below) and then (b) point to those weights in the configuration file for the new model (instead of pointing to the "bert-base-cased" weights, for example). But looking at the PretrainedTransformerEmbedder class, I don't see how I could pass my own model weights to it.
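
Concretely, I imagine step (a) looking something like this (the archive path and output filename are placeholders for my own files; I believe AllenNLP also writes the full state dict to best.th in the serialization directory):

```python
# Sketch for idea 2(a): extract the trained weights to a .th file.
# "old_run/model.tar.gz" is a placeholder for the trained model archive.
import torch
from allennlp.models.archival import load_archive

archive = load_archive("old_run/model.tar.gz")
model = archive.model

# Save the full state dict; the new model's initializer can then copy
# just the encoder/backbone parameters out of it by name.
torch.save(model.state_dict(), "pretrained_weights.th")
```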

As an additional question: Is it also possible to save the weights of the heads separately and initialize new heads with those weights?

Any help is appreciated :)


Solution

  • Your second idea is the preferred way, which you can accomplish by using a PretrainedModelInitializer. See the CopyNet model for an example of how to add this to your model.
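
    A rough sketch of how that could be wired up programmatically (the attribute names "_backbone", "_new_head", and "_old_head" are assumptions about your model, not AllenNLP API; check your real names with model.state_dict().keys()):

```python
# A sketch, not the official recipe: initialize the new model's backbone
# (and optionally a new head) from saved weights before training starts.
# All parameter names below are hypothetical; inspect your real ones
# with model.state_dict().keys().
from allennlp.nn.initializers import (
    InitializerApplicator,
    PretrainedModelInitializer,
)

initializer = InitializerApplicator(
    regexes=[
        # Copy every backbone parameter whose name also appears in the
        # saved state dict.
        ("_backbone.*", PretrainedModelInitializer(
            weights_file_path="pretrained_weights.th",
        )),
        # Seed a new head from an old head by mapping parameter names in
        # the new model to names in the weights file. The regex should
        # only match parameters that exist in the file after the mapping.
        ("_new_head.linear.*", PretrainedModelInitializer(
            weights_file_path="pretrained_weights.th",
            parameter_name_overrides={
                "_new_head.linear.weight": "_old_head.linear.weight",
                "_new_head.linear.bias": "_old_head.linear.bias",
            },
        )),
    ]
)

# As in CopyNet, accept the applicator in your Model's __init__ and call
# it once all submodules have been constructed:
# initializer(self)
```

    The same applicator can also be declared in the training configuration as an "initializer" block using the registered "pretrained" initializer type. The parameter_name_overrides argument addresses your additional question as well: it lets you initialize a new head from the weights of a previously trained one.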