I have trained an AllenNLP multi-task model. I would like to keep the encoder/backbone weights and continue training with new heads on new datasets. How can I do that with AllenNLP?
I have two basic ideas for how to do that:
1. I followed this AllenNLP tutorial to load the trained model, and then, instead of just making predictions, I wanted to change the configuration and the model heads to continue training on the new datasets. But I am rather lost as to how to do that.
2. I guess it should be possible to (a) save the state dict of the previously trained encoder to a file and then (b) point to those weights in the configuration file for the new model (instead of pointing to the "bert-base-cased" weights, for example). But looking at the PretrainedTransformerEmbedder class, I don't see how I could pass my own model weights to it. (A rough sketch of this idea follows below.)
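Here is roughly what I have in mind for the second idea, using plain PyTorch rather than any AllenNLP-specific mechanism. This is only a sketch: `_backbone` is a placeholder attribute name (use whatever your multi-task model actually calls its shared encoder), and `new_model` stands for the freshly constructed model for the second run.

```python
import torch
from allennlp.models.archival import load_archive

# Load the previously trained multi-task model from its archive.
trained_model = load_archive("multitask_model.tar.gz").model

# (a) Save only the encoder/backbone weights to a file.
# `_backbone` is a hypothetical attribute name for the shared encoder.
torch.save(trained_model._backbone.state_dict(), "backbone_weights.th")

# (b) In the second run, load those weights into the new model's backbone
# after constructing it. The backbone architecture is unchanged, only the
# heads are new, so a strict load should succeed.
new_model._backbone.load_state_dict(torch.load("backbone_weights.th"))
```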
As an additional question: Is it also possible to save the weights of the heads separately and initialize new heads with those weights?
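Again, just a sketch of what I mean, with placeholder attribute names (`_heads` stands in for however the model stores its heads):

```python
import torch

# Save each head's weights to its own file.
for name, head in trained_model._heads.items():
    torch.save(head.state_dict(), f"head_{name}.th")

# Initialize a new head of the same architecture from the saved weights.
new_head.load_state_dict(torch.load("head_classification.th"))
```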
Any help is appreciated :)
Your second idea is the preferred way, which you can accomplish by using a PretrainedModelInitializer. See the CopyNet model for an example of how to add this to your model.
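A minimal sketch of what that could look like with the Python API; the regex and the weights path are assumptions that depend on your parameter names and AllenNLP version:

```python
from allennlp.nn.initializers import InitializerApplicator, PretrainedModelInitializer

# Map a regex over parameter names to the pretrained-weights initializer.
# "_backbone.*" is a placeholder regex; match whatever your encoder's
# parameters are actually called, and point weights_file_path at the
# serialized weights from your first training run (e.g. best.th).
applicator = InitializerApplicator(
    regexes=[
        ("_backbone.*", PretrainedModelInitializer(weights_file_path="best.th")),
    ]
)

# Apply to the new model after construction: every matching parameter is
# overwritten with the parameter of the same name from the weights file.
applicator(new_model)
```

In a Jsonnet config, the same initializer is available under its registered name, "pretrained", so you can wire this up in the `"initializer"` section of your new model's configuration instead.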