Tags: tensorflow, deep-learning, machine-translation, sequence-to-sequence

Training a trained seq2seq model on additional training data


I have trained a seq2seq model on 1M samples and saved the latest checkpoint. Now I have an additional 50K sentence pairs of training data that were not seen in the previous training data. How can I adapt the current model to this new data without starting the training from scratch?


Solution

  • You do not have to re-initialize the network and train from scratch. You can run an incremental training that starts from your saved checkpoint.

    Training from pre-trained parameters

    Another use case is to use a base model and train it further with new training options (in particular, the optimization method and the learning rate). Using -train_from without -continue will start a new training run with parameters initialized from a pre-trained model.

    Remember to tokenize your 50K corpus the same way you tokenized the previous one (see the tokenization sketch after this answer).

    Also, starting with OpenNMT 0.9, you do not have to use the same vocabulary. See the "Updating the vocabularies" section of the documentation and pass the appropriate value to the -update_vocab option (a command sketch combining these steps follows below).
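
    As a minimal sketch of the tokenization step, assuming the Lua/Torch version of OpenNMT and hypothetical file names (new_train.en / new_train.de): reuse whatever tokenization options you applied to the original 1M corpus so the new data matches the old preprocessing.

        # Tokenize the new 50K corpus with the same options as the original corpus
        th tools/tokenize.lua < new_train.en > new_train.en.tok
        th tools/tokenize.lua < new_train.de > new_train.de.tok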
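
    The new pairs then have to be packaged into an OpenNMT data file before training. This is only a sketch with assumed file names (including the validation files); exact flags may differ between OpenNMT versions.

        # Build a training data package from the new 50K corpus
        th preprocess.lua -train_src new_train.en.tok -train_tgt new_train.de.tok \
                          -valid_src valid.en.tok -valid_tgt valid.de.tok \
                          -save_data new_data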
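
    Finally, a hedged sketch of the incremental training call itself: -train_from points to the existing checkpoint, -continue is omitted so that new training options (for example a smaller learning rate) take effect, and -update_vocab merge (OpenNMT 0.9+) merges words from the new data into the existing vocabulary. The checkpoint and model names are placeholders; check the flag behavior against your OpenNMT version.

        # Continue training from the 1M-sample checkpoint on the new 50K data
        th train.lua -data new_data-train.t7 \
                     -save_model adapted_model \
                     -train_from model_checkpoint.t7 \
                     -update_vocab merge \
                     -learning_rate 0.1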