PyTorch is capable of saving and loading the state of an optimizer; an example is shown in the PyTorch tutorial. I'm currently just saving and loading the model state, but not the optimizer. So what's the point of saving and loading the optimizer state, besides not having to remember the optimizer's parameters such as the learning rate? And what's contained in the optimizer state?
You should save the optimizer state if you want to resume model training later. This matters especially if Adam is your optimizer: Adam is an adaptive learning rate method, which means it computes individual learning rates for each parameter, and the running averages it keeps to do so are stored in the optimizer state.
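If you're curious what that state actually contains, here's a minimal sketch (the tiny linear model and the hyperparameters are just for illustration):

import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# run one step so Adam populates its per-parameter buffers
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

sd = optimizer.state_dict()
print(sd.keys())                    # dict_keys(['state', 'param_groups'])
print(sd['param_groups'][0]['lr'])  # hyperparameters such as the learning rate
print(sd['state'][0].keys())        # per-parameter buffers: 'step', 'exp_avg', 'exp_avg_sq'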
Saving the optimizer state is not required if you only want to use the saved model for inference.
However, it's best practice to save both the model state and the optimizer state. You can also save the loss history and other running metrics if you want to plot them later.
I'd do it like this:
torch.save({
    'epoch': epochs,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'train_loss_history': loss_history,
}, PATH)
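To resume training later, the same checkpoint can be restored roughly like this (assuming model and optimizer have already been constructed the same way as before saving):

checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch']
loss_history = checkpoint['train_loss_history']

model.train()  # switch back to training mode before resuming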
If you are using a PyTorch Lightning module, you can get the optimizer_state_dict and save a checkpoint like this:
torch.save({
    "model_state_dict": model.state_dict(),
    "model_class": VAE,
    "model_args": {"z_dim": model.z_dim},
    "optimizer_state_dict": model.optimizers().optimizer.state_dict(),
}, path)
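Because this checkpoint also stores the model class and its constructor arguments, the model can be rebuilt later without hard-coding its definition. A rough sketch of loading it (assuming VAE accepts z_dim as a keyword argument):

checkpoint = torch.load(path)
model = checkpoint['model_class'](**checkpoint['model_args'])  # e.g. VAE(z_dim=...)
model.load_state_dict(checkpoint['model_state_dict'])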