Tensorflow checkpoint models getting deleted


I am saving a TensorFlow checkpoint every 10 epochs with the following code:

checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
checkpoint_prefix = os.path.join(checkpoint_dir, "model")
...
if current_step % checkpoint_every == 0:
    path = saver.save(sess, checkpoint_prefix, global_step=current_step)
    print("Saved model checkpoint to {}\n".format(path))

The problem is that as new checkpoint files are generated, the previous 5 model files are deleted automatically.


Solution

  • This is the expected behavior: the docs for tf.train.Saver say that by default only the 5 most recent checkpoint files are kept. To adjust that, set max_to_keep to the desired value when constructing the saver, e.g. tf.train.Saver(max_to_keep=20).
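
To make the retention rule concrete, here is a small TensorFlow-free sketch of what max_to_keep does as checkpoints accumulate. The helper simulate_saves and the step values are hypothetical and only mimic the rule described in the docs; they do not call the library. The None/0 branch reflects my reading of the tf.train.Saver docs, which say that with max_to_keep set to None or 0 no checkpoints are deleted from the filesystem.

```python
from collections import deque


def simulate_saves(steps, max_to_keep=5):
    """Return the checkpoint names left on disk after saving at each step.

    Hypothetical helper: it imitates the "keep the N most recent" rule only.
    max_to_keep=None or 0 is treated as "never delete anything".
    """
    kept = deque()
    for step in steps:
        # saver.save(..., global_step=step) names files "<prefix>-<step>"
        kept.append("model-{}".format(step))
        if max_to_keep and len(kept) > max_to_keep:
            kept.popleft()  # the oldest checkpoint is removed
    return list(kept)


steps = range(10, 81, 10)  # saves at steps 10, 20, ..., 80

# Default behaviour: only the 5 most recent checkpoints survive.
print(simulate_saves(steps))

# Retention disabled: all 8 checkpoints survive.
print(simulate_saves(steps, max_to_keep=None))
```

In the real training script the only change is the max_to_keep argument on the line that constructs the saver, which is not shown in the question.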