TensorFlow train.Supervisor - save checkpoint upon training stop?


In TensorFlow 1.0, tf.train.Supervisor saves checkpoints at intervals of save_model_secs. Is there any way to save a checkpoint at the termination of training, rather than periodically during training?


Solution

  • tf.train.Supervisor writes a checkpoint when the training loop finishes. To avoid writing other checkpoints, set save_model_secs to a value much longer than the training run. Here is an example that saves only a single, final checkpoint:

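    import tensorflow as tf

    # A counter variable and an op that increments it on every run.
    y = tf.Variable(0)
    increment_y = tf.assign_add(y, 1)

    # A very large save_model_secs keeps the periodic saves from recurring during the run.
    sv = tf.train.Supervisor(logdir='/tmp', save_model_secs=100000000)

    with sv.managed_session() as sess:
        for step in range(10):
            # Leave the loop early if the Supervisor has requested a stop.
            if sv.should_stop():
                break

            print(sess.run(increment_y))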
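
If you would rather not rely on the timer at all, another option is to turn periodic saving off with save_model_secs=0 and call the Supervisor's own saver once the loop is done. The following is only a minimal sketch, not the canonical way to do it. The helper name train_and_save and the variable name "counter" are made up for illustration, and the tensorflow.compat.v1 import is an assumption so that the snippet can also run on newer TensorFlow releases (on 1.x the plain `import tensorflow as tf` from the example above should be enough).

```python
import tempfile

# Assumption: the compat.v1 shim, so the TF 1.x Supervisor API is also
# reachable on TensorFlow 2.x installs.
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()


def train_and_save(logdir, num_steps=10):
    """Hypothetical helper: run num_steps, then write one final checkpoint."""
    with tf.Graph().as_default():
        counter = tf.Variable(0, name="counter")
        increment = tf.assign_add(counter, 1)

        # save_model_secs=0 disables the Supervisor's periodic checkpoints.
        sv = tf.train.Supervisor(logdir=logdir, save_model_secs=0)

        with sv.managed_session() as sess:
            for step in range(num_steps):
                if sv.should_stop():
                    break
                sess.run(increment)

            # Explicit final save, using the Saver and the checkpoint path
            # that the Supervisor set up for this logdir.
            return sv.saver.save(sess, sv.save_path)


if __name__ == "__main__":
    # A fresh directory, so no older checkpoint is restored at startup.
    print(train_and_save(tempfile.mkdtemp()))
```

Note that the Supervisor restores from the latest checkpoint it finds in logdir, so pointing it at a shared directory such as /tmp can pick up state from an earlier run.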