Tags: python, tensorflow, deep-learning, learning-rate

Properly set up exponential decay of learning rate in tensorflow


I need to apply exponential decay of the learning rate every 10 epochs. The initial learning rate is 0.000001, and the decay factor is 0.95.

Is this the proper way to set it up?

import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=0.000001,
        decay_steps=(my_steps_per_epoch * 10),
        decay_rate=0.05)
opt = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)

The formula for exponential decay is current_lr = initial_lr * (1 - decay_factor)^t, except that in the code it is implemented as:

decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)

To my knowledge, decay_rate should be 1 - decay_factor, and decay_steps should mean how many steps are performed before the decay is applied (in my case, my_steps_per_epoch * 10). Is that correct?
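
For what it's worth, the schedule object can be called directly to check this against the documented formula (a quick sketch, assuming TF 2.x in eager mode; the step values are only examples):

import tensorflow as tf

schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=0.000001,
        decay_steps=1000,   # stand-in for my_steps_per_epoch * 10
        decay_rate=0.95)

# lr(step) = initial_lr * decay_rate ** (step / decay_steps)
print(schedule(0).numpy())     # 1e-06
print(schedule(1000).numpy())  # 9.5e-07, i.e. one full decay applied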

EDIT:

If I pause and save my model (using callbacks) after the 10th epoch, and then resume by loading the model and calling model.fit with initial_epoch=10 and epochs=11, will it start at the 11th epoch and apply the exponential decay?
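
One way to check this is to inspect the optimizer's step counter after loading, since ExponentialDecay is evaluated at optimizer.iterations rather than at anything derived from initial_epoch (a sketch; the checkpoint name and data variables are hypothetical, and it assumes the full model, including optimizer state, was saved):

import tensorflow as tf

# Hypothetical checkpoint saved after epoch 10 with the optimizer state included.
model = tf.keras.models.load_model("model_epoch10.h5")

# The restored step counter is what the schedule sees, so the decay resumes
# from where it left off; initial_epoch only affects epoch numbering.
print(int(model.optimizer.iterations))

model.fit(x_train, y_train, initial_epoch=10, epochs=11)  # x_train/y_train: your data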


Solution

  • decay_steps states after how many steps (processed batches) the learning rate is decayed. I find it quite useful to specify just the initial and the final learning rate and compute the decay factor automatically: with one decay per epoch, initial_learning_rate * factor**epochs == final_learning_rate, so factor = (final_learning_rate / initial_learning_rate)**(1/epochs), as in the following:

    import tensorflow as tf

    epochs = 100         # total number of training epochs (example value)
    train_size = 50000   # number of training samples (example value)
    batch_size = 128     # example value

    initial_learning_rate = 0.1
    final_learning_rate = 0.0001
    # One decay per epoch: initial_learning_rate * factor**epochs == final_learning_rate
    learning_rate_decay_factor = (final_learning_rate / initial_learning_rate)**(1/epochs)
    steps_per_epoch = int(train_size / batch_size)

    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
                    initial_learning_rate=initial_learning_rate,
                    decay_steps=steps_per_epoch,
                    decay_rate=learning_rate_decay_factor,
                    staircase=True)  # decay in discrete jumps, once per epoch
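
    As a quick sanity check (continuing the snippet above), the schedule can be called directly. With staircase=True the rate drops once per epoch, and after the last decay it lands on final_learning_rate:

    print(lr_schedule(0).numpy())                         # 0.1 (initial rate)
    print(lr_schedule(steps_per_epoch * epochs).numpy())  # ~0.0001 (final rate)

    # The schedule then plugs into an optimizer exactly as in the question:
    opt = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)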