Tags: tensorflow2.0, loss-function, loss

Question about Tensorflow Model Reproducibility


I am currently working on a TensorFlow model and ran into an issue with its reproducibility.

I built a simple Dense model, initialized it with a constant value, and trained it on dummy data.

import numpy as np
import tensorflow as tf

# All weights and biases start from the same constant value
weight_init = tf.keras.initializers.Constant(value=0.001)

inputs = tf.keras.Input(shape=(5,))
layer1 = tf.keras.layers.Dense(5,
                               activation=tf.nn.leaky_relu,
                               kernel_initializer=weight_init,
                               bias_initializer=weight_init)
outputs = layer1(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs, name="test")

model.compile(loss='mse', optimizer='Adam')

# Dummy training data
x = np.array([[111, 1.02, -1.98231, 1, 1],
              [112, 1.02, -1.98231, 1, 1],
              [113, 1.02, -1.98231, 1, 1],
              [114, 1.02, -1.98231, 1, 1],
              [115, 1.02, -1.98231, 1, 1],
              [116, 1.02, -1.98231, 1, 1],
              [117, 1.02, -1.98231, 1, 1]])
y = np.array([1, 1, 1, 1, 1, 1, 2])

model.fit(x, y, epochs=3, batch_size=1)

Even though I set the initial value of the model's weights to 0.001, the training loss changes with every attempt.

What am I missing here? Are there any additional values I need to fix to a constant?

What is more surprising is that if I change batch_size to 16, the loss no longer changes between attempts.

Please.. teach me guys...


Solution

  • Since keras.Model.fit() has the default kwarg shuffle=True, the data is shuffled across batches. If you set batch_size to any integer larger than the length of the data, shuffling has no effect, because only one batch is left.

    So, adding shuffle=False to model.fit() will achieve reproducibility here.
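
    For example, a minimal sketch of the same fit call with shuffling disabled (using the training inputs and targets, here called x and y as in the code above):

    # With shuffle=False the samples are fed in the same order every run,
    # so the sequence of gradient updates is identical across attempts.
    model.fit(x, y, epochs=3, batch_size=1, shuffle=False)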

    Additionally, as your model grows bigger, a more fundamental reproducibility problem will arise: there will be slight differences between the results of two successive runs, even though you use no randomness or shuffling and simply click run, then click run again. We describe this aspect of reproducibility as determinism.

    Determinism is a good topic that is easily ignored by many users. Let's start with the conclusion: reproducibility is influenced by:

    • random seed (operation_seed+hidden_global_seed)
    • operation determinism

    How do you do it? The TensorFlow determinism documentation states it precisely: add the following code before building or restoring the model.

    tf.keras.utils.set_random_seed(some_seed)       # seeds Python's, NumPy's and TensorFlow's RNGs
    tf.config.experimental.enable_op_determinism()  # forces ops to use deterministic implementations
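
    Under the hood, tf.keras.utils.set_random_seed(some_seed) is a convenience wrapper; roughly speaking, it is equivalent to seeding the three RNG sources yourself (a sketch, not the exact implementation):

    import random
    import numpy as np
    import tensorflow as tf

    random.seed(some_seed)         # Python's built-in RNG
    np.random.seed(some_seed)      # NumPy RNG
    tf.random.set_seed(some_seed)  # TensorFlow's global seed (op-level seeds are derived from it)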
    

    But you should enable it only if you rely heavily on reproducibility, since tf.config.experimental.enable_op_determinism() will slow computation down significantly. The deeper reason is that hardware gives up some accuracy in order to speed up calculation, which usually does not affect our algorithms. In deep learning, though, models are very large, so small errors occur easily, and training cycles are long, so those errors accumulate. If you have a regression model in which any extra error is unacceptable, you need a deterministic algorithm in that situation.
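
    Putting it all together, here is a sketch of a fully reproducible version of the training script from the question (the model and dummy data are copied from the question; the seed value 42 is an arbitrary choice):

    import numpy as np
    import tensorflow as tf

    # Seed everything and force deterministic ops *before* building the model
    tf.keras.utils.set_random_seed(42)
    tf.config.experimental.enable_op_determinism()

    weight_init = tf.keras.initializers.Constant(value=0.001)
    inputs = tf.keras.Input(shape=(5,))
    outputs = tf.keras.layers.Dense(5,
                                    activation=tf.nn.leaky_relu,
                                    kernel_initializer=weight_init,
                                    bias_initializer=weight_init)(inputs)
    model = tf.keras.Model(inputs=inputs, outputs=outputs, name="test")
    model.compile(loss='mse', optimizer='Adam')

    # Same dummy arrays as in the question
    x = np.array([[111 + i, 1.02, -1.98231, 1, 1] for i in range(7)])
    y = np.array([1, 1, 1, 1, 1, 1, 2])

    # shuffle=False removes the last source of run-to-run variation
    model.fit(x, y, epochs=3, batch_size=1, shuffle=False)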