python tensorflow machine-learning keras neural-network

Ensuring Deterministic Outputs in Neural Network Training


I am new to neural networks and currently working with TensorFlow. For an experiment, I would like to build a model that consistently produces the same output for identical inputs. However, my initial attempt using a trivial test and setting the batch_size equal to the size of the training data did not achieve this goal:

from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(1)])
model.compile(loss="MSE", metrics=[keras.metrics.BinaryAccuracy()])
model.fit(
  training_inputs,
  training_targets,
  epochs=5,
  batch_size=1000,
  validation_data=(val_inputs, val_targets)
)

I suspect that the default optimizer, SGD (Stochastic Gradient Descent), might be causing random outputs.

My questions are:

  • Are there any other factors in the code above, besides the default optimizer (SGD), that can introduce randomness in the output of this neural network model?
  • How can I modify the provided code to ensure that the model produces the same output for the same input?

Thank you for your assistance.


Solution

  • Are there any other factors in the code above, besides the default optimizer (SGD), that can introduce randomness in the output of this neural network model?

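    The optimizer is unlikely to be the culprit. When compile is called without an optimizer argument, Keras defaults to RMSprop rather than SGD, and in either case the update rule itself is deterministic given the same weights and gradients. The randomness comes from two other places: layer weights are initialised with random values, and model.fit shuffles the training data before each epoch by default (shuffle=True).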
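To make those two sources concrete, here is a minimal, framework-free sketch (plain Python, not Keras). The tiny data set, the function name train and the learning rate are all made up for illustration; the point is only that a random initial weight and a random sample order both feed into the final result, and that fixing the seed fixes both.

```python
import random

def train(seed=None, shuffle=True, epochs=5, lr=0.05):
    """Fit y = w * x with per-sample gradient descent and return the final weight.

    Hypothetical toy example: seed=None lets Python pick a fresh seed each run.
    """
    rng = random.Random(seed)
    # Made-up, slightly noisy samples of y = 2x.
    data = [(0.5, 1.1), (1.0, 1.9), (1.5, 3.2), (2.0, 3.9)]
    w = rng.uniform(-1.0, 1.0)            # source 1: random weight initialisation
    for _ in range(epochs):
        if shuffle:
            rng.shuffle(data)             # source 2: random sample order per epoch
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # gradient of the squared error w.r.t. w
            w -= lr * grad
    return w

same_a = train(seed=0)
same_b = train(seed=0)
print(same_a == same_b)    # identical seed, identical result
print(train(), train())    # unseeded runs will usually differ
```

In Keras the same two knobs exist as the layer's kernel_initializer and the shuffle argument of model.fit.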

  • How can I modify the provided code to ensure that the model produces the same output for the same input?

    Set the random seed for various random number generators at the start of the script:

    import numpy as np
    import random
    import tensorflow as tf
    
    # Set seeds for consistent results on each run
    seed_value = 0
    np.random.seed(seed_value)
    random.seed(seed_value)
    tf.random.set_seed(seed_value)
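
As a rough end-to-end check, the sketch below trains the same small model twice and compares the predictions. It assumes a reasonably recent TensorFlow 2 release, where tf.keras.utils.set_random_seed (seeds Python, NumPy and TensorFlow in one call) and tf.config.experimental.enable_op_determinism are available. The synthetic data and the helper name train_once are made up for illustration and are not part of the original question.

```python
import numpy as np
import tensorflow as tf

# Ask TensorFlow to use deterministic op implementations where they exist.
tf.config.experimental.enable_op_determinism()

def train_once(seed=0):
    """Hypothetical helper: seed everything, train a tiny model, return predictions."""
    tf.keras.utils.set_random_seed(seed)   # seeds random, numpy and tensorflow

    # Synthetic stand-in for training_inputs / training_targets (an assumption).
    rng = np.random.default_rng(123)
    x = rng.normal(size=(1000, 4)).astype("float32")
    y = (x.sum(axis=1, keepdims=True) > 0).astype("float32")

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
    model.fit(x, y, epochs=5, batch_size=1000, verbose=0)
    return model.predict(x[:5], verbose=0)

first = train_once()
second = train_once()
print(np.allclose(first, second))
```

If the two runs still disagree, the usual suspects are a seed that is set after some random state has already been consumed, or hardware-specific kernels that have no deterministic variant.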
    

    If you're working in a notebook, be aware that seeding once at the top is not enough when you re-run individual cells. Each re-run of a cell that draws random numbers continues from wherever the generator currently is, so it gives a different result every time. Set the seed inside that cell as well, so that every run of the cell starts from the same generator state.
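The notebook behaviour can be imitated with Python's standard random module alone. In this small sketch the function run_cell is a hypothetical stand-in for one execution of a cell; the same reasoning applies to the NumPy and TensorFlow generators.

```python
import random

def run_cell(reseed):
    """Stand-in for executing one notebook cell that draws random numbers."""
    if reseed:
        random.seed(0)    # reset the generator at the top of the cell
    return [random.random() for _ in range(3)]

random.seed(0)            # seeding once, at the top of the notebook

first = run_cell(reseed=False)
second = run_cell(reseed=False)   # generator has advanced since the first run
print(first == second)            # False

third = run_cell(reseed=True)
fourth = run_cell(reseed=True)    # generator reset before each run
print(third == fourth)            # True
```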

    It may also be worth reading this answer for notes on setting the PYTHONHASHSEED environment variable. I haven't needed to configure it myself; your use case may or may not require it.