python numpy mnist

Same data split into training, dev and test set


I am struggling to get the same split of data on every call of the following function:

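# data(filename) is assumed to be the loader defined elsewhere, returning a NumPy array;
# load_split is a placeholder name for the function that does the split.
def load_split(devel_size):

    X_train = data('train-images.gz')
    Y_train = data('train-labels.gz')
    X_test = data('t10k-images.gz')
    Y_test = data('t10k-labels.gz')

    # Hold out the last devel_size samples (columns of X, entries of Y) as the dev set.
    X_train, X_devel = X_train[:, :-devel_size], X_train[:, -devel_size:]
    Y_train, Y_devel = Y_train[:-devel_size], Y_train[-devel_size:]

    return X_train, Y_train, X_devel, Y_devel, X_test, Y_test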

How can I get the same training/validation split every time I call the function above?

The reason is that I want to re-run the function with several optimization techniques and compare their accuracy.


Solution

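  • Set the random seeds before loading the data or building the model, so that any shuffling and weight initialization repeat from run to run. The slicing in the function is already deterministic; it always holds out the last devel_size samples.

    import numpy as np
    import tensorflow as tf

    tf.random.set_seed(1)
    np.random.seed(1)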
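
If you also want the dev set to be a random subset rather than the last samples, one option is to shuffle with a locally seeded generator before slicing. This is only a sketch under assumptions: split_train_devel and its seed parameter are hypothetical names, the toy arrays stand in for MNIST, and samples are assumed to be stored as columns of X, as in the question.

```python
import numpy as np

def split_train_devel(X, Y, devel_size, seed=1):
    """Shuffle samples with a fixed seed, then hold out devel_size of them.

    Hypothetical helper: assumes X has shape (features, samples) and Y has shape (samples,).
    """
    rng = np.random.default_rng(seed)      # local generator, independent of global state
    perm = rng.permutation(X.shape[1])     # same seed gives the same permutation
    X, Y = X[:, perm], Y[perm]
    return X[:, :-devel_size], Y[:-devel_size], X[:, -devel_size:], Y[-devel_size:]

# Toy stand-in for MNIST: 4 features, 10 samples
X = np.arange(40).reshape(4, 10)
Y = np.arange(10)

first = split_train_devel(X, Y, devel_size=3)
second = split_train_devel(X, Y, devel_size=3)

# Both calls should return identical arrays
same = all(np.array_equal(a, b) for a, b in zip(first, second))
print(same)
```

Because the generator is created inside the function, the split does not depend on how many random numbers were drawn elsewhere in the program, which keeps the comparison between optimizers on equal footing.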