Previously, I mentioned that there aren't any options available to define a seed value in BrainScript for CNTK sequential machine learning models [1]. Hence I migrated my code to the CNTK Python API, which gives more fine-grained control over the seed values of sequential machine learning models. Below are the places where I use random initialization in my implementation (with the corresponding seed values set as well):
# CNTK imports
import numpy as np
import pandas as pd
import random
import math as m
from cntk.device import *
from cntk import Trainer
from cntk.layers import *
from cntk.initializer import glorot_uniform
from cntk.learners import momentum_sgd, learning_rate_schedule, momentum_as_time_constant_schedule, UnitType
import cntk
import cntk.ops as o
import cntk.layers as l

# Defining the random seeds for numpy and the Python random module
np.random.seed(8888)
random.seed(8888)
# Defining input and output training vectors
input_array_df = np.asarray(input_split_df[1:len(input_split_df)], dtype=np.float32)
output_array_df = np.asarray(output_df_df[1:len(output_df_df)], dtype=np.float32)
tup = (input_array_df, output_array_df)
listOfTuplesOfInputsLabels.append(tup)

# Shuffling the input vector (uses the Python random module seeded above)
random.shuffle(listOfTuplesOfInputsLabels)
# Defining the sequential model
num_minibatches = len(features) // minibatch_size
epoch_size = len(features) * 1
feature = o.input_variable((input_dim), np.float32)
label = o.input_variable((output_dim), np.float32)
netout = Sequential([
    For(range(1), lambda i: Recurrence(
        LSTM(lstm_cell_dimension, use_peepholes=LSTM_USE_PEEPHOLES, init=glorot_uniform(seed=8888)))),
    Dense(output_dim, bias=BIAS, init=glorot_uniform(seed=8888))
])(feature)
learner = momentum_sgd(netout.parameters,
                       lr=learning_rate_schedule([(4, 0.003), (16, 0.002)], unit=UnitType.sample, epoch_size=epoch_size),
                       momentum=momentum_as_time_constant_schedule(minibatch_size / -m.log(0.9)),
                       gaussian_noise_injection_std_dev=gaussian_noise,
                       l2_regularization_weight=l2_regularization_weight)
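For completeness, the trainer used further below is created from this network output and learner roughly as follows (the loss and error metric shown here are illustrative placeholders, not necessarily the exact ones in my full code):

# Illustrative sketch: building a loss/metric pair and the trainer from the learner above
loss = o.squared_error(netout, label)
error = o.squared_error(netout, label)
trainer = Trainer(netout, (loss, error), [learner])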
# Splitting into mini-batches
tf = np.array_split(features, num_minibatches)
tl = np.array_split(labels, num_minibatches)
# Training: one step of the minibatch loop (sketched in full below)
features = np.ascontiguousarray(tf[i % num_minibatches])
labels = np.ascontiguousarray(tl[i % num_minibatches])
trainer.train_minibatch({feature: features, label: labels})
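The training step above runs inside a loop over mini-batches, along these lines (num_passes is a placeholder for however many passes over the data are made):

# Illustrative outer loop over mini-batches; num_passes is a placeholder
for i in range(num_minibatches * num_passes):
    features = np.ascontiguousarray(tf[i % num_minibatches])
    labels = np.ascontiguousarray(tl[i % num_minibatches])
    trainer.train_minibatch({feature: features, label: labels})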
Unfortunately, even though I was able to successfully define the seed values in my code, I still observe some small variations in my final results. Is this because of floating-point calculations, or can you find anything in my code where I should have set a seed value but haven't?

Thanks!
[1] Defining a seed value in Branscripts for CNTK sequential machine learning models
Can you try the following:
from _cntk_py import set_fixed_random_seed, force_deterministic_algorithms
set_fixed_random_seed(1)
force_deterministic_algorithms()
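These calls work best when placed at the very top of the script, before the network, readers, or any random data processing is set up, e.g.:

# At the very top of the training script, before any other CNTK setup
from _cntk_py import set_fixed_random_seed, force_deterministic_algorithms
set_fixed_random_seed(1)          # fixes CNTK's internal random seed
force_deterministic_algorithms()  # restricts CNTK to deterministic algorithms where available

Note that even with these set, small run-to-run differences can remain if any non-deterministic floating-point reductions (e.g. on the GPU) are still involved.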