I am building more than one DNN model based on TensorFlow's skflow library. I partition my data into minibatches and use partial_fit for fitting. After every cycle of partial_fit, I would like to copy the weights of the first n hidden layers of one TensorFlowDNNClassifier model to another TensorFlowDNNClassifier model, and then continue learning/copying using partial_fit. (The topology of the first n hidden layers of both models is identical.)
I know how to retrieve weights from classifier1:
classifier.get_tensor_value('dnn/layer0/Linear/Matrix:0')
But I don't know how to copy their values into classifier2!
The use case:
I am trying to build an ensemble of M DNN models based on skflow's TensorFlowDNNClassifier/TensorFlowDNNRegressor. I would like these M models to share their first n layers: the same inputs, architecture, and values. I wanted to do this with minimal changes to the original skflow code. To do this, I thought of dividing my data into minibatches and training the models one minibatch at a time. During each step (using one minibatch), I apply partial_fit on one model and copy the weights of its first n hidden layers to the next model in the ensemble. Then I partial_fit the second model using the same minibatch and copy the new values of the weights to the next model. I repeat this training/copying until I reach the last model in the ensemble. After training the Mth model, I copy the weights of its first n hidden layers to all the previous (M-1) models. I then repeat this with the next minibatch until the weights of all M models converge.
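To make the scheme concrete, here is a minimal sketch of that loop. The names models (a list of M estimators), minibatches, and n_shared are assumptions for illustration; get_layer_params/set_layer_params are the helper functions introduced in the EDIT below:

def copy_shared_layers(src, dst, n_shared):
    # copy the weights and biases of the first n_shared layers from src to dst
    for layer in range(n_shared):
        dst.set_layer_params(layer, src.get_layer_params(layer))

for X_batch, y_batch in minibatches:
    # train each model on the minibatch, handing its shared layers to the next
    for i in range(M):
        models[i].partial_fit(X_batch, y_batch)
        if i < M - 1:
            copy_shared_layers(models[i], models[i + 1], n_shared)
    # after the Mth model, sync its shared layers back to all the others
    for i in range(M - 1):
        copy_shared_layers(models[M - 1], models[i], n_shared)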
EDIT: Thanks to @Ismael and @ilblackdragon (via another forum) for their valuable input. Their suggested solutions work best at model creation time. However, I had to add extra functions to TensorFlowEstimator so that I can easily copy weights from one model to another as I train (over multiple steps of training on minibatches). I added the following functions to the class TensorFlowEstimator (defined in estimators/base.py):
def extract_num_hidden_layers(self, graph_ops):
    # Count the hidden layers by probing for 'dnn/layer<i>/Linear/Matrix'
    # ops until one is missing.
    nhl = 0
    are_there_more_layers = True
    while are_there_more_layers:
        are_there_more_layers = False
        layer_name = 'dnn/layer' + str(nhl) + '/Linear/Matrix'
        for op in graph_ops:
            if op.name == layer_name:
                nhl += 1
                are_there_more_layers = True
                break
    return nhl

def create_updaters(self):
    # Build one tf.assign op per layer so weights/biases can later be
    # overwritten by feeding new values through the self.nValues placeholder.
    self.weight_updaters = []
    self.bias_updaters = []
    for h in range(self.num_hidden_layers):
        with tf.variable_scope('', reuse=True):
            w_name = 'dnn/layer' + str(h) + '/Linear/Matrix'
            w_up_op = tf.assign(tf.get_variable(w_name), self.nValues)
            self.weight_updaters.append(w_up_op)
            b_name = 'dnn/layer' + str(h) + '/Linear/Bias'
            b_up_op = tf.assign(tf.get_variable(b_name), self.nValues)
            self.bias_updaters.append(b_up_op)

def get_layer_weights(self, layer_num):
    layer_name = 'dnn/layer' + str(layer_num) + '/Linear/Matrix:0'
    return self.get_tensor_value(layer_name)

def get_layer_biases(self, layer_num):
    layer_name = 'dnn/layer' + str(layer_num) + '/Linear/Bias:0'
    return self.get_tensor_value(layer_name)

def get_layer_params(self, layer_num):
    return [self.get_layer_weights(layer_num), self.get_layer_biases(layer_num)]

def set_layer_weights(self, layer_num, weights_values):
    self._session.run(self.weight_updaters[layer_num],
                      feed_dict={self.nValues: weights_values})

def set_layer_biases(self, layer_num, biases_values):
    self._session.run(self.bias_updaters[layer_num],
                      feed_dict={self.nValues: biases_values})

def set_layer_params(self, layer_num, params_values):
    self.set_layer_weights(layer_num, params_values[0])
    self.set_layer_biases(layer_num, params_values[1])
I then added the following lines to the function _setup_training, right after the model's graph is created with self.model_fn(self._inp, self._out):
graph_ops = self._graph.get_operations()
self.num_hidden_layers = self.extract_num_hidden_layers(graph_ops)
self.nValues = tf.placeholder(tf.float32)
# builds self.weight_updaters and self.bias_updaters
self.create_updaters()
And here is how to use the getter and setter functions:
from sklearn import datasets
import skflow

iris = datasets.load_iris()
classifier = skflow.TensorFlowDNNClassifier(hidden_units=[10, 5, 4],
                                            n_classes=3,
                                            continue_training=True)
classifier.fit(iris.data, iris.target)

l1b = classifier.get_layer_biases(1)
l1b[3] = 2  # manually change a value for the demo
classifier.set_layer_biases(1, l1b)
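With these helpers in place, copying the shared layers from one model to another (the original goal) is straightforward. A sketch, where classifier2 is an assumed second model with the same topology; it is fit once first so that its graph and updater ops exist:

classifier2 = skflow.TensorFlowDNNClassifier(hidden_units=[10, 5, 4],
                                             n_classes=3,
                                             continue_training=True)
classifier2.partial_fit(iris.data, iris.target)  # builds the graph and updaters

# copy the first two hidden layers from classifier to classifier2
for layer in range(2):
    classifier2.set_layer_params(layer, classifier.get_layer_params(layer))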
You should use TensorFlowEstimator, in which you can define your own custom models; basically, you can insert any TensorFlow code into a custom model.
So if you know how to retrieve the weights, you can use tf.Variable and pass the weights to a new DNN as its initial value, since "tf.Variable could have a Tensor or Python object convertible to a Tensor as initial value". So I am thinking the transfer of weights should look something like this:
weights_i = classifier_i.get_tensor_value('dnn/layer0/Linear/Matrix:0')

def my_model_i_plus_1(X, y):
    # initialize the first layer with classifier_i's weights
    W = tf.Variable(weights_i)
    b = tf.Variable(tf.zeros([weights_i.shape[1]]))  # one bias per output unit
    layer = tf.nn.relu(tf.matmul(X, W) + b)
    return skflow.models.logistic_regression(layer, y)

classifier_i_plus_1 = skflow.TensorFlowEstimator(model_fn=my_model_i_plus_1,
                                                 n_classes=3,
                                                 optimizer="SGD")