I'm not used to TensorFlow or neural networks, so forgive me if I don't understand what you say the first time.
I'm currently trying to add a batch normalization layer after every convolutional layer in YOLO v1 code that I found on the Internet.
The code below is the batch normalization function that I use:
def batchnorm(self, inp):
    with tf.variable_scope("batchnorm"):
        channels = inp.get_shape()[3]
        # Learnable per-channel shift (beta) and scale (gamma)
        offset = tf.get_variable("offset",
                                 channels,
                                 dtype=tf.float32,
                                 initializer=tf.zeros_initializer())
        scale = tf.get_variable("scale",
                                channels,
                                dtype=tf.float32,
                                initializer=tf.random_normal_initializer(1.0, 0.02))
        # Statistics over the batch, height and width axes. Note that no
        # moving averages are kept, so inference also uses batch statistics.
        mean, variance = tf.nn.moments(inp, axes=[0, 1, 2], keep_dims=False)
        variance_epsilon = 1e-5
        normalized = tf.nn.batch_normalization(inp, mean, variance,
                                               offset, scale, variance_epsilon)
        return normalized
The code below is the structure of the YOLO v1 network that I'm using:
if self.verbose:
    print('Building Yolo Graph....')
# Reset default graph
tf.reset_default_graph()
# Input placeholder
self.x = tf.placeholder('float32', [None, 448, 448, 3])
self.label_batch = tf.placeholder('float32', [None, 73])
# conv1, pool1
self.conv1 = self.conv_layer(1, self.x, 64, 7, 2)
self.pool1 = self.maxpool_layer(2, self.conv1, 2, 2)
# size reduced to 64x112x112
# conv2, pool2
self.conv2 = self.conv_layer(3, self.pool1, 192, 3, 1)
self.pool2 = self.maxpool_layer(4, self.conv2, 2, 2)
# size reduced to 192x56x56
# conv3, conv4, conv5, conv6, pool3
self.conv3 = self.conv_layer(5, self.pool2, 128, 1, 1)
self.conv4 = self.conv_layer(6, self.conv3, 256, 3, 1)
self.conv5 = self.conv_layer(7, self.conv4, 256, 1, 1)
self.conv6 = self.conv_layer(8, self.conv5, 512, 3, 1)
self.pool3 = self.maxpool_layer(9, self.conv6, 2, 2)
# size reduced to 512x28x28
# conv7 - conv16, pool4
self.conv7 = self.conv_layer(10, self.pool3, 256, 1, 1)
self.conv8 = self.conv_layer(11, self.conv7, 512, 3, 1)
self.conv9 = self.conv_layer(12, self.conv8, 256, 1, 1)
self.conv10 = self.conv_layer(13, self.conv9, 512, 3, 1)
self.conv11 = self.conv_layer(14, self.conv10, 256, 1, 1)
self.conv12 = self.conv_layer(15, self.conv11, 512, 3, 1)
self.conv13 = self.conv_layer(16, self.conv12, 256, 1, 1)
self.conv14 = self.conv_layer(17, self.conv13, 512, 3, 1)
self.conv15 = self.conv_layer(18, self.conv14, 512, 1, 1)
self.conv16 = self.conv_layer(19, self.conv15, 1024, 3, 1)
self.pool4 = self.maxpool_layer(20, self.conv16, 2, 2)
# size reduced to 1024x14x14
# conv17 - conv24
self.conv17 = self.conv_layer(21, self.pool4, 512, 1, 1)
self.conv18 = self.conv_layer(22, self.conv17, 1024, 3, 1)
self.conv19 = self.conv_layer(23, self.conv18, 512, 1, 1)
self.conv20 = self.conv_layer(24, self.conv19, 1024, 3, 1)
self.conv21 = self.conv_layer(25, self.conv20, 1024, 3, 1)
self.conv22 = self.conv_layer(26, self.conv21, 1024, 3, 2)
self.conv23 = self.conv_layer(27, self.conv22, 1024, 3, 1)
self.conv24 = self.conv_layer(28, self.conv23, 1024, 3, 1)
# size reduced to 1024x7x7
# fc1, fc2, fc3
self.fc1 = self.fc_layer(29, self.conv24, 512,
flatten=True, linear=False)
self.fc2 = self.fc_layer(
30, self.fc1, 4096, flatten=False, linear=False)
self.fc3 = self.fc_layer(
31, self.fc2, 1470, flatten=False, linear=True)
varlist = self.print_tensors_in_checkpoint_file(file_name=self.weightFile, all_tensors=True, tensor_name=None)
variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
self.saver = tf.train.Saver(variables[:len(varlist)])
self.loss = self.calculate_loss_function(self.fc3, self.label_batch)
self.sess = tf.Session()
self.saver.restore(self.sess, self.weightFile)
self.only_restore_conv20 = False
if self.only_restore_conv20:
    after_20_initializer = [var.initializer for var in tf.global_variables()[3:]]
    self.sess.run(after_20_initializer)
#exerpath = 'C:/Users/dml/PycharmProjects/YOLOv1-master/exer_ckpt/exer.ckpt'
self.training = tf.train.MomentumOptimizer(momentum=0.5, learning_rate=1e-4).minimize(self.loss)
Momentum_initializers = [var.initializer for var in tf.global_variables() if 'Momentum' in var.name]
self.sess.run(Momentum_initializers)
Finally, I put a batchnorm layer right after the conv1 layer:
self.conv1 = self.conv_layer(1, self.x, 64, 7, 2)
self.bn1 = self.batchnorm(self.conv1)
self.pool1 = self.maxpool_layer(2, self.bn1, 2, 2)
The error I get is:
NotFoundError: Key batchnorm/offset not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
After a few days of struggle, I found out that it's related to restoring the weights from the checkpoint file: my batchnorm variables are not in the checkpoint. But I can't figure out how to make my code work.
You are right: the issue is that when you load a checkpoint, TensorFlow by default wants to restore the values of all variables in the graph, and it raises an error if some variable is not found in the checkpoint file.
I guess your checkpoint file does not contain the variables of your new normalization layers. If so, this checkpoint is probably of little use anyway: the pre-trained variable values will likely give pretty bad results when used in a new network structure (with a normalization layer after each conv layer).
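You can verify this yourself: tf.train.NewCheckpointReader lists every variable stored in a checkpoint. A minimal sketch, assuming TF 1.x and that self.weightFile from your code is the checkpoint path:

reader = tf.train.NewCheckpointReader(self.weightFile)
# Print every variable name and shape stored in the checkpoint
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print(name, shape)
# 'batchnorm/offset' and 'batchnorm/scale' will be missing from this list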
If you still want to try using the pre-trained weights from the checkpoint file, you will need to load the variable values from the checkpoint yourself. Assuming the variable names and shapes did not change, you should be able to use a version of the optimistic_restore function from this gist, which shows an example of adding a layer after creating a checkpoint - your exact case.
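For reference, here is a minimal sketch of that idea (not the gist verbatim), assuming TF 1.x and that the graph has already been built: restore only the variables whose name and shape also exist in the checkpoint, and initialize the rest (your new batchnorm variables) from scratch.

import tensorflow as tf

def optimistic_restore(session, ckpt_path):
    reader = tf.train.NewCheckpointReader(ckpt_path)
    saved_shapes = reader.get_variable_to_shape_map()
    # Restore only variables whose name (minus the ':0' suffix) and
    # shape match an entry in the checkpoint
    restore_vars = [v for v in tf.global_variables()
                    if v.name.split(':')[0] in saved_shapes
                    and v.get_shape().as_list() == saved_shapes[v.name.split(':')[0]]]
    tf.train.Saver(restore_vars).restore(session, ckpt_path)
    # Initialize everything the checkpoint did not cover, e.g. the new
    # batchnorm/offset and batchnorm/scale variables
    restored_names = set(v.name for v in restore_vars)
    new_vars = [v for v in tf.global_variables() if v.name not in restored_names]
    session.run(tf.variables_initializer(new_vars))

In your graph-building code you would then call optimistic_restore(self.sess, self.weightFile) instead of self.saver.restore(self.sess, self.weightFile).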