I want to load a pre-trained model (optimized with AdadeltaOptimizer) and continue training with SGD (GradientDescentOptimizer). The models are saved and loaded with the TensorLayer API:
save model:
import tensorlayer as tl
tl.files.save_npz(network.all_params,
name=model_dir + "model-%d.npz" % global_step)
load model:
load_params = tl.files.load_npz(path=resume_dir + '/', name=model_name)
tl.files.assign_params(sess, load_params, network)
If I continue training with Adadelta, the training loss (cross entropy) looks normal: it starts close to the value the loaded model had reached. However, if I switch the optimizer to SGD, the training loss is as large as that of a newly initialized model.
I took a look at the model-xxx.npz file produced by tl.files.save_npz. It only saves the model parameters as ndarrays. I'm not sure how the optimizer or learning rate is involved here.
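For anyone who wants to check this themselves, here is a minimal sketch of inspecting an .npz archive with plain NumPy. The parameter list, file name, and the `arr_0`/`arr_1` keys are assumptions for illustration; the exact key layout written by tl.files.save_npz may differ between TensorLayer versions, so treat this only as a way to see that such a file holds arrays and nothing else.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for network.all_params (a weight matrix and a bias).
params = [
    np.random.randn(4, 3).astype(np.float32),
    np.zeros(3, dtype=np.float32),
]

# Write the arrays to a temporary .npz file; positional arrays are stored
# under the keys arr_0, arr_1, ...
path = os.path.join(tempfile.mkdtemp(), "model-0.npz")
np.savez(path, *params)

# Reopen the archive and record the shape of every stored array.
with np.load(path) as archive:
    contents = {key: archive[key].shape for key in archive.files}

for key, shape in contents.items():
    print(key, shape)
```

Nothing in the archive refers to an optimizer or a learning rate; it is just named arrays.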
You would probably have to import the tensor that holds your loss (the cross-entropy) and that previously fed into your Adadelta optimizer. Then just feed it to your SGD optimizer instead.
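import tensorflow as tf
sess = tf.Session()
# Rebuild the saved graph from the .meta file and restore its variables
saver = tf.train.import_meta_graph('filename.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))
graph = tf.get_default_graph()
cross_entropy = graph.get_tensor_by_name("entropy:0")  # loss tensor to import
# learning_rate is your chosen SGD step size (not defined here)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)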
In this case, I tagged the cross-entropy tensor with the name entropy before training my pre-trained model, like so:
tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv), name = 'entropy')
If you are unable to make changes to your pre-trained model, you can obtain the list of tensors in your model (after you have imported it) from graph and deduce which tensor you require. I have no experience with TensorLayer, so this is meant more as a general guide. You can take a look at the Tensorlayer-Layers documentation, which should explain how to obtain your tensor. As TensorLayer is built on top of TensorFlow, most of the functions should still be available.
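To illustrate the "list the tensors and deduce which one you need" step, here is a rough sketch. It builds a tiny stand-in graph instead of importing a real checkpoint, so the placeholder names, shapes, and the `w` variable are all made up for the example; only the name entropy matches what I used above. It goes through `tensorflow.compat.v1` so that it also runs on newer TensorFlow installs.

```python
import tensorflow.compat.v1 as tf

# Build a small stand-in graph; in practice this would be the graph you get
# after tf.train.import_meta_graph(...).
graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, [None, 4], name="x")    # hypothetical input
    y_ = tf.placeholder(tf.float32, [None, 3], name="y_")  # hypothetical labels
    w = tf.Variable(tf.zeros([4, 3]), name="w")            # hypothetical weights
    logits = tf.matmul(x, w, name="logits")
    tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits),
        name="entropy",
    )

# Every operation in the graph has a name; an op's first output tensor is
# addressed as "<op name>:0".
op_names = [op.name for op in graph.get_operations()]
for name in op_names:
    print(name)

# Once the right name is found, fetch the tensor by name.
loss = graph.get_tensor_by_name("entropy:0")
```

In a real model the list will be long, so filtering the names for something like "entropy" or "loss" is usually the quickest way to find a candidate.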