I have been following the methodology of saving/restoring GPflow models with success. But now I've run into a snag.
When I try to restore a model with a Linear mean function, the restore crashes with an error.
I think the issue lies in the naming convention for the TensorFlow variables of the Linear mean function. The suffix in the failing key (e.g. "-44dbadbb-0" in the error message) is random and changes every time the model is rebuilt, so if I check the tensor names in the saved checkpoint with
from tensorflow.python.tools.inspect_checkpoint import print_tensors_in_checkpoint_file
print_tensors_in_checkpoint_file(file_name='./model.ckpt', tensor_name='', all_tensors=False)
I get the return:
Linear-eeb5f9f3-0/A/unconstrained (DT_DOUBLE) [1,1]
Linear-eeb5f9f3-0/b/unconstrained (DT_DOUBLE) [1]
model/X/dataholder (DT_DOUBLE) [15,1]
model/Y/dataholder (DT_DOUBLE) [15,1]
model/kern/kernels/0/lengthscales/unconstrained (DT_DOUBLE) []
model/kern/kernels/0/variance/unconstrained (DT_DOUBLE) []
model/kern/kernels/1/lengthscales/unconstrained (DT_DOUBLE) []
model/kern/kernels/1/variance/unconstrained (DT_DOUBLE) []
model/likelihood/variance/unconstrained (DT_DOUBLE) []
Here the Linear mean function clearly has a different name from the one in the model I am trying to restore.
I have tried to fix this by renaming the variables before the restore, but this doesn't work with tensorflow. I also tried different saving/restoring methods, but then I have problems with being able to sample from the model.
import gpflow
import numpy as np
import random
import tensorflow as tf
# define data
rng = np.random.RandomState(4)
X = rng.uniform(0, 5.0, 15)[:, np.newaxis]
Y = np.sin((X[:, 0] - 2.5) ** 2).reshape(len(X),1)
# define the mean function
mf = gpflow.mean_functions.Linear(np.ones((1,1)),np.zeros((1,)))
# create the GP model
with gpflow.defer_build():
    k = gpflow.kernels.Matern32(1)+gpflow.kernels.RBF(1)
    m = gpflow.models.GPR(X, Y, kern=k,name='model',mean_function=mf)
    m.likelihood.variance = 1e-03
    m.likelihood.trainable = False
tf.global_variables_initializer()
tf_session = m.enquire_session()
m.compile( tf_session )
gpflow.train.ScipyOptimizer().minimize(m)
saver = tf.train.Saver()
save_path = saver.save(tf_session, "./model.ckpt")
print("Model saved in path: %s" % save_path)
import gpflow
import numpy as np
import random
import tensorflow as tf
# define data
rng = np.random.RandomState(4)
X = rng.uniform(0, 5.0, 15)[:, np.newaxis]
Y = np.sin((X[:, 0] - 2.5) ** 2).reshape(len(X),1)
# define the mean function
mf = gpflow.mean_functions.Linear(np.ones((1,1)),np.zeros((1,)))
with gpflow.defer_build():
    k = gpflow.kernels.Matern32(1)+gpflow.kernels.RBF(1)
    m = gpflow.models.GPR(X, Y, kern=k,name='model',mean_function=mf)
    m.likelihood.variance = 1e-03
    m.likelihood.trainable = False
# construct and compile the tensorflow session
tf.global_variables_initializer()
tf_session = m.enquire_session()
m.compile( tf_session )
saver = tf.train.Saver()
save_path = saver.restore(tf_session, "./model.ckpt")
print("Model loaded from path: %s" % save_path)
m.anchor(tf_session)
The code crashes at save_path = saver.restore(tf_session, "./model.ckpt") with the error:
NotFoundError (see above for traceback): Key Linear-44dbadbb-0/A/unconstrained not found in checkpoint...
The defer_build() context manager does a bunch of things, but one consequence of constructing the entire model (i.e. the TensorFlow graph) in one go is that all the TensorFlow variables and placeholders get consistent names, all derived from the name of the model itself (which you set by passing the name='model' keyword argument to the model constructor).
In your code, however, the Linear mean function is constructed outside of the defer_build() scope. This means GPflow has to construct a graph for it right away, including setting up variables for its parameters (slope and offset in this case). All TensorFlow variables live in a global namespace, so the only way to allow more than one object of the same type to be created is to assign them randomized names. (E.g., imagine wanting to construct a sum of two kernels of the same type!) Because that random suffix differs between your "save" run and your "load" run, the restore looks for a key that is not in the checkpoint.
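To make the naming issue concrete, here is a toy sketch in plain Python. It is not GPflow's actual implementation: standalone_name and scoped_name are made-up helpers, and the "mean_function/A/unconstrained" path is only illustrative. It just shows why a name with a random suffix cannot match across two runs, while a name derived from the model name can.

```python
import uuid


def standalone_name(class_name):
    """Hypothetical stand-in for naming an object built on its own:
    a random suffix keeps it unique in the global namespace."""
    return "{}-{}-0".format(class_name, uuid.uuid4().hex[:8])


def scoped_name(model_name, attribute_path):
    """Hypothetical stand-in for naming a parameter built as part of a
    named model: the name depends only on the model name and the path."""
    return "{}/{}".format(model_name, attribute_path)


# The "save" run and the "load" run each draw a fresh random suffix,
# so the key written to the checkpoint differs from the key looked up.
save_key = standalone_name("Linear") + "/A/unconstrained"
load_key = standalone_name("Linear") + "/A/unconstrained"
print(save_key, load_key, save_key == load_key)

# Names derived from the model name are identical in every run.
save_scoped = scoped_name("model", "mean_function/A/unconstrained")
load_scoped = scoped_name("model", "mean_function/A/unconstrained")
print(save_scoped == load_scoped)
```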
Fortunately, the fix is easy: simply move the construction of the mean function into the defer_build() block:
with gpflow.defer_build():
    # define the mean function
    mf = gpflow.mean_functions.Linear(np.ones((1,1)), np.zeros((1,)))
    k = gpflow.kernels.Matern32(1) + gpflow.kernels.RBF(1)
    m = gpflow.models.GPR(X, Y, kern=k, mean_function=mf, name='model')
    m.likelihood.variance = 1e-03
    m.likelihood.trainable = False
# construct and compile the tensorflow session
tf.global_variables_initializer()
tf_session = m.enquire_session()
m.compile(tf_session)
If you do this in both the "save" and "load" scripts, everything should run, hopefully the way you expect. Hope this helps!
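If a restore still fails, it can help to compare the names in the graph with the names in the checkpoint before calling saver.restore. Below is a minimal sketch of such a check. missing_from_checkpoint is a made-up helper, and the name lists are hard-coded from the output and error message in the question rather than read from a real checkpoint.

```python
def missing_from_checkpoint(graph_names, checkpoint_names):
    """Return the graph variable names that have no entry in the checkpoint."""
    available = set(checkpoint_names)
    return [name for name in graph_names if name not in available]


# Names as printed for the saved checkpoint in the question (subset).
checkpoint_names = [
    "Linear-eeb5f9f3-0/A/unconstrained",
    "Linear-eeb5f9f3-0/b/unconstrained",
    "model/likelihood/variance/unconstrained",
]

# Names the rebuilt graph asks for (the first one is from the error message).
graph_names = [
    "Linear-44dbadbb-0/A/unconstrained",
    "model/likelihood/variance/unconstrained",
]

print(missing_from_checkpoint(graph_names, checkpoint_names))
# prints ['Linear-44dbadbb-0/A/unconstrained']
```

With TensorFlow 1.x you should be able to get the real lists from tf.train.list_variables("./model.ckpt"), which returns (name, shape) pairs, and from [v.op.name for v in tf.global_variables()]. I have not run this against your exact setup, so treat it as a starting point.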