Tags: python, tensorflow, machine-learning, tensorflow-estimator

Tensors not in same graph for custom estimator


I'm new to TensorFlow, and I'm trying to get a ConvLSTM running using a custom estimator.

I defined the model_fn as follows:

import numpy as np
import tensorflow as tf


def model_fn(features, labels, mode, params=None):

    if params is None:
        batch_size = 50
        time_steps = 150
        dim = 40
    else:
        batch_size = params['batch_size']
        time_steps = params['time_steps']
        dim = params['dim']

    # instantiate the ConvLSTM cell
    net = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2, input_shape=[dim, dim, 1],
                                      output_channels=1, kernel_shape=[3, 3])

    state = net.zero_state(batch_size,dtype = tf.float32)
    features = tf.cast(features,tf.float32)

    if mode != tf.estimator.ModeKeys.PREDICT:  # added to avoid the tf.cast problem when there are no labels
        labels = tf.cast(labels, tf.float32)
        state = net.zero_state(batch_size, dtype=tf.float32)  # <-- inconsistent state size between training and predict, is it problematic? (see the note after this function)
    else:
        state = net.zero_state(1, dtype=tf.float32)

    inputs = tf.split(features,time_steps,axis = 1)
    inputs_list = [tf.squeeze(input_,[1]) for input_ in inputs]

    outputs = []

    with tf.variable_scope("convLSTM") as scope: 
        for i, input_ in enumerate(inputs_list):
            if i>0:
                scope.reuse_variables()
            t_output ,state = net(input_,state)
            outputs.append(t_output)
    outputs = tf.stack(outputs,1)
    rmse = tf.Variable(tf.zeros([],dtype = np.float32))

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode,predictions=outputs)

    elif mode == tf.estimator.ModeKeys.TRAIN:
        loss = tf.losses.absolute_difference(labels,outputs)
        optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
        train_op = optimizer.minimize(loss,global_step = tf.train.get_global_step())

        rmse = tf.metrics.root_mean_squared_error(labels,outputs)[0]
        tf.summary.scalar('RMSE loss',rmse)

        return tf.estimator.EstimatorSpec(mode,loss=loss,train_op = train_op)

    elif mode == tf.estimator.ModeKeys.EVAL:
        loss = tf.losses.absolute_difference(labels, outputs)
        rmse = tf.metrics.root_mean_squared_error(labels, outputs)  # (value, update_op) tuple

        tf.summary.scalar('RMSE loss', rmse[0])

        # eval_metric_ops expects (value, update_op) tuples, not bare tensors
        return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops={'RMSE': rmse})
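
A note on the zero_state comment in the PREDICT branch above: one way to avoid hard-coding different batch sizes for training and prediction is to derive the batch size from the incoming features tensor. This is a minimal sketch (dynamic_batch_size is not part of the original code):

    # Derive the batch size from the features tensor at runtime, so the same
    # zero_state call covers training, evaluation and prediction batches.
    dynamic_batch_size = tf.shape(features)[0]
    state = net.zero_state(dynamic_batch_size, dtype=tf.float32)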

The input functions:

def input_fn_train(batch_size):
    dataset = tf.data.TFRecordDataset(['Data/train.tfrecords'])
    dataset = dataset.map(parse_)
    dataset = dataset.shuffle(buffer_size = 5)
    dataset = dataset.batch(batch_size)

    return dataset.prefetch(buffer_size = 5)


def input_fn_eval(batch_size):
    dataset = tf.data.TFRecordDataset(['Data/eval.tfrecords'])
    dataset = dataset.map(parse_)
    dataset = dataset.shuffle(buffer_size = 5)  
    dataset = dataset.batch(batch_size)
    return dataset.make_one_shot_iterator().get_next()

Which way is better, returning the iterator or the dataset itself?
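
For comparison, recent 1.x versions of tf.estimator also accept an input_fn that returns the tf.data.Dataset itself, in which case the Estimator creates the iterator inside its own graph. A dataset-returning variant of input_fn_eval would be a sketch like this (same parse_ function and file path as above):

def input_fn_eval(batch_size):
    # Return the Dataset directly; the Estimator builds the iterator
    # inside the graph it creates for evaluate()/predict().
    dataset = tf.data.TFRecordDataset(['Data/eval.tfrecords'])
    dataset = dataset.map(parse_)
    dataset = dataset.batch(batch_size)
    return dataset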

And the main function:

def main():

    batch_size = 5
    data_pred = misc.input_fn_eval(1)


    rnn = tf.estimator.Estimator(
                                model_fn = model.model_fn,
                                model_dir = "logs/20_08/",
                                params = {'batch_size':batch_size,'time_steps':150,'dim':40})

    rnn.train(input_fn=lambda: misc.input_fn_train(batch_size), steps=1)

    video = rnn.predict(input_fn=lambda: data_pred)


    print(next(video))

if __name__ == "__main__":
    main()

Now, the code seems to run fine for training, at least syntactically. I wanted to predict a few frames in order to check the evolution, but I keep getting this error:

ValueError: Tensor("ConvLSTMCellZeroState_1/zeros_1:0", shape=(1, 40, 40, 1), dtype=float32) must be from the same graph as Tensor("Squeeze:0", shape=(?, 40, 40, 1), dtype=float32).

I also had this error with the Iterator and Dataset (I believe it came from the prediction input function, which used to be the same as the training one; creating a separate one seems to have solved that part).

Thanks a lot for the help! I hope the question is clear enough; please let me know if it isn't.


Solution

  • Try to change your code in the following way:

    video = rnn.predict(input_fn=lambda: misc.input_fn_eval(1))
    

    The problem is that you have to call input_fn_eval from inside the input_fn (here, the lambda). That way, all the tensors created by that function belong to the graph created by the Estimator.

    You can find similar issues reported here and here.
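
    For completeness, here is a minimal sketch of main() with both input pipelines built lazily inside the input_fn lambdas (module, path and parameter names taken from the question):

    def main():
        batch_size = 5

        rnn = tf.estimator.Estimator(
            model_fn=model.model_fn,
            model_dir="logs/20_08/",
            params={'batch_size': batch_size, 'time_steps': 150, 'dim': 40})

        # Both input functions are wrapped in lambdas, so their datasets are
        # created inside the graph that each Estimator call builds.
        rnn.train(input_fn=lambda: misc.input_fn_train(batch_size), steps=1)

        video = rnn.predict(input_fn=lambda: misc.input_fn_eval(1))
        print(next(video))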