
Prediction from saved tensorflow variables


I'm very new to Python programming. I saved my TensorFlow neural network during training as follows:

n_hidden_1 = 200  # 1st layer num features
n_hidden_2 = 100  # 2nd layer num features
n_hidden_3 = 100  # 3rd layer num features
n_input = 128
n_output = 128

X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])

def encoder(x):
    weights = {                    
        'encoder_w1': tf.Variable(tf.random.truncated_normal([n_input, n_hidden_1],stddev=0.1),name='encoder_w1'),
        'encoder_w2': tf.Variable(tf.random.truncated_normal([n_hidden_1, n_hidden_2],stddev=0.1),name='encoder_w2'),
        'encoder_w3': tf.Variable(tf.random.truncated_normal([n_hidden_2, n_hidden_3],stddev=0.1),name='encoder_w3'),
        'encoder_w4': tf.Variable(tf.random.truncated_normal([n_hidden_3, n_output],stddev=0.1),name='encoder_w4'),            
    }

    biases = {            
        'encoder_b1': tf.Variable(tf.random.truncated_normal([n_hidden_1],stddev=0.1),name='encoder_b1'),
        'encoder_b2': tf.Variable(tf.random.truncated_normal([n_hidden_2],stddev=0.1),name='encoder_b2'),
        'encoder_b3': tf.Variable(tf.random.truncated_normal([n_hidden_3],stddev=0.1),name='encoder_b3'),
        'encoder_b4': tf.Variable(tf.random.truncated_normal([n_output],stddev=0.1),name='encoder_b4'),          
    }

    # Encoder Hidden layer with sigmoid activation #1
    #layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']), biases['encoder_b1']))
    layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['encoder_w1']), biases['encoder_b1']))
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights['encoder_w2']), biases['encoder_b2']))
    layer_3 = tf.nn.relu(tf.add(tf.matmul(layer_2, weights['encoder_w3']), biases['encoder_b3']))
    layer_4 = tf.nn.sigmoid(tf.add(tf.matmul(layer_3, weights['encoder_w4']), biases['encoder_b4']))
    return layer_4

y_pred = encoder(X)
y_true = Y

# Define loss and optimizer, minimize the squared error
cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
learning_rate = tf.placeholder(tf.float32, shape=[])
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

saver = tf.train.Saver(max_to_keep=1)
with tf.Session(config=config) as sess:

    sess.run(init)
    training_epochs = 100
    learning_rate_current = 0.001   #0.01

    for epoch in range(training_epochs):

        # Code Block.................

        input_labels.append(x)
        input_samples.append(y)
        batch_x = np.asarray(input_samples)
        batch_y = np.asarray(input_labels)

        _, c = sess.run([optimizer, cost],
                        feed_dict={X: batch_x, Y: batch_y, learning_rate: learning_rate_current})

After training, I load the trained network and print the saved variables as follows:

with tf.Session() as sess2:
    saver2 = tf.train.import_meta_graph('../my_model.ckpt.meta')
    saver2.restore(sess2, '../my_model.ckpt')
    savevariable = tf.all_variables()

When I inspected the saved variables after training, 48 variables were present. I don't understand why my 8 initial parameters grew to 48 values: as initial values I have 4 weights and 4 biases, one pair each for layers 1, 2, 3 and 4. Which 4 weights and 4 biases from these saved values should I use for prediction? I'm very confused. The printed saved variables are:

 <tf.Variable 'encoder_w1:0' shape=(128, 200) dtype=float32_ref>,
 <tf.Variable 'encoder_w2:0' shape=(200, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w3:0' shape=(100, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w4:0' shape=(100, 128) dtype=float32_ref>,
 <tf.Variable 'encoder_b1:0' shape=(200,) dtype=float32_ref>,
 <tf.Variable 'encoder_b2:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b3:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b4:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'encoder_w1/RMSProp:0' shape=(128, 200) dtype=float32_ref>,
 <tf.Variable 'encoder_w1/RMSProp_1:0' shape=(128, 200) dtype=float32_ref>,
 <tf.Variable 'encoder_w2/RMSProp:0' shape=(200, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w2/RMSProp_1:0' shape=(200, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w3/RMSProp:0' shape=(100, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w3/RMSProp_1:0' shape=(100, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w4/RMSProp:0' shape=(100, 128) dtype=float32_ref>,
 <tf.Variable 'encoder_w4/RMSProp_1:0' shape=(100, 128) dtype=float32_ref>,
 <tf.Variable 'encoder_b1/RMSProp:0' shape=(200,) dtype=float32_ref>,
 <tf.Variable 'encoder_b1/RMSProp_1:0' shape=(200,) dtype=float32_ref>,
 <tf.Variable 'encoder_b2/RMSProp:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b2/RMSProp_1:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b3/RMSProp:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b3/RMSProp_1:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b4/RMSProp:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'encoder_b4/RMSProp_1:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'encoder_w1:0' shape=(128, 200) dtype=float32_ref>,
 <tf.Variable 'encoder_w2:0' shape=(200, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w3:0' shape=(100, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w4:0' shape=(100, 128) dtype=float32_ref>,
 <tf.Variable 'encoder_b1:0' shape=(200,) dtype=float32_ref>,
 <tf.Variable 'encoder_b2:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b3:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b4:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'encoder_w1/RMSProp:0' shape=(128, 200) dtype=float32_ref>,
 <tf.Variable 'encoder_w1/RMSProp_1:0' shape=(128, 200) dtype=float32_ref>,
 <tf.Variable 'encoder_w2/RMSProp:0' shape=(200, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w2/RMSProp_1:0' shape=(200, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w3/RMSProp:0' shape=(100, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w3/RMSProp_1:0' shape=(100, 100) dtype=float32_ref>,
 <tf.Variable 'encoder_w4/RMSProp:0' shape=(100, 128) dtype=float32_ref>,
 <tf.Variable 'encoder_w4/RMSProp_1:0' shape=(100, 128) dtype=float32_ref>,
 <tf.Variable 'encoder_b1/RMSProp:0' shape=(200,) dtype=float32_ref>,
 <tf.Variable 'encoder_b1/RMSProp_1:0' shape=(200,) dtype=float32_ref>,
 <tf.Variable 'encoder_b2/RMSProp:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b2/RMSProp_1:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b3/RMSProp:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b3/RMSProp_1:0' shape=(100,) dtype=float32_ref>,
 <tf.Variable 'encoder_b4/RMSProp:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'encoder_b4/RMSProp_1:0' shape=(128,) dtype=float32_ref>

Could you give me some advice?


Solution

  • First of all, you made a mistake:

    input_labels.append(x) 
    input_samples.append(y)
    

    should be:

    input_samples.append(x)
    input_labels.append(y)
    


    Answering your question:

    Actually, you should have only 24 variables (see below). When you started the second session, the original TensorFlow variables were still present in the default graph, and the newly loaded variables were created on top of them. So it is always good to do something like:

    tf.keras.backend.clear_session()
    

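    For reference, this is exactly where the 48 variables in the question come from: without clearing, the default graph still holds the 24 variables from the training run, and import_meta_graph then recreates another 24 on top of them. A minimal sketch of the effect (assuming the same process and checkpoint paths as in the code below):

    # In the same process that ran the training, WITHOUT clear_session():
    print(len(tf.compat.v1.global_variables()))  # 24: variables left over from training
    saver2 = tf.compat.v1.train.import_meta_graph('../my_model.ckpt.meta')
    print(len(tf.compat.v1.global_variables()))  # 48: another 24 copies were added
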
    Your batch definition was also a bit problematic, so I fixed it in my example code as well.

    Note: the OP didn't provide a reproducible example at the time of my answer, so I created one based on the initial data and commented it as per the OP's question.

    Import all modules:

    import numpy as np
    from numpy import random as random
    import tensorflow as tf
    
    tf.compat.v1.disable_eager_execution()
    

    Clear all session data, which can easily accumulate in interactive environments:

    tf.keras.backend.clear_session()
    

    Define all elements we will need later:

    X_train = random.rand(100, 128)
    y_train = random.rand(100, 128)
    
    n_hidden_1 = 200  # 1st layer num features
    n_hidden_2 = 100  # 2nd layer num features
    n_hidden_3 = 100  # 3rd layer num features
    n_input = 128
    n_output = 128
    
    X = tf.compat.v1.placeholder("float", [None, n_input])
    Y = tf.compat.v1.placeholder("float", [None, n_output])
    
    def encoder(x):
        weights = {                    
            'encoder_w1': tf.Variable(tf.random.truncated_normal([n_input, n_hidden_1],stddev=0.1),name='encoder_w1'),
            'encoder_w2': tf.Variable(tf.random.truncated_normal([n_hidden_1, n_hidden_2],stddev=0.1),name='encoder_w2'),
            'encoder_w3': tf.Variable(tf.random.truncated_normal([n_hidden_2, n_hidden_3],stddev=0.1),name='encoder_w3'),
            'encoder_w4': tf.Variable(tf.random.truncated_normal([n_hidden_3, n_output],stddev=0.1),name='encoder_w4'),            
        }
    
        biases = {            
            'encoder_b1': tf.Variable(tf.random.truncated_normal([n_hidden_1],stddev=0.1),name='encoder_b1'),
            'encoder_b2': tf.Variable(tf.random.truncated_normal([n_hidden_2],stddev=0.1),name='encoder_b2'),
            'encoder_b3': tf.Variable(tf.random.truncated_normal([n_hidden_3],stddev=0.1),name='encoder_b3'),
            'encoder_b4': tf.Variable(tf.random.truncated_normal([n_output],stddev=0.1),name='encoder_b4'),          
        }
    
        # Encoder Hidden layer with sigmoid activation #1
        #layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']), biases['encoder_b1']))
        layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['encoder_w1']), biases['encoder_b1']))
        layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights['encoder_w2']), biases['encoder_b2']))
        layer_3 = tf.nn.relu(tf.add(tf.matmul(layer_2, weights['encoder_w3']), biases['encoder_b3']))
        layer_4 = tf.nn.sigmoid(tf.add(tf.matmul(layer_3, weights['encoder_w4']), biases['encoder_b4']))
        return layer_4
    
    
    y_pred = encoder(X)
    y_true = Y
    
    # Define loss and optimizer, minimize the squared error
    cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
    learning_rate = tf.compat.v1.placeholder(tf.float32, shape=[])
    optimizer = tf.compat.v1.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)
    
    saver = tf.compat.v1.train.Saver(max_to_keep=1) 
    init = tf.compat.v1.global_variables_initializer()
    

    Start training session:

    with tf.compat.v1.Session() as sess:
        sess.run(init)
    
        training_epochs = 100
        learning_rate_current = 0.001   #0.01
        # define batch size
        batch_size = 10
        num_samples = (X_train.shape[0])
    
        # starting training loop
        for epoch in range(training_epochs):
            print("Epoch " + str(epoch))
    
            # starting per batch training loop
            for i in range(int(num_samples/batch_size)):
                print("{}/{}".format(batch_size * i, num_samples))
    
                # get batches
                batch_x = X_train[i * batch_size:(i + 1) * batch_size]
                batch_y = y_train[i * batch_size:(i + 1) * batch_size]
                _,c = sess.run([optimizer,cost], 
                               feed_dict={X: batch_x, Y: batch_y,
                                          learning_rate: learning_rate_current})
        saver.export_meta_graph('../my_model.ckpt.meta')
        saver.save(sess, '../my_model.ckpt')
    

    Out:

    .
    .
    .
    70/100
    80/100
    90/100
    100/100
    Epoch 99
    10/100
    20/100
    30/100
    40/100
    50/100
    60/100
    70/100
    80/100
    90/100
    100/100
    

    Clear all previous session data, removing the TensorFlow variables as well:

    tf.keras.backend.clear_session()
    

    Reload previously saved session:

    init = tf.compat.v1.global_variables_initializer()
    with tf.compat.v1.Session() as sess2:
        sess2.run(init)
        saver2 = tf.compat.v1.train.import_meta_graph('../my_model.ckpt.meta')
        saver2.restore(sess2,'../my_model.ckpt')
        savevariable = tf.compat.v1.all_variables()
    
    for i, variable in enumerate(savevariable):
        print("Variable {}: {}".format(i, variable))
    


    Interpreting our output:

    Out:
    INFO:tensorflow:Restoring parameters from ../my_model.ckpt
    

    The 8 weight and bias variables you expected:

    Variable 0: <tf.Variable 'encoder_w1:0' shape=(128, 200) dtype=float32>
    Variable 1: <tf.Variable 'encoder_w2:0' shape=(200, 100) dtype=float32>
    Variable 2: <tf.Variable 'encoder_w3:0' shape=(100, 100) dtype=float32>
    Variable 3: <tf.Variable 'encoder_w4:0' shape=(100, 128) dtype=float32>
    Variable 4: <tf.Variable 'encoder_b1:0' shape=(200,) dtype=float32>
    Variable 5: <tf.Variable 'encoder_b2:0' shape=(100,) dtype=float32>
    Variable 6: <tf.Variable 'encoder_b3:0' shape=(100,) dtype=float32>
    Variable 7: <tf.Variable 'encoder_b4:0' shape=(128,) dtype=float32>
    

    The 2 × 8 additional weight and bias variables are the optimizer's accumulator variables, saved so that the actual optimizer state can be reloaded and training can continue from that state:

    Variable 8: <tf.Variable 'encoder_w1/RMSProp:0' shape=(128, 200) dtype=float32>
    Variable 9: <tf.Variable 'encoder_w1/RMSProp_1:0' shape=(128, 200) dtype=float32>
    Variable 10: <tf.Variable 'encoder_w2/RMSProp:0' shape=(200, 100) dtype=float32>
    Variable 11: <tf.Variable 'encoder_w2/RMSProp_1:0' shape=(200, 100) dtype=float32>
    Variable 12: <tf.Variable 'encoder_w3/RMSProp:0' shape=(100, 100) dtype=float32>
    Variable 13: <tf.Variable 'encoder_w3/RMSProp_1:0' shape=(100, 100) dtype=float32>
    Variable 14: <tf.Variable 'encoder_w4/RMSProp:0' shape=(100, 128) dtype=float32>
    Variable 15: <tf.Variable 'encoder_w4/RMSProp_1:0' shape=(100, 128) dtype=float32>
    Variable 16: <tf.Variable 'encoder_b1/RMSProp:0' shape=(200,) dtype=float32>
    Variable 17: <tf.Variable 'encoder_b1/RMSProp_1:0' shape=(200,) dtype=float32>
    Variable 18: <tf.Variable 'encoder_b2/RMSProp:0' shape=(100,) dtype=float32>
    Variable 19: <tf.Variable 'encoder_b2/RMSProp_1:0' shape=(100,) dtype=float32>
    Variable 20: <tf.Variable 'encoder_b3/RMSProp:0' shape=(100,) dtype=float32>
    Variable 21: <tf.Variable 'encoder_b3/RMSProp_1:0' shape=(100,) dtype=float32>
    Variable 22: <tf.Variable 'encoder_b4/RMSProp:0' shape=(128,) dtype=float32>
    Variable 23: <tf.Variable 'encoder_b4/RMSProp_1:0' shape=(128,) dtype=float32>
    


    Update answering the OP's additional questions:

    Are these variables the initial values, or the last updated values after the session was saved?

    These are the last updated values, as we printed them after reloading.

    In fact, the exact point is that when I reload the session in another Python file, I want to predict the output from the trained network variables. Should I use Variable 0 through Variable 7, or should I use the 16 optimizer accumulator variables?

    When you use the reloaded model parameters for prediction, you will need only the first 8 variables; the others are only needed to continue the training.
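
    As an illustration, here is a minimal prediction sketch (my own example, not from the OP's code): it restores the checkpoint, fetches only those 8 tensors by name, and reproduces the encoder's forward pass in NumPy, so no placeholder names are needed:

    import numpy as np
    import tensorflow as tf

    tf.compat.v1.disable_eager_execution()
    tf.keras.backend.clear_session()

    with tf.compat.v1.Session() as sess:
        saver = tf.compat.v1.train.import_meta_graph('../my_model.ckpt.meta')
        saver.restore(sess, '../my_model.ckpt')
        graph = tf.compat.v1.get_default_graph()
        # Fetch only the 8 trained parameters (Variable 0 to Variable 7)
        names = ['encoder_w1', 'encoder_w2', 'encoder_w3', 'encoder_w4',
                 'encoder_b1', 'encoder_b2', 'encoder_b3', 'encoder_b4']
        params = {n: sess.run(graph.get_tensor_by_name(n + ':0')) for n in names}

    def relu(z):
        return np.maximum(z, 0.0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def predict(x, p):
        # Same forward pass as encoder(), but on the restored NumPy arrays
        h1 = relu(x @ p['encoder_w1'] + p['encoder_b1'])
        h2 = relu(h1 @ p['encoder_w2'] + p['encoder_b2'])
        h3 = relu(h2 @ p['encoder_w3'] + p['encoder_b3'])
        return sigmoid(h3 @ p['encoder_w4'] + p['encoder_b4'])

    x_new = np.random.rand(1, 128).astype(np.float32)
    print(predict(x_new, params).shape)  # (1, 128)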

    Regarding the optimizer's state variables: what they contain depends on the given optimizer, e.g. it can be the iteration count, momentum, etc.
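
    For RMSProp in particular, the two accumulators per variable (the RMSProp:0 and RMSProp_1:0 entries above) correspond to the optimizer's slots. A small sketch, assuming you keep a handle on the optimizer object instead of calling .minimize() on it immediately as the code above does:

    # Keep the optimizer object so its slots can be inspected
    opt = tf.compat.v1.train.RMSPropOptimizer(learning_rate=learning_rate)
    train_op = opt.minimize(cost)
    print(opt.get_slot_names())  # ['momentum', 'rms'] for a non-centered RMSProp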