
Tensorflow MomentumOptimizer issue


When I run the following simple code, I get this error:

tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value Variable_5/Momentum

This code works with GradientDescentOptimizer, but it fails with the error above when I switch to MomentumOptimizer.

Here is my code:

import tensorflow as tf
import numpy as np
import scipy.io as sio
import h5py
from tensorflow.python.training import queue_runner

maxiter = 200000
display = 1
sess = tf.InteractiveSession()
decay_rate = 0.00005
starter_learning_rate = 0.000009
alpha = 0.00005
init_momentum = 0.9

nnodes1 = 350
nnodes2 = 100
batch_size = 50

train_mat = h5py.File('Basket_train_data_binary.mat')
test_mat = h5py.File('Basket_test_data_binary.mat')

train_mat = train_mat["binary_train"].value
test_mat = test_mat["binary_test"].value

Train = np.transpose(train_mat)
Test = np.transpose(test_mat)

# import the data
#from tensorflow.examples.tutorials.mnist import input_data
# placeholders, which are the training data
x = tf.placeholder(tf.float32, shape=[None,43])
y_ = tf.placeholder(tf.float32, shape=[None])

# define the variables
W1 = tf.Variable(tf.zeros([43,nnodes1]))

b1 = tf.Variable(tf.zeros([nnodes1]))

W2 = tf.Variable(tf.zeros([nnodes1,nnodes2]))
b2 = tf.Variable(tf.zeros([nnodes2]))

W3 = tf.Variable(tf.zeros([nnodes2,1]))
b3 = tf.Variable(tf.zeros([1]))

# Passing global_step to minimize() will increment it at each step.
global_step = tf.Variable(0, trainable=False)
momentum = tf.Variable(init_momentum, trainable=False)


# initialize the variables
sess.run(tf.initialize_all_variables())

# prediction function (two hidden sigmoid layers, linear output)

layer1 = tf.nn.sigmoid(tf.matmul(x,W1) + b1)
layer2 = tf.nn.sigmoid(tf.matmul(layer1,W2) + b2)
y = tf.matmul(layer2,W3) + b3

# cost function
cost_function = tf.reduce_sum(tf.square(y_ - y))

l2regularization = tf.reduce_sum(tf.square(W1)) + tf.reduce_sum(tf.square(b1)) + tf.reduce_sum(tf.square(W2)) + tf.reduce_sum(tf.square(b2)) + tf.reduce_sum(tf.square(W3)) + tf.reduce_sum(tf.square(b3))
loss = cost_function + alpha*l2regularization

# define the learning rate and its decay schedule
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step, 10000, decay_rate, staircase=True)
# define the training step: the optimizer that minimizes the loss
#train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
train_step = tf.train.MomentumOptimizer(learning_rate, 0.9).minimize(loss, global_step=global_step)


# evaluation
# sum of squared errors between y_ and y (0 when they are equal)
correct_prediction = tf.reduce_sum(tf.square(y_ - y))

# calculate the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


# Train the model for maxiter iterations; drawing a random mini-batch at each step makes this SGD
for i in range(maxiter):
  batch = np.random.randint(0,len(Train),size=batch_size)
  train_step.run(feed_dict={x:Train[batch,0:43], y_:Train[batch,43]})
  if np.mod(i,display) == 0:
    # print test loss
    print "Test", accuracy.eval(feed_dict={x: Test[:,0:43], y_: Test[:,43]})
    # print training loss
    print "Train" , sess.run(cost_function,feed_dict={x: Train[:,0:43], y_: Train[:,43]})

How can I solve this problem? Thanks in advance, Afshin


Solution

  • At the line

    # initialize the variables
    sess.run(tf.initialize_all_variables())

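    you're initializing only the variables that have been declared up to that point.

    The optimizer is created after that line. When minimize() is called, MomentumOptimizer creates an extra accumulator variable for each trainable variable (such as Variable_5/Momentum in the error message), and those variables are never initialized. GradientDescentOptimizer creates no extra variables, which is why that version works.

    To fix it, move the initialization after the complete graph declaration (i.e., after every variable and op has been declared).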

    TL;DR: move

    # initialize the variables
    sess.run(tf.initialize_all_variables())

    after

    # calculate the accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
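
    As a rough illustration of the corrected ordering, here is a minimal sketch. It is not the asker's full model: the three-feature linear model, the random data, and the learning rate of 0.01 are made up for the example. It also assumes a TensorFlow install that provides tensorflow.compat.v1, and it uses tf.global_variables_initializer(), the later replacement for the deprecated tf.initialize_all_variables().

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x with the v1 compat API

tf.disable_eager_execution()  # graph mode, as in the question

# Hypothetical tiny model: 3 input features, one linear output.
x = tf.placeholder(tf.float32, shape=[None, 3])
y_ = tf.placeholder(tf.float32, shape=[None])

W = tf.Variable(tf.zeros([3, 1]))
b = tf.Variable(tf.zeros([1]))
# Squeeze so that y has shape [batch], matching y_.
y = tf.squeeze(tf.matmul(x, W) + b, axis=1)
loss = tf.reduce_sum(tf.square(y_ - y))

global_step = tf.Variable(0, trainable=False)

# minimize() creates the momentum accumulator variables.
train_step = tf.train.MomentumOptimizer(0.01, 0.9).minimize(
    loss, global_step=global_step)

# Build the initializer only after every variable exists,
# so it also covers the optimizer's accumulators.
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    data = np.random.rand(8, 3).astype(np.float32)  # made-up inputs
    target = data.sum(axis=1)                       # made-up targets
    for _ in range(5):
        sess.run(train_step, feed_dict={x: data, y_: target})
    final_step = sess.run(global_step)
```

    If the init line is moved above the MomentumOptimizer line, the first sess.run(train_step, ...) should fail with the same FailedPreconditionError as in the question.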