python machine-learning tensorflow matrix-factorization

Memory overflow when calling eval() in TensorFlow


I'm using TensorFlow to implement a simple matrix factorization algorithm. Every step seems to run correctly, but at the last step, where I call eval() on a Tensor to store its value, the program hangs and just consumes more and more memory. Is there something wrong in my code? I'm a beginner in TensorFlow and I don't know where the problem is. My code is below.

import numpy as np
import tensorflow as tf

class model(object):
    def __init__(self, D, Q, stepsize = 6e-7, max_iter = 200, inner_maxiter = 50, dim = 200, verbose = 5):
        self.D = tf.constant(D, dtype = tf.float32)
        self.Q = tf.constant(Q, dtype = tf.float32)
        self.rank = dim
        self.stepsize = stepsize
        self.max_iter = max_iter
        self.inner_maxiter = inner_maxiter
        self.verbose = verbose

        self.W = tf.Variable((np.random.rand(self.rank, sample_num)), dtype = tf.float32, name = 'W')
        self.C = tf.Variable((np.random.rand(context_num, self.rank)), dtype = tf.float32, name = 'C')

    def _run(self, sess):
        Q = self.Q
        D = self.D
        W = self.W
        C = self.C

        for i in xrange(self.max_iter):
            if (i + 1) % 2 == 1:
                for j in xrange(self.inner_maxiter):
                    ED = tf.transpose(Q) * (1.0 / (1.0 + tf.exp(- tf.matmul(C, W))))
                    recons = D - ED
                    W_grad = tf.matmul(tf.transpose(C), recons)
                    W = W + self.stepsize * W_grad
            else:
                for j in xrange(self.inner_maxiter):
                    ED = tf.transpose(Q) * (1.0 / (1.0 + tf.exp(- tf.matmul(C, W))))
                    recons = D - ED
                    C_grad = tf.matmul(recons, tf.transpose(W))
                    C = C + self.stepsize * C_grad
            print 'epoch: %d' % i
        print W.eval()
        print C.eval()

train_epoch = model(D, Q, args.step_size, \
        args.max_iter, args.inner_maxiter, args.dim, args.verbose)
with tf.Session(config = config) as sess:
    tf.initialize_all_variables().run()
    train_epoch._run(sess)

The program hangs and keeps consuming memory at the last two lines of _run(), the W.eval() and C.eval() calls. What should I do to fix it? Can someone help?


Solution

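  • Solved. The Python loops in _run() do not execute any updates. Each pass only adds new symbolic operations to the graph, so with the default settings (200 × 50 iterations) W and C end up depending on a chain of thousands of ops, and eval() then has to evaluate that whole chain, which is most likely why it stalls while memory keeps growing. Instead, build the data flow first: define the update operations once in the initialization step, then execute them repeatedly inside the loop with sess.run().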
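
A minimal sketch of what defining the operations once and then running them in a loop might look like. It is not the original poster's final code. The names build_updates and train, the small default sizes, and the shapes (D as context_num × sample_num, Q as sample_num × context_num, inferred from the matmul calls in the question) are assumptions made for illustration. It uses tf.compat.v1 so that it can run on current TensorFlow installs, and tf.sigmoid in place of the explicit 1 / (1 + exp(-x)).

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph mode, as in the question


def build_updates(D, Q, rank, stepsize):
    """Define variables and one update op per factor, a single time."""
    context_num, sample_num = D.shape  # assumed layout of D
    D_t = tf.constant(D, dtype=tf.float32)
    Q_t = tf.constant(Q, dtype=tf.float32)
    W = tf.Variable(np.random.rand(rank, sample_num).astype(np.float32), name='W')
    C = tf.Variable(np.random.rand(context_num, rank).astype(np.float32), name='C')

    # Same expressions as in the question, but created only once.
    ED = tf.transpose(Q_t) * tf.sigmoid(tf.matmul(C, W))
    recons = D_t - ED
    W_grad = tf.matmul(tf.transpose(C), recons)
    C_grad = tf.matmul(recons, tf.transpose(W))

    # assign_add writes the new value back into the variable when run.
    update_W = tf.assign_add(W, stepsize * W_grad)
    update_C = tf.assign_add(C, stepsize * C_grad)
    return W, C, update_W, update_C


def train(D, Q, rank=4, stepsize=1e-3, max_iter=4, inner_maxiter=5):
    """Run the pre-built update ops; the loop itself adds no graph nodes."""
    graph = tf.Graph()
    with graph.as_default():
        W, C, update_W, update_C = build_updates(D, Q, rank, stepsize)
        init = tf.global_variables_initializer()

    with tf.Session(graph=graph) as sess:
        sess.run(init)
        nodes_before = len(graph.get_operations())
        for i in range(max_iter):
            # Alternate as in the question: W on even i, C on odd i.
            step = update_W if i % 2 == 0 else update_C
            for _ in range(inner_maxiter):
                sess.run(step)
        nodes_after = len(graph.get_operations())
        W_val, C_val = sess.run([W, C])
    return W_val, C_val, nodes_before, nodes_after


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    D_demo = rng.rand(3, 5)  # hypothetical data: 3 contexts, 5 samples
    Q_demo = rng.rand(5, 3)
    W_val, C_val, before, after = train(D_demo, Q_demo)
    print("graph nodes before/after training:", before, after)
```

Comparing the node count before and after the loop is one simple way to check that training no longer grows the graph. In the question's version, every pass of the inner loop creates new matmul, exp and add nodes instead.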