Tags: python-3.x, tensorflow, tensorboard

Trying to resolve training/testing summaries using tf.data


I want to write summaries during my NN training, similar to here, but all the examples I see use feed_dict rather than tf.data. My training and testing datasets have separate initializers:

self.train_init = iterator.make_initializer(train_data) # initializer for train_data
self.test_init = iterator.make_initializer(test_data)   # initializer for test_data

During training, I run the training initializer with sess.run(self.train_init), but I believe that to measure test accuracy I need to run sess.run(self.test_init). My current code is shown below:

for i in range(100):
    sess.run(self.train_init)
    total_loss = 0
    n_batches = 0
    try:
        while True:
            _, l = sess.run([self.optimizer, self.loss])
            total_loss += l
            n_batches += 1
    except tf.errors.OutOfRangeError:
        pass

    if i % 10 == 0:
        print('Avg. loss epoch {0}: {1}'.format(i, total_loss/n_batches))
        # sess.run takes all fetches as a single list
        acc, summ = sess.run([self.accuracy, self.summary_op])
        writer.add_summary(summ, i)

As it currently stands, accuracy is measured every 10 iterations, but it is computed on a training batch, not a testing batch. I want to see both the training and testing accuracy over time so I can tell whether over-fitting is occurring (good training accuracy but poor testing accuracy).

I have no idea how to do this when I'm using tf.data. How do I switch between initializers while going through 100 iterations, all the while creating the summaries I need?


Solution

  • Usually the test set is evaluated outside the training loop, for performance reasons. But if you really want to do it in situ, one solution that works well for me is to:

    1. Create two tf.data pipelines and a placeholder to switch between them.
    2. Use tf.cond() to control the flow, as in this post.

    The code might look something like:

    import tensorflow as tf  # TensorFlow 1.x graph-mode API

    with tf.name_scope('train_pipeline'):
        train_ds = tf.data.Dataset.from_tensor_slices(...)
        ...
        train_iterator = train_ds.make_initializable_iterator()
        train_init = train_iterator.initializer
        X_iterator_train = train_iterator.get_next()
    with tf.name_scope('test_pipeline'):
        test_ds = tf.data.Dataset.from_tensor_slices(...)
        ...
        test_iterator = test_ds.make_initializable_iterator()
        test_init = test_iterator.initializer
        X_iterator_test = test_iterator.get_next()

    # string placeholder that selects the input pipeline at run time
    train_or_test = tf.placeholder(tf.string, name='switch_button')
    def f1(): return X_iterator_train
    def f2(): return X_iterator_test
    # Note: ops created outside the branch functions (here both get_next() ops)
    # are evaluated on every step, whichever branch is selected.
    inputs = tf.cond(tf.equal(train_or_test, 'train'), f1, f2, name='input_cond')

    # model
    ... # use your input(IteratorGetNext) at your first layer, something like tf.nn.conv2d(inputs, ...)
    # assumption: the model defines its summaries, merged here into one op
    summary_op = tf.summary.merge_all()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        # init summary writers for two different paths
        train_writer = tf.summary.FileWriter(...)
        test_writer = tf.summary.FileWriter(...)

        for ep in range(nb_epoch):
            sess.run([train_init, test_init])
            # begin training
            for step in range(nb_batch):
                global_step = ep * nb_batch + step
                # 90% train, 10% test
                if step % 10 == 0:
                    # switch to test input pipeline; evaluate only, no weight update
                    summ = sess.run(summary_op, feed_dict={train_or_test: 'test'})
                    test_writer.add_summary(summ, global_step)
                else:
                    # switch to train input pipeline
                    _, summ = sess.run([train_op, summary_op], feed_dict={train_or_test: 'train'})
                    train_writer.add_summary(summ, global_step)
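
If you would rather keep the single reinitializable iterator from the question instead of a tf.cond switch, a rough sketch could look like the one below. It is not taken from the answer above: the toy data, the tiny linear model, and the names train_with_summaries, mean_accuracy and write_scalar are all made up for illustration, and it goes through tf.compat.v1 so that it should also run on a TensorFlow 2 install. The idea is to train for one epoch, then re-run each initializer in turn and average the accuracy over a full pass, writing the two values to separate log directories so TensorBoard can overlay the curves.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # the question uses TF 1.x style sessions


def mean_accuracy(sess, init_op, correct):
    """Re-initialize the shared iterator, then average accuracy over one full pass."""
    sess.run(init_op)
    hits = []
    try:
        while True:
            hits.append(sess.run(correct))
    except tf.errors.OutOfRangeError:
        pass  # the dataset is exhausted
    return float(np.concatenate(hits).mean())


def write_scalar(writer, tag, value, step):
    """Write one Python float as a scalar summary."""
    summary = tf1.Summary(value=[tf1.Summary.Value(tag=tag, simple_value=value)])
    writer.add_summary(summary, step)


def train_with_summaries(log_dir, epochs=3, eval_every=1):
    # Toy data (assumption): the label is 1 when the four features sum to more than 2.
    rng = np.random.RandomState(0)
    x_train = rng.rand(64, 4).astype(np.float32)
    x_test = rng.rand(32, 4).astype(np.float32)
    y_train = (x_train.sum(axis=1) > 2.0).astype(np.int64)
    y_test = (x_test.sum(axis=1) > 2.0).astype(np.int64)

    history = []
    with tf.Graph().as_default():
        train_data = tf1.data.Dataset.from_tensor_slices((x_train, y_train)).batch(16)
        test_data = tf1.data.Dataset.from_tensor_slices((x_test, y_test)).batch(16)

        # One iterator, two initializers, as in the question.
        iterator = tf1.data.Iterator.from_structure(
            tf1.data.get_output_types(train_data),
            tf1.data.get_output_shapes(train_data))
        train_init = iterator.make_initializer(train_data)
        test_init = iterator.make_initializer(test_data)
        x, y = iterator.get_next()

        # Stand-in model (assumption): a single linear layer.
        w = tf1.get_variable('w', shape=[4, 2], initializer=tf1.zeros_initializer())
        b = tf1.get_variable('b', shape=[2], initializer=tf1.zeros_initializer())
        logits = tf.matmul(x, w) + b
        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
        train_op = tf1.train.GradientDescentOptimizer(0.5).minimize(loss)
        correct = tf.cast(tf.equal(tf.argmax(logits, axis=1), y), tf.float32)

        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            train_writer = tf1.summary.FileWriter(os.path.join(log_dir, 'train'), sess.graph)
            test_writer = tf1.summary.FileWriter(os.path.join(log_dir, 'test'))
            for epoch in range(epochs):
                sess.run(train_init)  # point the iterator at the training data
                try:
                    while True:
                        sess.run(train_op)
                except tf.errors.OutOfRangeError:
                    pass
                if epoch % eval_every == 0:
                    # Evaluation only: train_op is not fetched here.
                    train_acc = mean_accuracy(sess, train_init, correct)
                    test_acc = mean_accuracy(sess, test_init, correct)
                    write_scalar(train_writer, 'accuracy', train_acc, epoch)
                    write_scalar(test_writer, 'accuracy', test_acc, epoch)
                    history.append((epoch, train_acc, test_acc))
            train_writer.close()
            test_writer.close()
    return history


if __name__ == "__main__":
    for epoch, train_acc, test_acc in train_with_summaries(tempfile.mkdtemp()):
        print('epoch {0}: train acc {1:.3f}, test acc {2:.3f}'.format(epoch, train_acc, test_acc))
```

Because both writers use the same tag ('accuracy') but different directories, pointing TensorBoard at the parent log_dir should show the train and test curves on one chart. Averaging in Python and writing the scalar by hand avoids a summary op that would only ever see a single batch.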