I made a small change to the TensorFlow MNIST tutorial. Original code (fully_connected_feed.py, lines 194-202):
checkpoint_file = os.path.join(FLAGS.log_dir, 'model.ckpt')
saver.save(sess, checkpoint_file, global_step=global_step)
# Evaluate against the training set.
print('Training Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.train)
I simply added one more evaluation:
checkpoint_file = os.path.join(FLAGS.log_dir, 'model.ckpt')
saver.save(sess, checkpoint_file, global_step=global_step)
print('Something strange:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.train)
# Evaluate against the training set.
print('Training Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.train)
The results of these evaluations are close, but not the same (the numbers vary from launch to launch):
Something strange:
Num examples: 55000 Num correct: 49218 Precision @ 1: 0.8949
Training Data Eval:
Num examples: 55000 Num correct: 49324 Precision @ 1: 0.8968
How is this possible? UPD: added a link to the TensorFlow GitHub: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/mnist
The do_eval() function in fact does have a side effect, because data_sets.train is a stateful DataSet object that contains a current _index_in_epoch member, which is advanced on each call to DataSet.next_batch() (i.e. in fill_feed_dict()).
On its own, this fact shouldn't be enough to give non-deterministic results, but there are two other details about DataSet.next_batch() that lead to the non-determinism:
1. Every time a new epoch is started, the examples are randomly shuffled.
2. When the data set reaches the end of an epoch, it resets to the start and the last num_examples % batch_size examples are discarded. Thanks to the random shuffling, a random sub-batch of examples is discarded each time, leading to the non-deterministic results.
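The two points above can be reproduced with a minimal pure-Python sketch. This is not the actual TensorFlow code; the class and helper names (ToyDataSet, evaluate) are made up for illustration, but the batching logic mirrors the behavior described: the index advances across calls, a reshuffle happens at each epoch boundary, and the tail num_examples % batch_size items of an epoch are skipped.

```python
import random

class ToyDataSet:
    """Simplified stand-in for the tutorial's DataSet (hypothetical name).

    Mirrors the relevant behavior: a persistent _index_in_epoch, a
    reshuffle at each epoch boundary, and discarded leftover examples.
    """

    def __init__(self, examples, seed=None):
        self._examples = list(examples)
        self._index_in_epoch = 0
        self._rng = random.Random(seed)
        self._rng.shuffle(self._examples)  # shuffled at the start, too

    def next_batch(self, batch_size):
        start = self._index_in_epoch
        if start + batch_size > len(self._examples):
            # End of epoch: reshuffle and restart. The leftover
            # len(examples) % batch_size items are silently skipped.
            self._rng.shuffle(self._examples)
            start = 0
            self._index_in_epoch = 0
        self._index_in_epoch = start + batch_size
        return self._examples[start:self._index_in_epoch]

def evaluate(ds, batch_size, steps):
    """Collect the examples one evaluation pass would actually see."""
    seen = []
    for _ in range(steps):
        seen.extend(ds.next_batch(batch_size))
    return seen

ds = ToyDataSet(range(10))      # 10 toy examples, batch size 3 below
first = evaluate(ds, 3, 3)      # one "evaluation pass": 9 examples seen
second = evaluate(ds, 3, 3)     # next pass reshuffles, so a different 9
print(len(first), len(second))  # 9 9 -- but *which* 9 usually differs
```

Each pass sees only 9 of the 10 examples, and the reshuffle at the epoch boundary means the discarded example differs between passes, just as the two precision numbers above differ between the two do_eval() calls.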
Given the way the code is structured (with the DataSet shared between training and testing), it's tricky to make the code deterministic.
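If determinism of the evaluation is all that's needed, one option is to bypass next_batch() for evaluation and walk the examples in a fixed order, keeping the final partial batch instead of discarding it. The helper below (evaluate_in_order is a hypothetical name, not part of the tutorial) sketches that idea in plain Python:

```python
def evaluate_in_order(examples, batch_size):
    """Deterministic evaluation sketch: iterate the data set in a
    fixed order and include the final partial batch rather than
    dropping the last num_examples % batch_size items."""
    seen = []
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        seen.extend(batch)  # in real code, run the eval op on `batch`
    return seen

print(evaluate_in_order(list(range(10)), 3))  # all 10 examples, in order
```

Because the iteration order is fixed and nothing is skipped, running this twice over the same data yields identical results, independent of any shuffling done for training.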
The DataSet class is sparsely documented, but this behavior is surprising, so I'd consider filing a GitHub issue about this problem.