
Why does a very simple port of the official Keras MNIST example to TensorFlow 2.x result in a massive drop in accuracy?


Here is the MNIST example from the Keras documentation: https://keras.io/examples/mnist_cnn/

I put it into Google Colab under TensorFlow 1.x, and it performs really well: https://colab.research.google.com/drive/15NW-lXhRUxqSCCygVxddXCo5ID7yF2iL

I made very simple changes to make it run under TensorFlow 2.x: https://colab.research.google.com/drive/1ul-eFn1XRe9ta3cu5vHchaa4DxStRda_

It completely crushes performance! Accuracy drops like a rock!

What did I do wrong?


Solution

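  • The difference is in the optimizers' default learning rates. tf.keras.optimizers.Adadelta defaults to a learning rate of 0.001, while keras.optimizers.Adadelta, which the original example uses, defaults to 1.0. Because the port does not set the learning rate explicitly, it trains with steps 1000 times smaller, so accuracy after the same number of epochs is far lower. Passing learning_rate=1.0 to tf.keras.optimizers.Adadelta should bring the two versions back in line.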

    See the documentation for keras.optimizers and tf.keras.optimizers.Adadelta for more details. In particular, the TensorFlow page notes that Adadelta needs a learning rate of 1.0 to match the original paper.
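
To see why the default matters so much, here is a minimal sketch of the Adadelta update rule in plain Python. It is not the notebook's code: the function name `adadelta_minimize`, the toy objective f(x) = x**2, and the values `rho=0.95`, `eps=1e-7`, and `steps=2000` are assumptions chosen only for illustration. The learning rate simply scales each Adadelta step, so 0.001 makes every step 1000 times smaller than 1.0.

```python
import math


def adadelta_minimize(learning_rate, steps=2000, rho=0.95, eps=1e-7, x0=1.0):
    """Minimize the toy objective f(x) = x**2 with Adadelta; return the final x.

    Hypothetical helper for illustration only (not part of Keras or TensorFlow).
    """
    x = x0
    avg_sq_grad = 0.0   # running average of squared gradients
    avg_sq_delta = 0.0  # running average of squared (unscaled) updates
    for _ in range(steps):
        grad = 2.0 * x  # derivative of x**2
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
        # Adadelta step: ratio of the two RMS terms times the gradient.
        delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * grad
        avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
        # The learning rate only rescales the step that is actually applied.
        x += learning_rate * delta
    return x


x_lr_1 = adadelta_minimize(learning_rate=1.0)        # old keras default
x_lr_small = adadelta_minimize(learning_rate=0.001)  # tf.keras default
print("lr=1.0   -> |x| =", abs(x_lr_1))
print("lr=0.001 -> |x| =", abs(x_lr_small))

# In the ported notebook, the corresponding fix would be to set the rate
# explicitly when compiling the model, for example:
#   optimizer=tf.keras.optimizers.Adadelta(learning_rate=1.0)
```

In this toy run, the lr=1.0 case ends up close to the minimum at 0, while the lr=0.001 case has barely moved from its starting point of 1.0 after the same number of steps, which mirrors the accuracy gap between the two notebooks.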