Here is the MNIST example from the Keras documentation: https://keras.io/examples/mnist_cnn/
I put it into Google Colab under TensorFlow 1.x, and it performs really well: https://colab.research.google.com/drive/15NW-lXhRUxqSCCygVxddXCo5ID7yF2iL
I made very simple changes to make it execute under TF-2.x: https://colab.research.google.com/drive/1ul-eFn1XRe9ta3cu5vHchaa4DxStRda_
It completely crushes performance! Accuracy drops like a rock!
What did I do wrong?
The difference is in the optimizers: tf.keras.optimizers.Adadelta uses a default learning rate of 0.001, while keras.optimizers.Adadelta uses a default learning rate of 1.0.
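See the documentation for keras.optimizers and tf.keras.optimizers.Adadelta for more details. In particular, the TensorFlow page mentions that Adadelta needs a learning rate of 1.0 to match the original paper. Passing learning_rate=1.0 explicitly to tf.keras.optimizers.Adadelta should bring the TF 2.x notebook back in line with the TF 1.x one.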
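
To see why the default matters so much, here is a minimal pure-Python sketch of an Adadelta-style update on a toy problem (minimising x**2 starting from x = 5). It is not the Keras or TensorFlow implementation: the function name adadelta_minimize, the toy objective, the step count, and the rho and epsilon values are all assumptions chosen for illustration. The only thing that differs between the two runs is the learning rate.

```python
import math


def adadelta_minimize(learning_rate, steps=100, x0=5.0, rho=0.95, epsilon=1e-7):
    """Minimise f(x) = x**2 with a simplified Adadelta update; return the final x.

    Hypothetical helper for illustration only, not a library function.
    """
    x = x0
    avg_sq_grad = 0.0    # running average of squared gradients
    avg_sq_update = 0.0  # running average of squared (unscaled) updates
    for _ in range(steps):
        grad = 2.0 * x  # derivative of x**2
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
        # Adadelta's self-scaled step, before the learning rate is applied
        update = grad * math.sqrt(avg_sq_update + epsilon) / math.sqrt(avg_sq_grad + epsilon)
        avg_sq_update = rho * avg_sq_update + (1.0 - rho) * update * update
        # The learning rate multiplies the step that is actually taken
        x -= learning_rate * update
    return x


x_paper = adadelta_minimize(learning_rate=1.0)    # learning rate of 1.0
x_small = adadelta_minimize(learning_rate=0.001)  # learning rate of 0.001
print("distance moved with lr=1.0:  ", 5.0 - x_paper)
print("distance moved with lr=0.001:", 5.0 - x_small)
```

In this sketch the run with learning_rate=0.001 should cover only a small fraction of the distance that the run with 1.0 covers in the same number of steps, which is consistent with a model that seems to barely train in the same number of epochs.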