I am trying to train a LeNet model on the MNIST dataset.
I downloaded the mnist_png images from GitHub (https://github.com/myleott/mnist_png), which contain over 50000 images. The model, written in Keras, follows the LeNet architecture and is meant to predict handwritten digits.
Code for loading the images:
train_ds = tf.keras.utils.image_dataset_from_directory(
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(32, 32),
    batch_size=100)
val_ds = tf.keras.utils.image_dataset_from_directory(
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(32, 32),
    batch_size=100)
test_ds = tf.keras.utils.image_dataset_from_directory(
    seed=123,
    image_size=(32, 32),
    batch_size=1000)
Found 40818 files belonging to 7 classes.
Using 32655 files for training.
Found 40818 files belonging to 7 classes.
Using 8163 files for validation.
Found 10000 files belonging to 10 classes.
Input shape = (32, 32, 3)
My model summary
Model: "sequential"
Layer (type)                  Output Shape          Param #
conv2d (Conv2D)               (None, 28, 28, 6)     456
average_pooling2d (AverageP   (None, 14, 14, 6)     0
activation (Activation)       (None, 14, 14, 6)     0
conv2d_1 (Conv2D)             (None, 10, 10, 16)    2416
average_pooling2d_1 (Averag   (None, 5, 5, 16)      0
activation_1 (Activation)     (None, 5, 5, 16)      0
conv2d_2 (Conv2D)             (None, 1, 1, 120)     48120
flatten (Flatten)             (None, 120)           0
dense (Dense)                 (None, 84)            10164
dense_1 (Dense)               (None, 10)            850
Total params: 62,006
Trainable params: 62,006
Non-trainable params: 0
Model compiled with this code
model.compile(optimizer='adam', loss=losses.sparse_categorical_crossentropy, metrics=['accuracy'])
I trained it for 10 epochs and got this output:
Epoch 1/10
327/327 [==============================] - 31s 79ms/step - loss: 0.9729 - accuracy: 0.6456 - val_loss: 0.3609 - val_accuracy: 0.8951
Epoch 2/10
327/327 [==============================] - 25s 77ms/step - loss: 0.3036 - accuracy: 0.9021 - val_loss: 0.2276 - val_accuracy: 0.9330
Epoch 3/10
327/327 [==============================] - 28s 85ms/step - loss: 0.2170 - accuracy: 0.9307 - val_loss: 0.1862 - val_accuracy: 0.9389
Epoch 4/10
327/327 [==============================] - 29s 89ms/step - loss: 0.1778 - accuracy: 0.9433 - val_loss: 0.1892 - val_accuracy: 0.9401
Epoch 5/10
327/327 [==============================] - 25s 76ms/step - loss: 0.1521 - accuracy: 0.9519 - val_loss: 0.1692 - val_accuracy: 0.9476
Epoch 6/10
327/327 [==============================] - 27s 83ms/step - loss: 0.1392 - accuracy: 0.9553 - val_loss: 0.1340 - val_accuracy: 0.9588
Epoch 7/10
327/327 [==============================] - 26s 79ms/step - loss: 0.1203 - accuracy: 0.9609 - val_loss: 0.1131 - val_accuracy: 0.9632
Epoch 8/10
327/327 [==============================] - 25s 76ms/step - loss: 0.1128 - accuracy: 0.9644 - val_loss: 0.1170 - val_accuracy: 0.9644
Epoch 9/10
327/327 [==============================] - 27s 81ms/step - loss: 0.1061 - accuracy: 0.9663 - val_loss: 0.1051 - val_accuracy: 0.9659
Epoch 10/10
327/327 [==============================] - 29s 89ms/step - loss: 0.0968 - accuracy: 0.9699 - val_loss: 0.0950 - val_accuracy: 0.9705
When I run model.evaluate(test), I get a high loss and a low accuracy.
10/10 [==============================] - 4s 200ms/step - loss: 9.2694 - accuracy: 0.0656
Is there any reason for that?
Nothing seems obviously wrong with the model or the training run. In test_ds, try setting shuffle=False. To get a clue, run model.evaluate on val_ds and check that it reproduces the validation accuracy from training. The only other thing I can think of is that something is amiss with the test data: look at a few of the images and check that their associated labels are correct.
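One label check that may be worth running first: the loader output in the question reports 7 classes for the training directory but 10 for the test directory. image_dataset_from_directory infers integer labels from the sorted subfolder names, so if the two folder lists differ, the same label index can refer to different digits in training and in testing. Below is a minimal sketch of that comparison. The helper label_mismatches is hypothetical, and the example folder names are made up, since the question does not say which 7 folders were found.

```python
def label_mismatches(train_names, test_names):
    """Map each test label index to (test class, train class at that index).

    Only indices where the two names differ are returned. A train class of
    None means no training folder was assigned that index.
    """
    mismatches = {}
    for index, test_name in enumerate(test_names):
        train_name = train_names[index] if index < len(train_names) else None
        if train_name != test_name:
            mismatches[index] = (test_name, train_name)
    return mismatches


# Hypothetical folder names: the post does not list the 7 training classes.
train_names = ["0", "1", "2", "3", "5", "7", "9"]
test_names = [str(digit) for digit in range(10)]

result = label_mismatches(train_names, test_names)
for index, (test_name, train_name) in result.items():
    print(f"label {index}: test folder {test_name!r}, training folder {train_name!r}")

# With the real datasets, pass the class_names attribute Keras sets on them:
# label_mismatches(train_ds.class_names, test_ds.class_names)
```

If this returns a non-empty dict for the real datasets, the test labels are not comparable with what the model learned, which would explain a test accuracy near or below chance despite good validation accuracy.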