Tags: python, tensorflow, keras, mnist

Wrong predictions on my own MNIST-like images


I'm trying to recognise handwritten digits with a simple architecture. On the MNIST test set the model reaches an accuracy of 0.9723.

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow import keras
from tensorflow.keras.layers import Dense, Flatten
from sklearn.model_selection import train_test_split


# loading the MNIST train/test split
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# normalizing
x_train = x_train / 255
x_test = x_test / 255

y_train_cat = keras.utils.to_categorical(y_train, 10)
y_test_cat = keras.utils.to_categorical(y_test, 10)

# creating model
model = keras.Sequential([
    Flatten(input_shape=(28, 28)),  # MNIST arrays are (28, 28), without a channel axis
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

x_train_split, x_val_split, y_train_split, y_val_split = train_test_split(x_train, y_train_cat, test_size=0.2)

model.fit(
    x_train_split,
    y_train_split,
    batch_size=32,
    epochs=6,
    validation_data=(x_val_split, y_val_split))

# saving model
model.save('mnist_model.h5')

# test
model.evaluate(x_test, y_test_cat)

But when I try to recognise my own digits (0 to 9), some of them are misclassified. [Image: my digits with the predictions above them]

This is the code I use for the predictions:

from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import mnist
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

model = load_model('mnist_model.h5')

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_test = x_test / 255
y_test_cat = keras.utils.to_categorical(y_test, 10)

model.evaluate(x_test, y_test_cat)

# project_imgs/0.png ... project_imgs/9.png
filenames = [f'project_imgs/{i}.png' for i in range(10)]

data = []
data_eds = []

for file in filenames:
    picture = Image.open(file).convert('L')
    pic_r = picture.resize((28, 28))
    pic = np.array(pic_r)
    pic = 255 - pic
    pic = pic / 255
    pic_eds = np.expand_dims(pic, axis=0)

    data.append(pic)
    data_eds.append(pic_eds)

plt.figure(figsize=(10, 5))
for i in range(10):
    ax = plt.subplot(2, 5, i+1)
    ax.set_title(f'Looks like {np.argmax(model.predict(data_eds[i]))}')

    plt.xticks([])
    plt.yticks([])

    plt.imshow(data[i], cmap=plt.cm.binary)
plt.show()

I don't understand why this is happening. Could it be because of the pictures themselves? I've noticed that MNIST digits are closer to pure black and not as grey as mine. Or is it because of the size of the digits relative to the 28x28 square?
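Both suspicions (stroke intensity and digit size or placement) can be checked numerically before changing anything. The following is only a minimal sketch: `describe_digit` and its 0.1 ink threshold are hypothetical names and values chosen for illustration, and it assumes a 2D array already scaled to [0, 1] with the digit bright on a dark background, as in the code above.

```python
import numpy as np

def describe_digit(img, ink_threshold=0.1):
    """Report peak intensity, bounding-box size and centre-of-mass offset
    for a 2D array in [0, 1] with a bright digit on a dark background."""
    img = np.asarray(img, dtype=float)
    mask = img > ink_threshold  # pixels counted as "ink" (assumed threshold)
    if not mask.any():
        return {"peak": float(img.max()), "bbox_height": 0,
                "bbox_width": 0, "offset": (0.0, 0.0)}
    rows = np.flatnonzero(mask.any(axis=1))  # rows that contain ink
    cols = np.flatnonzero(mask.any(axis=0))  # columns that contain ink
    yy, xx = np.indices(img.shape)
    total = img.sum()
    cy = (yy * img).sum() / total  # intensity-weighted centre of mass
    cx = (xx * img).sum() / total
    centre_y = (img.shape[0] - 1) / 2
    centre_x = (img.shape[1] - 1) / 2
    return {
        "peak": float(img.max()),
        "bbox_height": int(rows[-1] - rows[0] + 1),
        "bbox_width": int(cols[-1] - cols[0] + 1),
        # (vertical, horizontal) distance of the centre of mass from the frame centre
        "offset": (float(cy - centre_y), float(cx - centre_x)),
    }
```

Comparing `describe_digit(data[i])` with `describe_digit(x_test[0])` should show whether the custom images are fainter, larger, or further off-centre than a typical MNIST sample.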


Solution

  • OK, the key was how the images were prepared. With the code below I was able to recognise 9 out of 10 images; only the digit "9" was still misclassified.

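    for file in filenames:
        # start from a white 28x28 canvas, the size of an MNIST frame
        img = Image.new('RGBA', size=(28, 28), color='white')
        number = Image.open(file).convert('RGBA')
        # shrink the digit to about 20x20 and tilt it slightly;
        # Image.LANCZOS replaces Image.ANTIALIAS, which was removed in Pillow 10
        number_res = number.resize((20, 20), resample=Image.LANCZOS)\
            .rotate(6, expand=1, fillcolor='white')
        # paste with a 4-pixel margin so the digit sits near the centre
        img.paste(number_res, (4, 4))
        img = img.convert('L')
        img = np.array(img)
        img = 255 - img  # invert: MNIST digits are light on a dark background
        img = img / 255  # scale to [0, 1]
        img_eds = np.expand_dims(img, axis=0)  # add the batch dimension

        data.append(img)
        data_eds.append(img_eds)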
    

    Then I edited the "9" image in Photoshop, and after that it was recognised as well. As far as I can tell, the "9" was misclassified because of the rather large horizontal distance between the end of its tail and its loop, which made it impossible to place the digit in the centre. [Image: final results]
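The manual centring could probably be automated. The original MNIST digits were size-normalised to fit a 20x20 box and then centred in the 28x28 frame by their centre of mass, so a preprocessing step that imitates this may help with lopsided digits such as this "9". The sketch below is an illustration, not the code used above: `to_mnist_style` and the `ink_threshold` value are hypothetical, and it assumes a dark digit on a light background.

```python
import numpy as np
from PIL import Image

def to_mnist_style(picture, box=20, size=28, ink_threshold=30):
    """Crop a dark-on-light digit, scale it into a box x box area and place it
    on a size x size canvas so that its centre of mass is in the middle.
    Returns a float array in [0, 1] with the digit bright on black."""
    ink = 255 - np.array(picture.convert('L'))  # invert: digit becomes bright
    mask = ink > ink_threshold                  # assumed threshold for "ink"
    if not mask.any():
        return np.zeros((size, size))           # blank image, nothing to centre
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    crop = ink[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # scale the longer side to `box`, keeping the aspect ratio
    h, w = crop.shape
    scale = box / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    small = np.array(
        Image.fromarray(crop).resize((new_w, new_h), Image.LANCZOS),
        dtype=float)

    # intensity-weighted centre of mass of the scaled digit
    yy, xx = np.indices(small.shape)
    total = small.sum()
    cy = (yy * small).sum() / total
    cx = (xx * small).sum() / total

    # shift so the centre of mass lands on the canvas centre, clamped to fit
    top = int(round((size - 1) / 2 - cy))
    left = int(round((size - 1) / 2 - cx))
    top = min(max(top, 0), size - new_h)
    left = min(max(left, 0), size - new_w)

    canvas = np.zeros((size, size))
    canvas[top:top + new_h, left:left + new_w] = small
    return canvas / 255
```

Inside the loop this would be called as `img = to_mnist_style(Image.open(file))`, followed by the same `np.expand_dims(img, axis=0)` step as before. Whether it fixes a particular digit still depends on how closely the handwriting resembles the training data.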