python keras mnist

Neural network trained on MNIST doesn't recognize my own 28x28 pixel handwritten digits


I followed a YouTube tutorial https://www.youtube.com/watch?v=bte8Er0QhDg&t=1159s on building a neural network that recognizes handwritten digits. It is trained on the MNIST dataset, and my network recognizes those digits correctly. But when I show the network my own handwritten digits, drawn in Paint, it does not recognize them. I have tried everything from asking ChatGPT to rewriting the code and changing parameters. My image has exactly the same shape and looks identical to the MNIST samples, but the network still does not recognize my digits. Looking at my code, what is missing? Why are my digits not recognized if the network otherwise works well?

import cv2
import numpy as np
import tensorflow as tf

# My own data
model = tf.keras.models.load_model("handwritten.model")

img = cv2.imread("Digits/digit11.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_flat = gray.reshape(28, 28)

prediction = model.predict(np.expand_dims(img_flat, axis=0))
print(f"This digit is probably a {np.argmax(prediction)}")

# MNIST Data
prediction = model.predict(np.expand_dims(x_train[27], axis=0))
print(f"This digit is probably a {np.argmax(prediction)}")

Solution

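  • The problem is in how the image is prepared after reading it. cv2 returns the pixel values exactly as they are stored in the file, and a digit drawn in Paint is usually dark on a white background, while MNIST digits are bright (values near 255) on a black background (0). In other words, where MNIST has zeros your image has 255s, and vice versa, so you have to invert the values. In the code below the training data is also scaled to the range [0, 1], so your own image must be scaled the same way before calling predict.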

    import cv2
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.layers import Flatten, Dense
    
    mnist = tf.keras.datasets.mnist
    
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    
    
    x_train = x_train.astype(float)
    x_test = x_test.astype(float)
    
    x_train /= 255.0
    x_test /= 255.0
    
    model = tf.keras.models.Sequential()
    model.add(Flatten(input_shape=(28,28)))
    model.add(Dense(128, activation="relu"))
    model.add(Dense(128, activation="relu"))
    model.add(Dense(10, activation="softmax"))
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    
    model.fit(x_train, y_train, epochs=3)
    
    # My own data
    
    img = cv2.imread("Digits/digit11.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Invert the values: MNIST digits are bright on a dark background
    img_flat = 255 - gray.reshape(28, 28).astype(float)

    # Scale to [0, 1], matching the training data
    img_flat /= 255.0
    
    prediction = model.predict(np.expand_dims(img_flat, axis=0))
    print(f"This digit is probably a {np.argmax(prediction)}")
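
The inversion and scaling steps above can be wrapped in a small helper so that an image that is already white-on-black is not inverted a second time. This is only a sketch under assumptions that are not in the original answer: `prepare_digit` is a hypothetical name, the input is assumed to be a 28x28 uint8 grayscale array, and the four corner pixels are assumed to be background. The example uses a synthetic array in place of a real Paint file so that it runs with NumPy alone.

```python
import numpy as np


def prepare_digit(gray: np.ndarray) -> np.ndarray:
    """Turn a 28x28 uint8 grayscale image into MNIST-style model input.

    Hypothetical helper: assumes the four corner pixels are background.
    """
    if gray.shape != (28, 28):
        raise ValueError(f"expected a 28x28 image, got {gray.shape}")

    pixels = gray.astype(np.float32)

    # MNIST has a dark background. Estimate the background from the four
    # corner pixels and invert only if it is bright.
    corners = pixels[[0, 0, -1, -1], [0, -1, 0, -1]]
    if corners.mean() > 127:
        pixels = 255.0 - pixels

    # Scale to [0, 1] like the training data, then add the batch axis.
    return (pixels / 255.0)[np.newaxis, ...]


# Synthetic stand-in for a Paint drawing: black vertical stroke on white.
paint_like = np.full((28, 28), 255, dtype=np.uint8)
paint_like[4:24, 13:15] = 0

batch = prepare_digit(paint_like)
print(batch.shape, batch.min(), batch.max())
```

With a real file, `gray` would come from something like `cv2.imread("Digits/digit11.png", cv2.IMREAD_GRAYSCALE)`, and the returned array could be passed directly to `model.predict(batch)`. The corner check is a heuristic and would misfire if the digit touches the corners of the image.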