Scikit-learn SVM digit recognition


I want to write a program that recognizes the digit in an image. I am following the scikit-learn tutorial.

I can create and fit the SVM classifier as follows.

First, I import the libraries and load the dataset:

from scipy import misc  # needed for misc.imread / misc.imresize used further down
from sklearn import datasets, svm, metrics

digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

Second, I create the SVM model and train it with the dataset.

classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])

Then I read my own image and use predict() to recognize the digit.

Here is my image: [image]

I resize the image to (8, 8) and keep only the first color channel; it is flattened to a 1D array in the predict() call further down.

img = misc.imread("w1.jpg")
img = misc.imresize(img, (8, 8))
img = img[:, :, 0]

Finally, when I print the prediction, it returns [1]:

predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
print(predicted)

Whatever other images I use, it still returns [1]:

[image] [image]

When I print the array for the digit "9" from the built-in dataset, it looks like this: [image]

My own image of the digit "9": [image]

You can see that the non-zero values in my image are much larger.

I don't know why this happens, and I am looking for help to solve the problem.


Solution

  • My best guess is that the problem lies in your data types and array shapes.

    It looks like you are training on NumPy arrays of type np.float64 (or possibly np.float32 on 32-bit systems, I don't remember), where each image has the shape (64,).

    Meanwhile, your input image for prediction, after the resizing operation in your code, is of type uint8 and has the shape (1, 64).

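    I would first try changing the shape of your input image, since dtype conversions usually just work as you would expect. So change this line:

    predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))

    to this:

    predicted = classifier.predict(img.reshape(img.shape[0]*img.shape[1]))

    (Recent scikit-learn releases require a 2D array of shape (n_samples, n_features) in predict(), so on those versions the original (1, 64) form is the one to keep.)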

    If that doesn't fix it, you can also try casting the data type to match the training data:

    img = img.astype(digits.images.dtype)
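
A quick way to check for this kind of mismatch is to print the dtype and shape on both sides before calling predict(). This is only a rough sketch: it assumes the same digits dataset as in the question, and the all-zero img is a hypothetical stand-in for your resized image, not your real data.

```python
import numpy as np
from sklearn import datasets

# Same training matrix as in the question: one flattened 8x8 image per row.
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Hypothetical stand-in for the resized, single-channel input image.
img = np.zeros((8, 8), dtype=np.uint8)

# Compare what the classifier was trained on with what it is asked to predict.
print("training:", data.dtype, data.shape)
print("input:   ", img.dtype, img.shape)

# Cast to the training dtype and flatten to a single row of 64 features.
sample = img.astype(data.dtype).reshape(1, -1)
print("prepared:", sample.dtype, sample.shape)
```

If the dtype and the number of columns already agree after this step, the mismatch is more likely in the value range than in the type or shape.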

    I hope that helps. Debugging by proxy is a lot harder than actually sitting in front of your computer :)

    Edit: According to the scikit-learn documentation, the training data contains integer values from 0 to 16, whereas an image loaded as uint8 has values from 0 to 255. The values in your input image should be scaled to fit the same interval. (http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)
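
The rescaling could look roughly like the sketch below. The helper name to_digits_scale and its invert flag are made up for illustration. The inversion step is an assumption on my part: in the digits dataset the background is 0 and the strokes are bright, so a dark digit on a white background would probably need flipping first. The sample image is hypothetical as well.

```python
import numpy as np

def to_digits_scale(img, invert=True):
    """Map an 8x8 grayscale image with values 0..255 onto the 0..16 range.

    Hypothetical helper, not part of scikit-learn.
    """
    arr = np.asarray(img, dtype=np.float64)
    if invert:
        # Assumption: the input is a dark digit on a light background,
        # the opposite of the digits dataset.
        arr = 255.0 - arr
    scaled = arr / 255.0 * 16.0
    # Return a single row of 64 features.
    return scaled.reshape(1, -1)

# Hypothetical input: white background (255) with a dark vertical stroke (0).
img = np.full((8, 8), 255, dtype=np.uint8)
img[1:7, 4] = 0

sample = to_digits_scale(img)
print(sample.shape, sample.min(), sample.max())
```

With the classifier from the question, the call would then be classifier.predict(sample). Whether invert should be on depends on how your images were drawn, so it is worth printing the array and comparing it with one from digits.images.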