Extract number and letter features from an image

I'm writing a Python program to classify letters and numbers. I've written the classifier and I have the images for my dataset. I don't have much experience with Python or with image processing. My problem is how to build my dataset from the images I have: how do I turn them into something like an array with a fixed shape? Should I just create a NumPy array from each image, or use a color histogram? I will probably convert all images to grayscale.

I've found the link below, which classifies cats and dogs. It uses two methods to extract image features, but I don't know whether they would apply to my case.

k-nn-classifier-for-image-classification

Could anyone explain how I could extract the features of my images into a vector, for example, so I can write this data to my "dataset.data" file?

I'll use images like the one below:

Letter "e"

I've even considered resizing each image to 32x32 and creating a bitmap of 0s and 1s that represents it.

Thank you.

Solution

You would usually want to create a NumPy array to hold all your training data. It is common to arrange it in the following shape:

X_train.shape = (N, img.shape[0], img.shape[1])

where N is the number of images in the set.

This way, if you are using a single channel (grayscale), X_train[i,:,:] will hold the pixel values of the i-th image. Note that it's recommended to normalize these values, but this will depend on the model you choose to train.

Here is a quick example of how you can build such an array:

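import numpy as np
import cv2

# Assumed inputs: images_path is a list of image file paths, labels holds one
# integer label per image, and IMG_SIZE is the (height, width) tuple shared by all images.
N = len(images_path)
X = np.zeros((N, IMG_SIZE[0], IMG_SIZE[1]), dtype=np.float32)
y = np.zeros(N, dtype=np.int64)
for idx, img_path in enumerate(images_path):
  img = cv2.imread(img_path)  # returns None if the file cannot be read
  assert img is not None, img_path
  assert img.shape[:2] == IMG_SIZE
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  X[idx, :, :] = gray
  y[idx] = labels[idx]  # label of this image

# if you wish to normalize:
X = (X/255.0) - 0.5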
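
If your classifier expects one flat feature vector per image, as with the "dataset.data" file mentioned in the question, an array shaped like the one above can be flattened and written out as text. The following is only a rough sketch under several assumptions: the pixel values are scaled to [0, 1] (that is, X/255.0 without the 0.5 shift), the helper name to_feature_rows and the 0.5 threshold are made up for illustration, and random data stands in for real images. It turns each image into a row of 0s and 1s followed by its label.

```python
import numpy as np

def to_feature_rows(X, y, threshold=0.5):
    """Flatten (N, H, W) images with values in [0, 1] into rows of 0/1 pixels plus a label.

    The threshold is an illustrative choice, not a recommended value.
    """
    n = X.shape[0]
    flat = X.reshape(n, -1)                       # shape (N, H*W)
    bits = (flat > threshold).astype(np.uint8)    # 1 where the pixel is brighter than threshold
    labels = np.asarray(y, dtype=np.int64)
    return np.column_stack([bits, labels])        # shape (N, H*W + 1), label in the last column

# Stand-in data: three random 32x32 "images" with values in [0, 1].
rng = np.random.default_rng(0)
X_demo = rng.random((3, 32, 32), dtype=np.float32)
y_demo = np.array([4, 1, 7])

rows = to_feature_rows(X_demo, y_demo)

# One image per line: 1024 comma-separated pixel bits, then the label.
np.savetxt("dataset.data", rows, fmt="%d", delimiter=",")
```

Images of a different size would need to be resized first (for example with cv2.resize) so that every row has the same length. Whether a 0/1 bitmap or the raw grayscale values work better depends on the classifier, so it is worth trying both.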

There are many tutorials for digit classifiers out there, usually using the MNIST dataset as an example. Here is one example, but you should go ahead and search for more.

If you want to achieve better results, you would probably want to look into neural networks. Again, there are many tutorials out there; here is one example using TensorFlow.