I have trained a TensorFlow model for OCR using an alphabet dataset that I downloaded.
Creating Xtrain, Xtest, Ytrain and Ytest (the dataset folder contains one subfolder per letter, each with 15k images of size 32x32):
import os
from PIL import Image
from numpy import asarray
folders = os.listdir(path)
train_max = 100
test_max = 10
Xtrain = []
Ytrain = []
Xtest = []
Ytest = []
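for folder in folders:
    folder_opened = path + folder + '/'
    count = 0
    for chars in os.listdir(folder_opened):
        count += 1
        if count <= train_max:
            # first train_max images of each letter go to the training set
            image = Image.open(folder_opened + chars)
            data = asarray(image)
            Xtrain.append(data)
            Ytrain.append(folder)
        elif count > train_max and count <= train_max + test_max:
            # the next test_max images go to the test set
            image = Image.open(folder_opened + chars)
            data = asarray(image)
            Xtest.append(data)
            Ytest.append(folder)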
My training code:
import tensorflow as tf
from pandas import factorize
Xtrain = tf.keras.utils.normalize(Xtrain, axis = 1)
Xtest = tf.keras.utils.normalize(Xtest, axis = 1)
model = tf.keras.models.Sequential()
# flattens each 32x32 image into 1024 values (the layer named in the error below)
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(30, activation=tf.nn.softmax))
# factorize(...)[0] turns the folder-name labels into integer class codes
model.fit(Xtrain, factorize(Ytrain)[0], epochs=40, validation_data = (Xtest, factorize(Ytest)[0]))
This model works perfectly when predicting images that contain a single letter at size 32x32.
But for a real-life application, I need to use this model to extract the entire text from a document (e.g. a PAN card, ID card, passport, etc.).
What I have tried:
I tried to read the image using Pillow, convert it into a NumPy array, and then call model.predict on it.
image_adhar = Image.open(path + 'adhar1.jpeg')
image_adhar = asarray(image_adhar)
model.predict(image_adhar)
When doing so, I get this error:
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'tuple'> input: (<tf.Tensor 'IteratorGetNext:0' shape=(None, 500, 3) dtype=uint8>,)
Consider rewriting this model with the Functional API.
WARNING:tensorflow:Model was constructed with shape (None, 32, 32) for input KerasTensor(type_spec=TensorSpec(shape=(None, 32, 32), dtype=tf.float32, name='flatten_30_input'), name='flatten_30_input', description="created by layer 'flatten_30_input'"), but it was called on an input with incompatible shape (None, 500, 3).
ValueError: Input 0 of layer dense_94 is incompatible with the layer: expected axis -1 of input shape to have value 1024 but received input with shape (None, 1500)
Forgive me, but I am new to Keras and TensorFlow.
I know this error has something to do with the shape of the training images and the shape of the image that I passed (adhar1.jpeg). They are not the same shape (32x32 versus 500x281), but I don't know how to modify the model to accept my adhar1.jpeg image.
Since you trained the model on 32x32 images, every input you give it must have the same dimensions.
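To see where the numbers in the error message come from, here is a small NumPy-only sketch. It assumes the document photo is a 500x281 RGB image (so Pillow yields an array of shape (281, 500, 3)); the zero-filled arrays are stand-ins for real pixel data, and the variable names are only for illustration.

```python
import numpy as np

# Stand-in for asarray(Image.open(...)) on a 500x281 RGB photo:
# Pillow gives (height, width, channels).
document = np.zeros((281, 500, 3), dtype=np.uint8)

# Keras treats axis 0 as the batch axis, so each "sample" is (500, 3),
# and the Flatten layer turns that into 500 * 3 values.
per_sample_features = int(np.prod(document.shape[1:]))

# The first Dense layer was built for flattened 32x32 inputs.
expected_features = 32 * 32

# What the model can accept: one grayscale 32x32 crop
# with an explicit batch axis in front.
char_crop = np.zeros((32, 32), dtype=np.float32)
batch = char_crop[np.newaxis, ...]

print(per_sample_features, expected_features, batch.shape)
```

The 1500 versus 1024 mismatch is exactly what the ValueError reports, which is why the document has to be cut into 32x32 single-character crops before calling the model.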
Step 1: Load the input image from disk, convert it to grayscale, and blur it to reduce noise.
Step 2: Perform edge detection, find contours in the edge map, and sort the resulting contours from left to right.
Step 3: Loop over the contours, compute the bounding box of each contour, and filter out boxes that are too small or too large.
Step 4: Extract the character and threshold it so that the character appears as white (foreground) on a black background, then grab the width and height of the thresholded image.
Step 5: Resize the image to fit the model's input size, applying padding if needed.
Step 6: Run your model on all the characters found.
For more reference, you can look into: