Could anybody share ideas for a heuristic to capture the digits on a water meter using OpenCV? I have a dataset of images of different water meters (like the one below), and the task is to recognize the numbers on them (the ones showing how much water has been consumed; in this image they are 0 0 0 0 1 0 2 5).
The first task I see is to locate the contours that contain the digits. Of what I have tried so far, the best contour-finding strategy was the simplest one: the Canny edge detector followed by cv2.findContours:
import cv2
import imutils

image = cv2.imread("watercounter.jpg")
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
# Canny takes (image, threshold1, threshold2); OpenCV orders the two
# hysteresis thresholds itself, so this uses a low of 1 and a high of 100.
edged = cv2.Canny(blurred, 1, 100, apertureSize=3)
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST,
                        cv2.CHAIN_APPROX_SIMPLE)
# grab_contours handles the different return shapes of OpenCV 2.4, 3 and 4.
cnts = imutils.grab_contours(cnts)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
cv2.drawContours(image, cnts, -1, (0, 255, 0), 3)
cv2.imshow("output", image)
cv2.waitKey(0)  # keep the window open until a key is pressed
This approach produces the following image. I am now looking for a heuristic to separate the digits' contours from all the others, so that I can feed them to standard digit recognition techniques. Thank you very much for any ideas.
One way to get rid of unnecessary contours is to use HoughCircles and HoughLines. With HoughCircles you can locate the circular face of the meter. Within that area, HoughLines will pick out the straight edges of the rectangles that hold the digits, which lets you segment them from the rest.
This is one tutorial for digit recognition with KNN.
I am not sure contours are the best approach for digit recognition, because many digits (0, 4, 6, 8, 9) produce both an outer contour and one or more inner contours. Training a Haar classifier (search for "OpenCV HaarTraining Examples") will likely produce much better results.