I've been using Microsoft's Computer Vision OCR to extract text from various types of images, but I seem to have hit a bump in the road with seven-segment fonts.
It can sometimes pick them up, but it mostly gets them wrong.
I've looked around and found some alternative methods, but would rather continue using the service we already have. Any suggestions?
After a month of research and experimentation, I'm going to share my findings and solutions here in case anyone else encounters the same or a similar problem.
The Problem
I needed a reliable way to extract the temperature from multiple types of refrigeration displays. Some of these displays used a standard font that Microsoft's Computer Vision had no trouble with, while others used a seven-segment font.
Seven-segment characters are built from disconnected segments rather than continuous strokes, and the OCR does not handle them reliably out of the box. To overcome this, you need to apply some image processing techniques to join the segments before passing the image into the OCR.
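To illustrate what "joining the segments" can mean, here is a minimal sketch of binary dilation, one common morphological operation for closing small gaps. This is an assumption on my part about a suitable technique, not a record of the exact steps used in this pipeline. The `dilate` helper and the tiny grids are hypothetical; in practice a library such as OpenCV (`cv2.dilate` or `cv2.morphologyEx`) would do this on a thresholded image.

```python
def dilate(grid, radius=1):
    """Binary dilation with a square (2*radius+1) structuring element.

    grid is a list of rows containing 0 (background) or 1 (lit segment).
    Every lit pixel spreads to its neighbours, so small gaps between
    segments get filled in.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            # Paint the neighbourhood around this lit pixel, clipped to the grid.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    out[rr][cc] = 1
    return out


# Two segment pieces separated by a one-pixel gap (hypothetical data).
gapped = [[1, 1, 0, 1, 1]]
joined = dilate(gapped)
```

Dilation also thickens the strokes, so the radius has to stay small enough that neighbouring digits do not merge into each other.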
Solution Overview
Solution Breakdown
First, we pass the image into our Object Detection Model.
Input: (image)
Object Detection Output: (image)
Then we pass that image into the Classification Model to determine the display type.
Classification Output: Classification Result
Next, we perform a series of image processing techniques, including:
Since this display is classified as 'Segmented,' it then gets passed into Tesseract and analyzed using the 'LetsGoDigital' model, which is specialized for digital fonts.
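For anyone wanting to reproduce the Tesseract step, a rough sketch of calling the command-line tool from Python is below. It assumes the `tesseract` binary is installed and that a `LetsGoDigital.traineddata` file is available in the tessdata directory. The helper names, the file name `display.png`, and the choice of page segmentation mode 7 (treat the image as a single line of text) are my assumptions for illustration, not necessarily the settings used here.

```python
import subprocess


def build_tesseract_command(image_path, model="LetsGoDigital", psm=7,
                            tessdata_dir=None):
    """Build the argument list for the tesseract CLI (hypothetical helper).

    "stdout" as the output base makes tesseract print the text instead of
    writing a file. -l selects the trained model, --psm the page
    segmentation mode.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", model, "--psm", str(psm)]
    if tessdata_dir is not None:
        # Only needed when the model is not in the default tessdata location.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def run_tesseract(image_path, **options):
    """Run tesseract on a preprocessed image and return its raw text."""
    result = subprocess.run(
        build_tesseract_command(image_path, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `run_tesseract("display.png")` would return the raw text, which still needs cleaning up before it is usable as a number.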
Tesseract Output: "rawText": "- 16.-9,,,6\n\f"
After some Regex, we're left with: "value": "-16.96"
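The cleanup can be sketched roughly as follows. This is not the exact expression used in the pipeline, just one way to get from the raw text above to the value above; the function name `clean_reading` is hypothetical. It keeps digits, decimal points and minus signs, then allows only a leading minus sign and the first decimal point.

```python
import re


def clean_reading(raw_text):
    """Reduce noisy OCR output to a signed decimal string."""
    # Drop everything except digits, '.' and '-'.
    kept = re.sub(r"[^0-9.\-]", "", raw_text)
    # A minus sign only counts if it is the very first character.
    sign = "-" if kept.startswith("-") else ""
    body = kept.replace("-", "")
    # Keep the first decimal point and discard any later ones.
    whole, point, fraction = body.partition(".")
    return sign + whole + point + fraction.replace(".", "")


value = clean_reading("- 16.-9,,,6\n\f")
print(value)  # -16.96
```

A real deployment would probably also want to reject readings that fall outside a plausible temperature range, since the cleanup will happily turn any noise into a number.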
Admittedly, this process isn't providing the best results, but it's sufficient to move forward. By refining the template, input images, Custom Vision Models, and the OCR process, we can expect to see better results in the future.
It would be amazing to see Seven Segment Font natively supported by Microsoft's Computer Vision, as the current solution feels somewhat hacky. I'd prefer to continue using Computer Vision instead of Tesseract or any other OCR method, considering the nature of our application.