Getting the same quality result when using OpenCV Mat as when using Leptonica Pix when doing OCR with Tesseract
C++17, OpenCV 3.4.1, Tesseract 3.05.01, Leptonica 1.74.4, Visual Studio Community 2017, Windows 10 Pro 64-bit
I'm working with Tesseract and OCR, and have found what I think is a peculiar behaviour.
And this is my code:
#include "stdafx.h"
#include <iostream>
#include <opencv2/opencv.hpp>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>

#pragma comment(lib, "ws2_32.lib")

using namespace std;
using namespace cv;
using namespace tesseract;

void opencvVariant(string titleFile);
void leptonicaVariant(const char* titleFile);

int main()
{
    cout << "Tesseract with OpenCV and Leptonica" << endl;
    const char* titleFile = "raptor-companion-2.jpg";
    opencvVariant(titleFile);
    leptonicaVariant(titleFile);
    cout << endl;
    system("pause");
    return 0;
}

void opencvVariant(string titleFile) {
    cout << endl << "OpenCV variant..." << endl;
    TessBaseAPI ocr;
    ocr.Init(NULL, "eng");
    Mat image = imread(titleFile);
    ocr.SetImage(image.data, image.cols, image.rows, 1, image.step);
    char* outText = ocr.GetUTF8Text();
    int confidence = ocr.MeanTextConf();
    cout << "Text: " << outText << endl;
    cout << "Confidence: " << confidence << endl;
}

void leptonicaVariant(const char* titleFile) {
    cout << endl << "Leptonica variant..." << endl;
    TessBaseAPI ocr;
    ocr.Init(NULL, "eng");
    Pix *image = pixRead(titleFile);
    ocr.SetImage(image);
    char* outText = ocr.GetUTF8Text();
    int confidence = ocr.MeanTextConf();
    cout << "Text: " << outText << endl;
    cout << "Confidence: " << confidence << endl;
}
The methods opencvVariant and leptonicaVariant are basically the same, except that one uses the Mat class from OpenCV and the other uses the Pix class from Leptonica. Yet the results are quite different.
OpenCV variant...
Text: Rapton
Confidence: 68
Leptonica variant...
Text: Raptor Companion
Confidence: 83
As one can see in the output above, the Pix variant gives a much better result than the Mat variant. Since my code relies heavily on OpenCV for the computer vision before the OCR, it's essential for me that the OCR works well with OpenCV and its classes.
Why does Pix give a better result than Mat, and vice versa?
How can I make the Mat variant as efficient as the Pix variant?

OpenCV's imread function reads an image as color by default, which means the pixels are stored interleaved as BGRBGRBGR...
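To see what the SetImage call in the question hands over, here is a minimal sketch in plain C++ with no OpenCV or Tesseract dependency. The helper name `readAsOneBytePerPixel` is made up for illustration. It rests on the assumption that a consumer told "1 byte per pixel" takes the first `width` bytes of each row as gray pixels and steps `bytesPerLine` bytes between rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of OpenCV or Tesseract): mimics a consumer
// that is told the buffer holds 1 byte per pixel. It copies the first
// `width` bytes of every row and advances `bytesPerLine` bytes per row.
std::vector<std::uint8_t> readAsOneBytePerPixel(const std::vector<std::uint8_t>& data,
                                                int width, int height,
                                                std::size_t bytesPerLine) {
    assert(width > 0 && height > 0);
    assert(bytesPerLine >= static_cast<std::size_t>(width));
    assert(data.size() >= bytesPerLine * static_cast<std::size_t>(height));

    std::vector<std::uint8_t> out;
    out.reserve(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = data.data() + y * bytesPerLine;
        // Only `width` bytes are consumed, although a BGR row holds 3 * width.
        out.insert(out.end(), row, row + width);
    }
    return out;
}
```

For a 3-pixel BGR row {B0, G0, R0, B1, G1, R1, B2, G2, R2} with width 3 and 9 bytes per line, the helper returns {B0, G0, R0}. Under that assumption, only the left third of each row is read and its channel bytes are treated as three separate gray pixels, which would fit the truncated "Rapton" output.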
In your example you are treating the OpenCV image as grayscale, so there are two ways of fixing that:
Change your SetImage line to match the number of channels in the OpenCV image:
ocr.SetImage((uchar*)image.data, image.size().width, image.size().height, image.channels(), image.step1());
Or convert your OpenCV image to single-channel grayscale first, and keep passing 1 as the bytes per pixel:
cv::cvtColor(image, image, CV_BGR2GRAY);
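The second option can be illustrated without OpenCV. The sketch below converts an interleaved BGR buffer into a packed single-channel buffer using the common Rec. 601 luma weights (0.299 R + 0.587 G + 0.114 B), which are the weights OpenCV documents for its color-to-gray conversion. `bgrToGray` is a hypothetical helper, not an OpenCV or Tesseract function, and its rounding may differ slightly from cv::cvtColor.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts 8-bit interleaved BGR rows into a packed
// 8-bit gray image (one byte per pixel, no row padding).
std::vector<std::uint8_t> bgrToGray(const std::vector<std::uint8_t>& bgr,
                                    int width, int height,
                                    std::size_t bytesPerLine) {
    assert(width > 0 && height > 0);
    assert(bytesPerLine >= static_cast<std::size_t>(width) * 3);
    assert(bgr.size() >= bytesPerLine * static_cast<std::size_t>(height));

    std::vector<std::uint8_t> gray(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = bgr.data() + y * bytesPerLine;
        for (int x = 0; x < width; ++x) {
            const double b = row[3 * x + 0];
            const double g = row[3 * x + 1];
            const double r = row[3 * x + 2];
            // Rec. 601 luma weights, rounded to the nearest integer.
            const double luma = 0.114 * b + 0.587 * g + 0.299 * r;
            gray[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>(std::lround(luma));
        }
    }
    return gray;
}
```

Because the result is packed, a buffer like this would be described to SetImage with 1 byte per pixel and `width` bytes per line, which matches what the original call in the question assumed.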