Tags: c#, opencv, ocr, tesseract, emgucv

Global threshold for Emgu CV in C#


I am new to OCR and Emgu CV. I have a colored ID card that I want to process with Tesseract OCR to extract its details. I converted the color image to grayscale, then to a binary image, and passed that to Tesseract. It worked properly, though I have to filter junk data out of the returned text.

My issue is that I expect images from users to vary in contrast and lighting conditions. Is there a way to find a global threshold that produces a good binary image for all of them? I have already tried adaptive thresholding and Otsu, but neither worked for me.

My assumption is that an image is just a matrix of pixels, so is there any way to write a function that will work on any image? I cannot figure out where to start.

I am working in C# and using Tesseract for OCR. I have used the following code.

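// Let Otsu choose a threshold; the binarized result goes to source2 and the chosen value is returned.
double th = CvInvoke.Threshold(source, source2, 0, 255, ThresholdType.Otsu);
// Re-threshold the original at half the Otsu value (this overwrites source in place).
CvInvoke.Threshold(source, source, th / 2, 255, ThresholdType.Binary);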

Sample image: https://i.sstatic.net/JleRx.jpg

Please suggest an example in C# for finding a global threshold.

I am doing the following steps for OCR.

  • Gray scale
  • Threshold
  • Tesseract

Additionally, please let me know whether this is a reasonable pipeline for OCR or whether I am missing something. What else should I do to improve OCR accuracy? Any help will be highly appreciated.


Solution

  • You should use Canny edge detection (Emgu CV provides a Canny method).

    It should help your accuracy, because it finds edges from local intensity differences rather than from the brightness or contrast of the whole image.
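A minimal sketch of what this could look like in C# with Emgu CV follows. It is an illustration under assumptions, not a tested recipe for the sample image. The class and helper names (EdgePreprocessing, CannyThresholdsFromOtsu, DetectEdges) are hypothetical, the 5x5 blur is an arbitrary choice, and deriving the two Canny thresholds from the Otsu value (high = Otsu, low = half of it) is a common heuristic rather than anything prescribed by Emgu CV. Enum names such as ImreadModes.Grayscale may differ between Emgu CV versions.

```csharp
using System;
using System.Drawing;
using Emgu.CV;
using Emgu.CV.CvEnum;

public static class EdgePreprocessing
{
    // Hypothetical helper: derive Canny hysteresis thresholds from an Otsu value.
    // Heuristic assumption: high = Otsu value, low = half of it.
    public static (double Low, double High) CannyThresholdsFromOtsu(double otsu)
    {
        return (otsu / 2.0, otsu);
    }

    // Loads an image as grayscale, smooths it, and returns its Canny edge map.
    // The caller owns the returned Mat and should dispose it.
    public static Mat DetectEdges(string path)
    {
        using (Mat gray = CvInvoke.Imread(path, ImreadModes.Grayscale))
        using (Mat blurred = new Mat())
        using (Mat scratch = new Mat())
        {
            if (gray.IsEmpty)
                throw new ArgumentException("Could not load image: " + path);

            // Light smoothing so sensor noise does not show up as edges.
            CvInvoke.GaussianBlur(gray, blurred, new Size(5, 5), 0);

            // Run Otsu only to obtain a per-image threshold value;
            // the binarized output in 'scratch' is discarded.
            double otsu = CvInvoke.Threshold(
                blurred, scratch, 0, 255,
                ThresholdType.Binary | ThresholdType.Otsu);

            var (low, high) = CannyThresholdsFromOtsu(otsu);

            Mat edges = new Mat();
            CvInvoke.Canny(blurred, edges, low, high);
            return edges;
        }
    }
}
```

A possible use, with a placeholder file name: call EdgePreprocessing.DetectEdges("id-card.jpg"), save the result with CvInvoke.Imwrite, and inspect it before passing anything to Tesseract. Whether the edge map reads better than a binarized image has to be checked on your own samples.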