c# .net ocr tesseract tessnet2

Tessnet2 returning only one character


I'm writing an application that reads an image and extracts the text from it. For testing purposes I'm passing in an image with 6 characters. Here is my code:

Bitmap image = new Bitmap("eurotext.tif");
tessnet2.Tesseract ocr = new tessnet2.Tesseract();
ocr.SetVariable("tessedit_char_whitelist", "abcdefghijklmopqrstuvwxyz0123456789"); // Restrict recognition to these characters
ocr.Init(null, "eng", false); // To use correct tessdata
List<tessnet2.Word> result = ocr.DoOCR(image, Rectangle.Empty);
foreach (tessnet2.Word word in result)
    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);

The result is 100 : ~

The second time I ran it, it returned:

100 : -

Please help! Thanks.


Solution

  • Try a bigger picture. I got "~" as the result a few times when I started with tessnet2. After I used a bigger picture (the text size should be more than 12), the program worked fine.

    To enlarge the picture and try different sizes, you can use a trackbar and the following code:

    C#

            // Scale the image to the height chosen on the trackbar, keeping the aspect ratio.
            Bitmap originalImage = new Bitmap(imagePath, true);
            double neededHeight = Convert.ToDouble(trackbar1.Value);
            double factor = neededHeight / (double)originalImage.Height;
            int newWidth = Convert.ToInt32(factor * (double)originalImage.Width);
            Bitmap OCRImage = new Bitmap(originalImage, newWidth, Convert.ToInt32(neededHeight));
    

    Use the 'OCRImage' Bitmap in the 'DoOCR()' method.
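
    If you would rather not wire up a trackbar, a rough sketch like the one below tries a few fixed heights instead. The names OcrScaling, ScaledSize and ScaleToHeight are made up for this example, and the heights 200/400/800 are arbitrary guesses; only the Bitmap constructor and the DoOCR call come from the code above. It assumes "eurotext.tif" from the question sits next to the executable.

```csharp
using System;
using System.Drawing;

static class OcrScaling
{
    // Hypothetical helper (not part of tessnet2): size of an image after
    // scaling it to targetHeight while keeping the aspect ratio.
    public static Size ScaledSize(Size original, int targetHeight)
    {
        if (original.Height <= 0)
            throw new ArgumentOutOfRangeException("original");
        if (targetHeight <= 0)
            throw new ArgumentOutOfRangeException("targetHeight");

        double factor = (double)targetHeight / original.Height;
        int newWidth = Math.Max(1, (int)Math.Round(factor * original.Width));
        return new Size(newWidth, targetHeight);
    }

    // Hypothetical helper: returns a resized copy of the source bitmap.
    public static Bitmap ScaleToHeight(Bitmap source, int targetHeight)
    {
        Size size = ScaledSize(source.Size, targetHeight);
        return new Bitmap(source, size.Width, size.Height);
    }

    static void Main()
    {
        // Same file name as in the question; the heights are arbitrary examples.
        using (Bitmap original = new Bitmap("eurotext.tif"))
        {
            foreach (int height in new int[] { 200, 400, 800 })
            {
                using (Bitmap scaled = ScaleToHeight(original, height))
                {
                    Console.WriteLine("{0} x {1}", scaled.Width, scaled.Height);
                    // Pass 'scaled' to ocr.DoOCR(scaled, Rectangle.Empty) here
                    // and compare the recognized text for each height.
                }
            }
        }
    }
}
```

    Keeping the size calculation separate from the Bitmap resize makes it easy to check the arithmetic on its own before involving the OCR engine.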