c# ocr tesseract

Read .jpeg image text using C# and Tesseract


I'm trying to read the text content of an image using Tesseract. I'm using the following code for that:

try
{
    // Load the image to be recognized.
    var image = new Bitmap(@"D:\Projects\Project Docs\Oasis\20180405T105834.618.jpeg");
    var ocr = new tessnet2.Tesseract();
    //ocr.SetVariable("tessedit_char_whitelist", "0123456789");

    // The first argument is the tessdata path, the second the language.
    ocr.Init(@"D:\Projects\Project Docs\Oasis\", "eng", false);

    // Rectangle.Empty means the whole image is processed.
    var result = ocr.DoOCR(image, Rectangle.Empty);
    foreach (tessnet2.Word word in result)
    {
        Console.WriteLine(word.Text);
    }
    Console.ReadKey();
}
catch (Exception)
{
    // Nothing is caught here: the application stops before reaching this block.
    throw;
}

At the line ocr.Init(@"D:\Projects\Project Docs\Oasis\", "eng", false); the application stops without throwing any exception.


Solution

  •  ocr.Init(@"D:\Projects\Project Docs\Oasis\", "eng", false);

    In the line above, the path passed to Init must be the path of the tessdata folder inside the solution, not the folder that holds the image. I corrected the path in my application to:

    ocr.Init(@"D:\vijesh\My Projects\Tesseract_OCR-master\Tesseract_OCR-master\Content\tessdata", "eng", false);
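
    Since a wrong path made the application stop without an exception, one option is to check the folder before calling Init. The following is a minimal sketch, not part of tessnet2: the class TessdataCheck and the method LooksLikeTessdata are hypothetical names, and the check assumes that the language files in tessdata are named after the language code (for example files starting with "eng.").

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of tessnet2.
static class TessdataCheck
{
    // Returns true when the folder exists and contains at least one
    // file named "<lang>.*" (assumed naming of the language data files).
    public static bool LooksLikeTessdata(string tessdataPath, string lang)
    {
        if (string.IsNullOrEmpty(tessdataPath) || !Directory.Exists(tessdataPath))
        {
            return false;
        }

        return Directory.GetFiles(tessdataPath, lang + ".*").Length > 0;
    }

    // Possible usage before initializing the engine:
    //
    //   if (!TessdataCheck.LooksLikeTessdata(tessdataPath, "eng"))
    //       throw new DirectoryNotFoundException("No 'eng' data in: " + tessdataPath);
    //   ocr.Init(tessdataPath, "eng", false);
}
```

    With a check like this, a wrong path produces a visible exception with the offending path in the message instead of a silent exit.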