
C# IronOCR: recognizing a single-digit number


I am trying to use IronOCR to recognize Japanese text.

Single-digit numbers such as 1, 3, and 7 are not recognized. Numbers longer than one digit, such as 5,920, are read correctly.

I have read some related articles here.

Suggestions such as Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleChar; are not available to me.

I cannot be sure whether the input will always be a single digit.

Here is my code. What should I do?

using (var Input = new OcrInput(croppedImage))
{
    Input.DeNoise();
    Input.Invert();
    //Input.DeepCleanBackgroundNoise();
    var Result = Ocr.Read(Input);
    textBox1.Text = Result.Text;
    //Result.SaveAsTextFile("JapaneseText.txt");
}

Working number

Image showing the text "29,600"

Not working number

Image showing the text "2"


Solution

  • Have you tried installing the Financial language package and trying again? It can be installed with the following command:

    PM> Install-Package IronOCR.Languages.Financial
    

    You can read about it here; it is supposed to help with recognizing numbers: https://ironsoftware.com/csharp/ocr/languages/financial/

    Below is the code I tried with a single digit, and it worked for me.

    var ocr = new IronTesseract();
    ocr.Language = OcrLanguage.Financial;
    ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleChar;
    

    You can also whitelist characters so that only digits are recognized, which might help.

    ocr.Configuration.WhiteListCharacters = "0123456789";
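
Since the question says the input may or may not be a single digit, one possible approach is to read with the current settings first and retry in SingleChar mode only when nothing comes back. The sketch below is an illustration under assumptions, not part of the original answer: the helper name ReadWithFallback is hypothetical, it assumes a failed read yields empty text (if a wrong character is returned instead, a different check is needed), and it assumes the same OcrInput can be read twice. It uses only the IronTesseract, OcrInput, Read, Text, and PageSegmentationMode members already shown above.

```csharp
using IronOcr;

public static class SingleDigitFallback
{
    // Hypothetical helper, not part of IronOCR or the original answer.
    // Reads with the configured settings, then retries in single-character
    // mode if the first pass returned no text.
    public static string ReadWithFallback(IronTesseract ocr, OcrInput input)
    {
        // First pass: whatever segmentation mode is already configured.
        string text = (ocr.Read(input).Text ?? string.Empty).Trim();
        if (text.Length > 0)
        {
            return text;
        }

        // Second pass: treat the whole image as a single character.
        var previousMode = ocr.Configuration.PageSegmentationMode;
        ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleChar;
        try
        {
            return (ocr.Read(input).Text ?? string.Empty).Trim();
        }
        finally
        {
            // Restore the caller's mode so later reads are unaffected.
            ocr.Configuration.PageSegmentationMode = previousMode;
        }
    }
}
```

With the code from the question, this could be called as textBox1.Text = SingleDigitFallback.ReadWithFallback(ocr, Input); after setting ocr.Language as shown above. Note that a whitelist of "0123456789" excludes the comma, so a value such as 29,600 would presumably come back without its separator unless "," is added to the whitelist.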