I am trying to use IronOCR to recognize Japanese text.
Single-digit numbers such as 1, 3, and 7 are not recognized. Numbers such as 5,920, or anything longer than a single digit, are read correctly.
I have read some related posts here.
Suggestions such as Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleChar; are not available to me.
I also cannot be sure whether the image will contain a single digit or a longer number.
Here is my code. What should I do?
using (var Input = new OcrInput(croppedImage))
{
    Input.DeNoise();
    Input.Invert();
    //Input.DeepCleanBackgroundNoise();
    var Result = Ocr.Read(Input);
    textBox1.Text = Result.Text;
    //Result.SaveAsTextFile("JapaneseText.txt");
}
[Image: a number that is recognized correctly]
[Image: a number that is not recognized]
Have you tried installing the Financial language package and running the OCR again with the following code?
PM> Install-Package IronOCR.Languages.Financial
You can read about it here. It is supposed to help with recognizing numbers: https://ironsoftware.com/csharp/ocr/languages/financial/
Below is the code I tried with a single number and it worked for me.
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Financial;
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleChar;
You can also whitelist characters so that only digits are recognized, which might help.
ocr.Configuration.WhiteListCharacters = "0123456789";
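Since the question says the image may contain either a single digit or a longer number, one possible approach is to read with automatic segmentation first and retry in single-character mode only when nothing comes back. Below is a minimal sketch that combines the settings above. The NumberOcr class, the ReadNumber helper, and the retry-on-empty rule are my own assumptions for illustration, not part of IronOCR. It also assumes the IronOcr and IronOcr.Languages.Financial packages are installed and that the same OcrInput can be read twice.

```csharp
using IronOcr;

public static class NumberOcr
{
    // Hypothetical helper (not an IronOCR API): reads digits from an input
    // that may hold either a single digit or a longer number.
    public static string ReadNumber(OcrInput input)
    {
        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.Financial;

        // Digits only, as suggested above. Add "," here if thousands
        // separators such as the one in 5,920 need to be kept.
        ocr.Configuration.WhiteListCharacters = "0123456789";

        // First pass: automatic segmentation, which the question reports
        // already works for multi-digit numbers.
        ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
        string text = (ocr.Read(input).Text ?? string.Empty).Trim();
        if (text.Length > 0)
        {
            return text;
        }

        // Second pass (assumed fallback rule): treat the image as one character.
        ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleChar;
        return (ocr.Read(input).Text ?? string.Empty).Trim();
    }
}
```

With the code from the question, this would be called inside the existing using block, after Input.DeNoise() and Input.Invert(), as textBox1.Text = NumberOcr.ReadNumber(Input);. If the first pass returns a wrong but non-empty result for single digits, the empty-text check will not trigger the fallback, so the rule may need adjusting for your images.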