java android opencv image-processing tesseract

Threshold an image using OpenCV (Java)


I am using OpenCV in my project, and I need to convert the image below to a thresholded (binary) image.

Original Image

I tried this function:

Imgproc.threshold(imgGray, imgThreshold, 0, 255, Imgproc.THRESH_BINARY + Imgproc.THRESH_OTSU); 

But the result was not good, as you can see below:

threshold

So I tried the adaptiveThreshold function:

Imgproc.adaptiveThreshold(imgGray, imgThreshold, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, 11, 2); 

and it produced:

adaptiveThreshold

I just want a binary image with a white background and black text only, with no black areas or noise. (I would prefer not to use Photo.fastNlMeansDenoising because it takes a lot of time.) Please help me find a solution for this.

Also, I am using Tesseract for Japanese recognition, but the accuracy is not good. Do you have any suggestions for a better Japanese OCR engine, or any method to improve Tesseract's results?


Solution

  • adaptiveThreshold is the right choice here. It just needs a little tuning. With these parameters (this is C++, but you can easily translate it to Java):

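    // Load as 8-bit single-channel grayscale, which adaptiveThreshold expects.
    Mat1b gray = imread("path_to_image", IMREAD_GRAYSCALE);
    Mat1b result;
    // Per-pixel threshold = mean of the 15x15 neighborhood minus C (here 40).
    adaptiveThreshold(gray, result, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, 15, 40);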
    

    the resulting image is:

    thresholded result
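
In Java the call should presumably keep the same argument order as the one in the question, just with the mean method, block size 15 and C = 40: Imgproc.adaptiveThreshold(imgGray, imgThreshold, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 15, 40);

To see why a larger C cleans up the background, here is a rough plain-Java sketch of what mean adaptive thresholding computes. It is only an illustration, not OpenCV's actual implementation: the class name, the tiny test image, and the clamped (replicated) border handling are my own assumptions, and OpenCV's internal rounding may differ slightly.

```java
public class AdaptiveMeanThreshold {

    /**
     * Illustrative mean adaptive threshold (THRESH_BINARY style).
     * gray is an 8-bit grayscale image stored as int[row][col], values 0..255.
     * A pixel becomes 255 if it is brighter than (local mean - c), else 0.
     */
    public static int[][] threshold(int[][] gray, int blockSize, double c) {
        if (blockSize < 3 || blockSize % 2 == 0) {
            throw new IllegalArgumentException("blockSize must be odd and >= 3");
        }
        int rows = gray.length;
        int cols = gray[0].length;
        int r = blockSize / 2;
        int[][] out = new int[rows][cols];

        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                long sum = 0;
                // Sum the blockSize x blockSize neighborhood, clamping
                // coordinates at the image border (assumption of this sketch).
                for (int dy = -r; dy <= r; dy++) {
                    int yy = Math.min(rows - 1, Math.max(0, y + dy));
                    for (int dx = -r; dx <= r; dx++) {
                        int xx = Math.min(cols - 1, Math.max(0, x + dx));
                        sum += gray[yy][xx];
                    }
                }
                double mean = (double) sum / (blockSize * blockSize);
                out[y][x] = gray[y][x] > mean - c ? 255 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Light background (200), one dark "ink" pixel (40)
        // and one slightly darker noise pixel (190).
        int[][] gray = new int[5][7];
        for (int[] row : gray) {
            java.util.Arrays.fill(row, 200);
        }
        gray[2][1] = 40;   // ink
        gray[2][5] = 190;  // mild background noise

        int[][] smallC = threshold(gray, 3, 2);
        int[][] largeC = threshold(gray, 3, 40);

        // With a small C the noise pixel falls below its local threshold and
        // turns black; with a large C it stays white while the ink stays black.
        System.out.println("C=2:  noise=" + smallC[2][5] + " ink=" + smallC[2][1]);
        System.out.println("C=40: noise=" + largeC[2][5] + " ink=" + largeC[2][1]);
    }
}
```

The point of the toy example: C is subtracted from the local mean, so only pixels that are much darker than their surroundings (the text) end up black. If thin strokes start to disappear on your real image, try lowering C or changing the block size.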