Tags: java, android-studio, ocr, tesseract, text-recognition

Tesseract gives no recognition results (Android Studio; Java)


I am building an app in Android Studio that uses Tesseract OCR. My code is supposed to recognize text in images taken with the phone camera. The problem: the Tesseract function getUTF8Text() returns no result at all (null, even though the picture contains text). The program does not report any errors.

I can think of two possible causes: 1. Maybe I did not integrate Tesseract into my project properly? (The compiler does not report any problems when I use Tesseract classes in the code.) 2. Maybe the problem is in the code (a bad traineddata path?).

Main class:

private TessOCR Tess; 

//after taking picture I call:
PictureCallback pictureCallback = new PictureCallback() {
    @Override
    public void onPictureTaken(byte[] data, Camera camera) {
        Bitmap bitmap = BitmapFactory.decodeByteArray(data, 0, data.length);
        String result = Tess.getOCRResult(bitmap);

        if (result != null) Log.i(TAG, result);
        else Log.i(TAG, "NO RESULT");
    }
};

The TessOCR class finds the Tesseract traineddata file (copying it from the assets folder if it is missing) and performs the text recognition. The constructor only takes care of finding the traineddata file:

public class TessOCR {
public static final String PACKAGE_NAME = "com.example.dainius.ocr";
public static final String DATA_PATH = Environment
        .getExternalStorageDirectory().toString() + "/AndroidOCR/";
public static final String lang = "eng";

private static final String TAG = "OCR";
private TessBaseAPI mTess;

public TessOCR(AssetManager assetManager) {

    mTess = new TessBaseAPI();

    String[] paths = new String[] { DATA_PATH, DATA_PATH + "tessdata/" };

    for (String path : paths) {
        File dir = new File(path);
        if (!dir.exists()) {
            if (!dir.mkdirs()) {
                Log.v(TAG, "ERROR: Creation of directory " + path + " on sdcard failed");
                return;
            } else {
                Log.v(TAG, "Created directory " + path + " on sdcard");
            }
        }

    }

    if (!(new File(DATA_PATH + "tessdata/" + lang + ".traineddata")).exists()) {
        try {
            InputStream in = assetManager.open("tessdata/" + lang + ".traineddata");
            OutputStream out = new FileOutputStream(DATA_PATH
                    + "tessdata/" + lang + ".traineddata");

            byte[] buf = new byte[1024];
            int len;
            while ((len = in.read(buf)) > 0) {
                out.write(buf, 0, len);
            }
            in.close();
            out.close();

            Log.v(TAG, "Copied " + lang + " traineddata");
        } catch (IOException e) {
            Log.e(TAG, "Was unable to copy " + lang + " traineddata " + e.toString());
        }
    }

    mTess.setDebug(true);
    mTess.init(DATA_PATH, lang);
}

public String getOCRResult(Bitmap bitmap) {

    mTess.setImage(bitmap);
    String result = mTess.getUTF8Text();

    return result;
}

public void onDestroy() {
    if (mTess != null)
        mTess.end();
}
}

  • If this problem is caused by a bad Tesseract integration, please point me to a proper tutorial on how to integrate it. Every tutorial on the internet is different, so it is hard to tell how to do it properly.

Solution

  • The cause of my problem was that I had not requested permission to write to external storage. If anyone uses this method to extract a file from the assets folder (I got the method from this github project), make sure to add the write-external-storage permission line to your manifest (AndroidManifest.xml file):

    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
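
The failure was silent most likely because the constructor above only logs when directory creation or the copy fails, so getOCRResult() later runs without usable language data. As a rough sketch (not the code from the original project), the copy step could throw instead of logging. The class TrainedDataCopier and the method copyIfMissing are hypothetical names made up for this illustration, and the sketch uses plain java.io so it can be tried outside Android:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of tess-two or the original project.
final class TrainedDataCopier {

    /**
     * Copies source to target unless target already exists and is non-empty.
     * Returns true if a copy was made, false if the file was already in place.
     * Any failure is thrown as an IOException so the caller cannot overlook it.
     * The source stream is closed in every case.
     */
    static boolean copyIfMissing(InputStream source, File target) throws IOException {
        try (InputStream in = source) {
            if (target.isFile() && target.length() > 0) {
                return false;
            }
            File parent = target.getParentFile();
            // mkdirs() returns false when the directory cannot be created,
            // for example when the app lacks write access to that location.
            if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
                throw new IOException("Could not create directory " + parent);
            }
            try (OutputStream out = new FileOutputStream(target)) {
                byte[] buf = new byte[8192];
                int len;
                while ((len = in.read(buf)) != -1) {
                    out.write(buf, 0, len);
                }
            }
            return true;
        }
    }
}
```

In the TessOCR constructor this would presumably be called with assetManager.open("tessdata/" + lang + ".traineddata") as the source and the file under DATA_PATH + "tessdata/" as the target, before mTess.init(...). With the permission missing, the exception would then show up right away instead of a null result later.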