ios swift ocr text-recognition google-mlkit

ML Kit text recognition does not recognize non-English text


I implemented ML Kit text recognition on iOS, but it does not recognize non-English text (e.g. Arabic).

It works for English text only.

Here are the docs: https://developers.google.com/ml-kit/vision/text-recognition/ios

My code:

        let textRecognizer = TextRecognizer.textRecognizer()
        let visionImage = VisionImage(image: image)

        textRecognizer.process(visionImage) { result, error in
            guard error == nil, let result = result else { return }
            let resultText = result.text
            print("MLKit : " + resultText)
        }

Solution

  • Update: If recognition does not have to run entirely on-device (without network access), you can try the cloud-based ML Kit text recognition, which supports "100+ different languages and scripts" (see Firebase Text Recognition).

    For local inference: Google's ML Kit text recognition is based on TensorFlow Lite, which uses on-device models to recognize the text. As far as I could find, Google does not state that the bundled model works exclusively for the Latin alphabet, but that appears to be the case. So you have three options:

    1. Look for a custom TensorFlow Lite model that is trained on the Arabic alphabet.
    2. Train your own TensorFlow Lite model (see TensorFlow Lite Model Maker).
    3. Find or train a TensorFlow model (not Lite) and convert it to a TensorFlow Lite model.
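
For the cloud route mentioned in the update above, a minimal sketch might look like the following. It assumes the legacy Firebase ML Vision SDK (the `Firebase/MLVision` pod) is installed, `FirebaseApp.configure()` has already run, and the Firebase project has cloud text recognition enabled. `makeCloudOptions(languageHints:)` and `recognizeText(in:languageHints:completion:)` are hypothetical helper names, not part of the SDK. Note that the Firebase `VisionImage` is a different class from the ML Kit one used in the question, so importing both SDKs in the same file may lead to ambiguous names.

```swift
import UIKit
import Firebase  // Assumption: legacy Firebase/MLVision pod, not the standalone ML Kit SDK

/// Hypothetical helper: builds cloud recognizer options with language hints.
/// Hints are BCP-47 language codes such as "ar"; they guide detection only.
func makeCloudOptions(languageHints: [String]) -> VisionCloudTextRecognizerOptions {
    let options = VisionCloudTextRecognizerOptions()
    options.languageHints = languageHints
    return options
}

/// Hypothetical helper: runs cloud text recognition and reports either the
/// recognized text or the underlying error to the caller.
func recognizeText(in image: UIImage,
                   languageHints: [String] = ["ar"],
                   completion: @escaping (Result<String, Error>) -> Void) {
    let options = makeCloudOptions(languageHints: languageHints)
    let textRecognizer = Vision.vision().cloudTextRecognizer(options: options)
    let visionImage = VisionImage(image: image)

    // Requires network access; the image is sent to the cloud backend.
    textRecognizer.process(visionImage) { result, error in
        if let error = error {
            completion(.failure(error))
            return
        }
        // A nil result without an error is treated as "no text found".
        completion(.success(result?.text ?? ""))
    }
}
```

Calling `recognizeText(in: image) { outcome in ... }` with the `UIImage` from the question would then deliver either the recognized text or the error, rather than returning silently when something goes wrong.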