Tags: automation, ocr, rpa, intellibot, intellibot-studio

Add Japanese Support to OCR | RPA


How can I add Japanese language support to OCR? I would like to know where the language files are located and how to select them.


Solution

  • Intellibot supports the following OCR engines:

      - Tesseract 4.0 (LSTM)
      - ABBYY
      - ABBYY Cloud
      - Google Cloud
      - Microsoft Cloud
      - Amazon Cloud
      - MODI (must be installed separately on the machine)

    Tesseract 4.0 is a capable OCR engine and is free to use. By default, the OCR Text component uses Tesseract with the English language. Both the engine and the language can be changed in the OCR settings window, which opens when you double-click the title of the OCR Text component.

    To use Japanese with Tesseract, download the Japanese trained data file from the link below:

    https://github.com/tesseract-ocr/tessdata/blob/master/jpn.traineddata

    Then place the file in the following folder:

    "%localappdata%\INTELLIBOT\ed611e32-2c12-4040-a1f0-4f8184df3000\0a634a0b-d535-4343-9963-23ab0d5a8702\293745f8-12ea-4a86-be5f-8cea9576f0b5\tessdata"

    Then reopen Intellibot Studio.

    Now select "jpn" in the Language drop-down of the OCR settings window.
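
If you would rather script the download-and-copy step than do it by hand, something like the following Python sketch might work. It is only an illustration, not part of Intellibot: the helper names (`tessdata_dir`, `install_language_file`, `download_language_file`) are made up for this example, the raw download URL is assumed to be the "raw" form of the GitHub link above, and the GUID folders are copied from the path in this answer, so they may differ on your machine.

```python
import os
import shutil
import urllib.request
from pathlib import Path

# Sub-folder quoted in the answer above. The GUID segments are copied as-is
# and may be different for your installation.
TESSDATA_SUBPATH = Path(
    "INTELLIBOT",
    "ed611e32-2c12-4040-a1f0-4f8184df3000",
    "0a634a0b-d535-4343-9963-23ab0d5a8702",
    "293745f8-12ea-4a86-be5f-8cea9576f0b5",
    "tessdata",
)

# Assumption: the "raw" variant of the blob link given in the answer.
RAW_URL = "https://github.com/tesseract-ocr/tessdata/raw/master/jpn.traineddata"


def tessdata_dir(local_appdata=None):
    """Return the tessdata folder under %localappdata% (or a given base folder)."""
    base = local_appdata or os.environ.get("LOCALAPPDATA")
    if not base:
        raise RuntimeError("LOCALAPPDATA is not set; pass the base folder explicitly")
    return Path(base) / TESSDATA_SUBPATH


def install_language_file(source, target_dir):
    """Copy an already downloaded .traineddata file into the tessdata folder."""
    source = Path(source)
    if source.suffix != ".traineddata":
        raise ValueError(f"not a .traineddata file: {source}")
    target_dir = Path(target_dir)
    if not target_dir.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {target_dir}")
    # shutil.copy2 returns the path of the newly created file.
    return Path(shutil.copy2(source, target_dir))


def download_language_file(url, destination):
    """Download the trained data file to a local path (needs network access)."""
    urllib.request.urlretrieve(url, destination)
    return Path(destination)


# Possible usage on the machine that runs Intellibot Studio:
#   downloaded = download_language_file(RAW_URL, "jpn.traineddata")
#   install_language_file(downloaded, tessdata_dir())
```

The copy step deliberately fails if the tessdata folder does not exist, so a wrong GUID path shows up as an error instead of a silently created folder. Intellibot Studio still has to be reopened afterwards before "jpn" appears in the Language drop-down.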