Tesseract data language codes with country name


Tesseract updated their iOS library and training data. The training data files are named with language codes. How can I find out which language each code stands for, and which country it belongs to? I have searched Google extensively for this. Some codes are self-explanatory, but not all, e.g.

  1. asm.traineddata
  2. aze.traineddata
  3. bel.traineddata
  4. ben.traineddata
  5. bod.traineddata ....

Solution

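  • Those file names are ISO 639-2 language codes (the ISO 639-2/T or ISO 639-2/B variant). In the wiki article on ISO 639-2 codes you can find the whole table of languages and their codes, so working out which language each file belongs to should be easy. Note that these codes identify languages, not countries, so a file does not map to a single country.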
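As a rough illustration of the lookup described above, here is a minimal Python sketch that maps a training-data file name to a language name. The table `ISO_639_2_NAMES` and the helper `language_for` are hypothetical names made up for this example, not part of Tesseract; the table is a small hand-copied excerpt of the ISO 639-2 list covering only the five codes from the question, and would need to be extended from the full table.

```python
from pathlib import Path

# Hand-written excerpt of the ISO 639-2 table (assumption: only the five
# codes mentioned in the question; extend it from the full list as needed).
ISO_639_2_NAMES = {
    "asm": "Assamese",
    "aze": "Azerbaijani",
    "bel": "Belarusian",
    "ben": "Bengali",
    "bod": "Tibetan",
}


def language_for(filename):
    """Return the language name for a training-data file, or None if unknown."""
    # Take the part before the first dot, so the file extension is ignored.
    code = Path(filename).name.split(".", 1)[0].lower()
    return ISO_639_2_NAMES.get(code)


if __name__ == "__main__":
    for name in ["asm.traineddata", "aze.traineddata", "bod.traineddata"]:
        print(name, "->", language_for(name))
```

Codes missing from the table come back as `None`, which makes it easy to spot file names that still need to be looked up by hand.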