
How do I list custom fonts in tesseract-ocr/langdata/font_properties?

I am using Tesseract 4.0.0-beta.1-370-g8b64 on Ubuntu 16.04, built from source. I have a directory of font files, and according to the fonts documentation, custom fonts need to be listed in training/ and langdata/font_properties. The entries in font_properties apparently follow a particular format, but I can't find that format documented anywhere. Is there a link or a set of instructions explaining how to do it?


  • It's described in the Tesseract training wiki:

    Each line of the font_properties file is formatted as follows:

    fontname italic bold fixed serif fraktur

    where fontname is a string naming the font (no spaces allowed!), and italic, bold, fixed, serif and fraktur are all simple 0 or 1 flags indicating whether the font has the named property.


    timesitalic 1 0 0 1 0
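
If you have many fonts to add, it may help to generate and check the entries with a small script. The following is only a minimal sketch based on the format quoted above; the names FontProperties, format_entry and parse_entry are made up for illustration and are not part of Tesseract or its training tools.

```python
from typing import NamedTuple


class FontProperties(NamedTuple):
    """One font_properties entry (hypothetical helper type)."""
    fontname: str
    italic: bool
    bold: bool
    fixed: bool
    serif: bool
    fraktur: bool


def format_entry(props: FontProperties) -> str:
    """Render an entry as 'fontname italic bold fixed serif fraktur'."""
    # The quoted format does not allow whitespace in the font name.
    if not props.fontname or any(ch.isspace() for ch in props.fontname):
        raise ValueError(f"invalid font name: {props.fontname!r}")
    flags = " ".join(str(int(flag)) for flag in props[1:])
    return f"{props.fontname} {flags}"


def parse_entry(line: str) -> FontProperties:
    """Parse one line, checking that it has a name plus five 0/1 flags."""
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    name, *flags = parts
    if any(flag not in ("0", "1") for flag in flags):
        raise ValueError(f"flags must be 0 or 1: {line!r}")
    return FontProperties(name, *(flag == "1" for flag in flags))


if __name__ == "__main__":
    entry = FontProperties("timesitalic", italic=True, bold=False,
                           fixed=False, serif=True, fraktur=False)
    print(format_entry(entry))
```

Running parse_entry over every line of an existing font_properties file is a quick way to catch entries with a space in the font name or a missing flag before starting a training run.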