truetype

Character encoding for TrueType format 0 cmap tables?


The TrueType Reference Manual explains that "cmap subtable format 0" maps 8-bit character codes to glyph index values. Which encoding is used for these character codes? Are they the first 256 Unicode characters?


Solution

  • The cmap subtable format is semi-orthogonal to the encoding. As the TrueType Reference Manual explains (and, I think, the OpenType spec explains a bit more clearly), the cmap table contains structs ("EncodingRecord" in the OT spec, "subtables" in the TT Ref Manual) that each specify a platformID and an encodingID. I say "semi-orthogonal" because certain formats can be used only with certain platforms and encodings.

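    In practice, format 0 is used only for platform 1 (Macintosh) or for platform 3, encoding 0 ("Windows symbol"). The Macintosh platform uses only legacy 8-bit encodings defined for older Mac platforms, so the character codes are not, in general, the first 256 Unicode characters; they are interpreted in whatever encoding the platformID and encodingID name.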

    When creating new fonts, use format 0 only for "Windows symbol" fonts, and even that is no longer best practice: the non-standard characters can be represented using Unicode Private Use Area code points instead.
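
To make the platformID/encodingID point concrete, here is a minimal Python sketch that reads the encoding records of a raw cmap table and interprets any format 0 subtable according to its record. The function names (`read_format0_cmaps`, `unicode_to_glyph`, `build_example_cmap`) are hypothetical, the table is a synthetic one built in the snippet rather than taken from a real font, and it assumes that Python's `mac_roman` codec is an acceptable stand-in for platform 1, encoding 0 (Macintosh Roman). Other platform/encoding pairs are deliberately rejected rather than guessed.

```python
import struct


def read_format0_cmaps(cmap: bytes):
    """Return (platformID, encodingID, glyph_ids) for each format 0 subtable."""
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)
    results = []
    for i in range(num_tables):
        # Each encoding record: uint16 platformID, uint16 encodingID, uint32 offset
        platform_id, encoding_id, offset = struct.unpack_from(">HHI", cmap, 4 + 8 * i)
        (fmt,) = struct.unpack_from(">H", cmap, offset)
        if fmt != 0:
            continue
        # Format 0 layout: format, length, language (uint16 each),
        # followed by 256 one-byte glyph indices.
        glyph_ids = list(cmap[offset + 6 : offset + 6 + 256])
        results.append((platform_id, encoding_id, glyph_ids))
    return results


def unicode_to_glyph(platform_id, encoding_id, glyph_ids):
    """Map Unicode characters to glyph indices for a Mac Roman format 0 subtable."""
    if (platform_id, encoding_id) != (1, 0):
        # Assumption: only Macintosh Roman is handled in this sketch.
        raise ValueError("only platform 1 / encoding 0 is handled here")
    mapping = {}
    for code, gid in enumerate(glyph_ids):
        if gid:  # glyph 0 is the missing-glyph entry
            mapping[bytes([code]).decode("mac_roman")] = gid
    return mapping


def build_example_cmap() -> bytes:
    """Build a synthetic cmap table with one Macintosh Roman format 0 subtable."""
    glyph_ids = bytearray(256)
    glyph_ids[0x41] = 1  # 'A'
    glyph_ids[0x80] = 2  # A-diaeresis in Mac Roman
    header = struct.pack(">HH", 0, 1)             # version 0, one encoding record
    record = struct.pack(">HHI", 1, 0, 12)        # platform 1, encoding 0, offset 12
    subtable = struct.pack(">HHH", 0, 262, 0) + bytes(glyph_ids)
    return header + record + subtable


for platform_id, encoding_id, glyph_ids in read_format0_cmaps(build_example_cmap()):
    print(platform_id, encoding_id, unicode_to_glyph(platform_id, encoding_id, glyph_ids))
```

Note that character code 0x80 resolves to "Ä" (U+00C4) under Mac Roman, whereas U+0080 in Unicode is a control character, which is why the 8-bit codes cannot simply be read as the first 256 Unicode code points.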