I'm interested in knowing where font fallback fits in the font shaping/rendering stack. In other words, at what point are missing glyphs detected and how are they substituted?
I see in this document that the FontConfig tool does font fallback "based on glyph coverage transparently."
So the questions are:
Edit: I found this document which explains the "what" of FontConfig, but not the "how." Question 1 is about the "how."
To summarize: this post is really about one thing only, namely how font fallback works when glyphs are missing from a font.
Font fallback in browsers (as opposed to, say, in an OS) is based on two things: the font list declared in CSS, and the fallback algorithm of the text engine.
The CSS spec is fairly trivial in this respect: it simply gives the list of fonts using their system names, plus several possible "catch all" fonts that are in no way guaranteed to be the same from computer to computer (there is no reason to assume that `serif` maps to `Times` or `Times New Roman`, for instance).
The fallback algorithm used by text engines is entirely up to the engine, but it usually kicks in during the glyph lookup step: the text engine sees a string of code points and tries to use a font to shape that string. For each code point in the sequence, it checks whether the font has a matching glyph (by consulting the `cmap` table and its subtables), or a rule, through the GSUB mechanism, that tells the engine there may be a glyph to use only if more code points follow. (For instance, a font without glyphs for the individual letters `e`, `t` and `c`, but with a glyph for `&` and a GSUB rule that says the sequence `e`+`t`+`c` should be replaced in-text with the single glyph `&`.) When it has finished accumulating this kind of "unit of points", it shapes the text and hands the result back to whatever asked it to shape text.
If, during glyph lookup, it turns out the font doesn't contain anything that lets the engine shape a particular code point (i.e. running through the `cmap` data as well as the GSUB rules still shows "there is no glyph"), then the text engine can do two things: use the font's `.notdef` outline, defined as glyph id 0, and generally give you text with lovely empty boxes (lovingly called "tofu" by font folks) or question marks; or use fallback. When using fallback, an engine can go down a list of alternative fonts until either (a) a glyph is found, or (b) the list is exhausted, at which point the engine has to give up and will use the `.notdef` glyph. Whether the engine grabs the `.notdef` glyph from the original font or from the last font in the list is entirely up to the engine (although usually it'll go with the first font, for legibility).
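As a rough sketch of that fallback walk, assuming each font is reduced to a name and a code-point-to-glyph-id dictionary: the names `pick_font` and `shape_with_fallback` and the sample fonts are made up for illustration, and taking `.notdef` from the first font is just one of the choices an engine might make.

```python
NOTDEF = 0  # glyph id 0 is the .notdef glyph

# Each font is (name, cmap), where cmap maps code point -> glyph id.
Font = tuple[str, dict[int, int]]


def pick_font(code_point: int, fonts: list[Font]) -> tuple[str, int]:
    """Walk the font list in order and return (font name, glyph id)."""
    for name, cmap in fonts:
        gid = cmap.get(code_point)
        if gid is not None:
            return name, gid  # (a) a glyph was found
    # (b) the list is exhausted: give up and use .notdef.
    # Assumption: take it from the first font in the list.
    return fonts[0][0], NOTDEF


def shape_with_fallback(text: str, fonts: list[Font]) -> list[tuple[str, int]]:
    """Resolve every character of the text to a (font name, glyph id) pair."""
    return [pick_font(ord(ch), fonts) for ch in text]


fonts = [
    ("ToyLatin", {ord("a"): 1, ord("b"): 2}),
    ("ToyGreek", {ord("α"): 1}),
]

print(shape_with_fallback("aα?", fonts))
# [('ToyLatin', 1), ('ToyGreek', 1), ('ToyLatin', 0)]
```

This resolves one code point at a time for simplicity; an engine that also honours sequence rules would apply fallback to whole shaping units rather than to single characters.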
There is no "standard" algorithm for this defined anywhere; font fallback is basically a convenience mechanism offered by text engine authors, like how browsers come with bookmark managers (handy, and not part of any spec). As far as OpenType is concerned, there are no requirements on whether an engine should just serve up `.notdef` when a glyph is not found, or whether it should serve up the part it could shape, then find the missing glyph somewhere else, and render text that way. CSS implies that your text engine should have at least some form of font fallback, but it doesn't specify how it should work, or when it should kick in.