Displaying Unicode characters in PDF produced by Apache FOP


I have an XML file containing a list of names, some of which use characters/glyphs which are not represented in the default PDF font (Helvetica/Arial):

<name>Paul</name>
<name>你好</name>

I'm processing this file using XSLT and Apache FOP to produce a PDF file which lists the names. Currently I'm getting the following warning on the console and the Chinese characters are replaced by ## in the PDF:

Jan 30, 2016 11:30:56 AM org.apache.fop.events.LoggingEventListener processEvent
WARNING: Glyph "你" (0x4f60) not available in font "Helvetica".
Jan 30, 2016 11:30:56 AM org.apache.fop.events.LoggingEventListener processEvent
WARNING: Glyph "好" (0x597d) not available in font "Helvetica".

I've looked at the documentation and it seems to suggest that the options available are:

  1. Use an OpenType font - except this isn't supported by FOP.
  2. Switch to a different font just for the non-ASCII parts of text.

I don't want to use different fonts for each language, because there will be PDFs that have a mixture of Chinese and English, and as far as I know there's no way to work out which is which in XSLT/XSL-FO.

Is it possible to embed a single font to cover all situations? At the moment I just need English and Chinese, but I'll probably need to extend that in future.

I'm using Apache FOP 2.1 and Java 1.7.0_91 on Ubuntu. I've seen some earlier questions on a similar topic but most seem to be using a much older version of Apache FOP (e.g. 0.95 or 1.1) and I don't know if anything has been changed/improved in the meantime.

Edit: My question is different (I think) to the suggested duplicate. I've switched to using the Ubuntu Font Family using the following code in my FOP config:

<font kerning="yes" embed-url="../fonts/ubuntu/Ubuntu-R.ttf" embedding-mode="full">
   <font-triplet name="Ubuntu" style="normal" weight="normal"/>
</font>

<font kerning="yes" embed-url="../fonts/ubuntu/Ubuntu-B.ttf" embedding-mode="subset">
   <font-triplet name="Ubuntu" style="normal" weight="bold"/>
</font>

However, I'm still getting the 'glyph not available' warning:

Jan 31, 2016 10:22:59 AM org.apache.fop.events.LoggingEventListener processEvent
WARNING: Glyph "你" (0x4f60) not available in font "Ubuntu".
Jan 31, 2016 10:22:59 AM org.apache.fop.events.LoggingEventListener processEvent
WARNING: Glyph "好" (0x597d) not available in font "Ubuntu".

I know Ubuntu Regular has these two glyphs because it's my standard system font.

Edit 2: If I use GNU Unifont, the glyphs display correctly. However, it seems to be a font aimed more at console use than in documents.


Solution

  • The answer to my question is either to use GNU Unifont, which:

    1. Supports Chinese and English.
    2. Is available under a free licence.
    3. 'Just works' if you add it to the FOP config file.

    Or, alternatively, to produce separate templates for English and Chinese PDFs and use a different font for each.
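
    For reference, a minimal sketch of what the Unifont entry in the FOP config file might look like. The path ../fonts/unifont/unifont.ttf and the triplet name "Unifont" are assumptions (they simply mirror the Ubuntu entries in the question), so adjust them to wherever the font file actually lives:

```xml
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- Assumed location of the Unifont TTF file; change to match your layout. -->
        <font kerning="yes" embed-url="../fonts/unifont/unifont.ttf" embedding-mode="subset">
          <!-- "Unifont" is just the label the stylesheet will refer to. -->
          <font-triplet name="Unifont" style="normal" weight="normal"/>
          <!-- Assumption: no separate bold file is available, so the same file
               is registered for bold to avoid falling back to another font. -->
          <font-triplet name="Unifont" style="normal" weight="bold"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

    The stylesheet then needs to ask for the font by the same name, for example by setting font-family="Unifont" on fo:root (the property is inherited), and the config file has to be passed to FOP, e.g. with the -c option on the command line.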