Tags: java, fonts, pdfbox, pdfrenderer

Why do neither PDFBox nor PDFRenderer support "additional fonts"?


I have a PDF that contains the 'UniCNS-UCS2-H' font. I tried both PDFBox and PDFRenderer, and both throw an exception: Unknown encoding for 'UniCNS-UCS2-H'

The font is included in the font file mingliu.ttc (a TrueType Collection file; I don't know whether that matters).

What can I do to make these two libraries support additional fonts?


Solution

  • The encoding for a font in a PDF document is specified in the font dictionary object. 'UniCNS-UCS2-H' is the name of that encoding rather than of the font itself: it is one of the predefined CMaps for Traditional Chinese (the Unicode UCS-2 encoding of the Adobe-CNS1 character collection, horizontal writing).

    PDFBox supports only four encodings:

    1. PDFDocEncoding
    2. MacRomanEncoding
    3. StandardEncoding
    4. WinAnsiEncoding

    These names appear in the font dictionary object inside the PDF stream
    (e.g. .../Encoding/WinAnsiEncoding/...).

    When PDFBox encounters any other encoding name, it throws the exception you reported.

    For more information about fonts in PDF documents, see sections 9.5 through 9.8 of the PDF specification (ISO 32000-1).
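To make the check described above concrete, here is a minimal sketch in plain Java. It does not use the PDFBox or PDFRenderer APIs: the class `FontEncodingCheck` and its methods are hypothetical names invented for illustration, and the font dictionaries are hand-written sample strings. It assumes the `/Encoding` entry is a direct name object; a real file may instead use an indirect reference or an encoding dictionary, which this sketch does not resolve.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of PDFBox or PDFRenderer.
final class FontEncodingCheck {

    // The four encoding names listed in the answer above.
    static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "PDFDocEncoding", "MacRomanEncoding", "StandardEncoding", "WinAnsiEncoding"));

    // Matches "/Encoding" followed directly by a name object,
    // e.g. "/Encoding/WinAnsiEncoding" or "/Encoding /UniCNS-UCS2-H".
    private static final Pattern ENCODING_NAME =
            Pattern.compile("/Encoding\\s*/([^\\s/\\[\\]<>()]+)");

    // Returns the encoding name found in the dictionary text, if any.
    static Optional<String> encodingName(String fontDictionary) {
        Matcher m = ENCODING_NAME.matcher(fontDictionary);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True only when a direct encoding name is present and is one of the four above.
    static boolean isSupported(String fontDictionary) {
        return encodingName(fontDictionary).map(SUPPORTED::contains).orElse(false);
    }

    public static void main(String[] args) {
        // Hand-written sample dictionaries (assumptions, not taken from a real file).
        String simple = "<< /Type /Font /Subtype /TrueType /Encoding /WinAnsiEncoding >>";
        String cjk = "<< /Type /Font /Subtype /Type0 /Encoding /UniCNS-UCS2-H >>";
        System.out.println(encodingName(simple) + " supported: " + isSupported(simple));
        System.out.println(encodingName(cjk) + " supported: " + isSupported(cjk));
    }
}
```

Running a check like this over the font dictionaries of a failing PDF shows which encoding names fall outside the supported set, which is where the "Unknown encoding" exception comes from.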