
PDFlib copy page and use font


The PDFlib example "search and replace text" copies pages and places rectangles and replacement text on them.

Instead of loading a font from my hard disk (as the example does with int font = p.load_font(REPLACEMENT_FONT, "unicode", "");), I'd like to use the original font from the source document.

How can I achieve this?

What I tried is this:

When using int font = 0 (which is equivalent to the value of tet.fontid in line 244), PDFlib throws an exception like this:

com.pdflib.PDFlibException: Option 'font' has bad font handle 0
    at com.pdflib.pdflib.PDF_fit_textline(Native Method)
    at com.pdflib.pdflib.fit_textline(pdflib.java:1086)

What could work (and what I'm also not able to get to run)

Maybe I could read the fonts in the target document. Reading the fonts in the source document works with (int) lib.pcos_get_number(pdiHandle, "length:fonts");. Trying the same on the target document with (int) lib.pcos_get_number(outputPdfHandle, "length:fonts"); (where outputPdfHandle = p.begin_document(outfilename, "") from line 560 of the example) throws this exception:

com.pdflib.PDFlibException: Handle parameter or option of type 'PDI document' has bad value 1
    at com.pdflib.pdflib.PDF_pcos_get_number(Native Method)
    at com.pdflib.pdflib.pcos_get_number(pdflib.java:1539)

Solution

  • It is not possible to use a font from a document imported via PDI to create text in an output document. In theory, the idea of accessing the font data in the input document via pCOS functions sounds attractive: one might expect to reassemble that data into, for example, a valid TrueType font that could then be loaded with the PDFlib load_font() function.

    But that is not possible for the following reasons:

    • The font data that is stored in a PDF document is not the complete data that is stored in a TrueType font. Important TrueType tables are missing and cannot be reconstructed from the font data in the PDF file.
    • A font in a PDF file is almost always a subset that contains only the glyphs actually used in the document. So even if it were possible to use the font data from the input document, you could create new text in the output document only from the glyphs in that subset.

    Also, the fontid value provided by TET cannot be used as a font handle when creating new output with PDFlib. The fontid value is the index into the pCOS pseudo-object array fonts[], and it is unrelated to any handle used to create new output via the PDFlib API.
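
To make the subset limitation above concrete, here is a minimal sketch in plain Java. It does not call PDFlib or TET. It assumes, as a simplification, that each Unicode code point maps to exactly one glyph, and that the subset holds only the characters of the text originally set in that font. The class name SubsetGlyphCheck, its methods, and the sample strings are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SubsetGlyphCheck {

    // Collects the distinct Unicode code points of a string, in first-seen order.
    static Set<Integer> codePoints(String text) {
        Set<Integer> result = new LinkedHashSet<>();
        text.codePoints().forEach(result::add);
        return result;
    }

    // Returns the code points of 'replacement' that the subset does not contain.
    static Set<Integer> missingGlyphs(Set<Integer> subset, String replacement) {
        Set<Integer> missing = codePoints(replacement);
        missing.removeAll(subset);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical: the only text set in this font in the source document.
        Set<Integer> subset = codePoints("Invoice total");

        for (String replacement : new String[] {"vocal tone", "Quote"}) {
            Set<Integer> missing = missingGlyphs(subset, replacement);
            StringBuilder chars = new StringBuilder();
            missing.forEach(chars::appendCodePoint);
            System.out.println("\"" + replacement + "\" missing glyphs: ["
                    + chars + "]");
        }
    }
}
```

In this sketch, "vocal tone" happens to reuse only characters that occur in "Invoice total", while "Quote" needs Q and u, which a subset built from that text would not contain. Arbitrary replacement text would run into the second case quickly.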