I am using the iText library to manipulate my PDF.
I am using this example http://developers.itextpdf.com/examples/itext-action-second-edition/chapter-16#616-listusedfonts.java to find out which fonts are not embedded in the PDF.
Does the library provide any way to check where exactly in the PDF a font that is not embedded is used?
The sample referenced by the OP only inspects the pages and the form xobjects referenced from them, and it outputs information on the fonts provided in the resources of these entities.
If one needs to pinpoint where exactly which kind of font is used, one has to use a different mechanism: the parser package classes with a custom render listener. This listener can then act on the text drawing operations that use a font which is not embedded.
To find out where some resource actually is used on a page, you have to parse the page content stream and check the PDF instructions therein.
iText helps you in doing so by providing a parser framework which reads the content stream and pre-analyzes it. The results of this first analysis are forwarded to a render listener you provide.
You use the parser framework like this:
PdfReader reader = new PdfReader(SOURCE);
for (int page = from; page <= to; page++)
{
    PdfReaderContentParser parser = new PdfReaderContentParser(reader);
    RenderListener renderListener = YOUR_RENDER_LISTENER_IMPLEMENTATION;
    parser.processContent(page, renderListener);
    // after the page has been processed, probably
    // some render listener related post-processing
}
For text extraction, for example, you usually use one of the render listener implementations that come with iText, LocationTextExtractionStrategy or SimpleTextExtractionStrategy. After the page has been processed, you retrieve from the strategy the String of text it has extracted from the events of that page.
Render listeners in iText 5 have to implement the interface RenderListener:
public interface RenderListener {
    /**
     * Called when a new text block is beginning (i.e. BT)
     */
    public void beginTextBlock();

    /**
     * Called when text should be rendered
     * @param renderInfo information specifying what to render
     */
    public void renderText(TextRenderInfo renderInfo);

    /**
     * Called when a text block has ended (i.e. ET)
     */
    public void endTextBlock();

    /**
     * Called when image should be rendered
     * @param renderInfo information specifying what to render
     */
    public void renderImage(ImageRenderInfo renderInfo);
}
or ExtRenderListener, which declares some additional listener methods.
A render listener for your task, i.e. one that finds where exactly a given font is used to draw text, only needs to implement renderText non-trivially, e.g. like this:
public void renderText(TextRenderInfo renderInfo)
{
    DocumentFont documentFont = renderInfo.getFont();
    PdfDictionary font = documentFont.getFontDictionary();
    // Check the font dictionary like in your example code
    if (font FULFILLS SOME CRITERIA)
    {
        // The text
        String text = renderInfo.getText();
        // is rendered on the current page on the base line
        LineSegment baseline = renderInfo.getBaseline();
        // using a font fulfilling the given criteria
        ...
    }
}
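For the case in the question, the "FULFILLS SOME CRITERIA" check would be "is not embedded". The following is only a sketch of that decision, not iText code: to keep it runnable without the library, a font dictionary is modelled as a plain java.util.Map keyed by PDF name without the leading slash, and the class name FontEmbeddingCheck is made up for illustration. It assumes the usual PDF conventions: an embedded font program is referenced from the font descriptor as FontFile, FontFile2 or FontFile3; a Type0 font keeps its descriptor in its descendant font; a Type3 font defines its glyphs inside the PDF itself. With iText 5 the same lookups would presumably be done on the PdfDictionary from getFontDictionary() instead of on a Map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; Maps stand in for PDF dictionaries, this is not iText API. */
class FontEmbeddingCheck {

    /** Returns true if the font dictionary carries or references a font program. */
    static boolean isEmbedded(Map<?, ?> font) {
        Object subtype = font.get("Subtype");

        // Type 3 glyphs are content streams stored in the PDF itself.
        if ("Type3".equals(subtype)) {
            return true;
        }

        // A composite font delegates to its single descendant CIDFont.
        if ("Type0".equals(subtype)) {
            Object descendants = font.get("DescendantFonts");
            if (descendants instanceof List && !((List<?>) descendants).isEmpty()) {
                Object first = ((List<?>) descendants).get(0);
                return first instanceof Map && isEmbedded((Map<?, ?>) first);
            }
            return false;
        }

        // Simple fonts and CIDFonts: look for a font file in the descriptor.
        Object descriptor = font.get("FontDescriptor");
        if (!(descriptor instanceof Map)) {
            return false; // e.g. a standard font without any descriptor
        }
        Map<?, ?> d = (Map<?, ?>) descriptor;
        return d.containsKey("FontFile")
                || d.containsKey("FontFile2")
                || d.containsKey("FontFile3");
    }

    public static void main(String[] args) {
        // A standard font referenced by name only.
        Map<String, Object> helvetica = new HashMap<>();
        helvetica.put("Subtype", "Type1");
        helvetica.put("BaseFont", "Helvetica");

        // A composite font whose descendant has an embedded TrueType program.
        Map<String, Object> descriptor = new HashMap<>();
        descriptor.put("FontFile2", "<stream>");
        Map<String, Object> cidFont = new HashMap<>();
        cidFont.put("Subtype", "CIDFontType2");
        cidFont.put("FontDescriptor", descriptor);
        Map<String, Object> composite = new HashMap<>();
        composite.put("Subtype", "Type0");
        composite.put("DescendantFonts", Arrays.asList(cidFont));

        System.out.println("Helvetica embedded: " + isEmbedded(helvetica)); // false
        System.out.println("Composite embedded: " + isEmbedded(composite)); // true
    }
}
```

In the render listener, the negated result of such a check would take the place of the criteria test, so that the text and baseline are only recorded for fonts that are not embedded.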