In my production pipeline I need to generate a few hundred PDFs from HTML. For this I first convert the HTML into XHTML. Then I pass the 'cleaned' XHTML and the URI to the renderer.
Since the *.css and image files are the same for all XHTML files, I don't need to resolve them every time I process a file. I'm successfully using the following code for caching images. How can I cache the .css files as well? I want to avoid putting all the files on my classpath.
ITextRenderer renderer = new ITextRenderer();
// One user agent is shared by all documents so that cached resources are reused.
ResourceLoaderUserAgent callback = new ResourceLoaderUserAgent(renderer.getOutputDevice());
callback.setSharedContext(renderer.getSharedContext());
for (MyObject myObject : myObjectList) {
    OutputStream os = new FileOutputStream(tempFile);
    final DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    documentBuilderFactory.setValidating(false);
    DocumentBuilder builder = documentBuilderFactory.newDocumentBuilder();
    org.w3c.dom.Document document = builder.parse(myObject.getLocalPath()); // full path to .xhtml
    renderer.getSharedContext().setUserAgentCallback(callback);
    renderer.setDocument(document, myObject.getUri());
    renderer.layout();
    renderer.createPDF(os);
    os.flush();
    os.close();
}
...
private static class ResourceLoaderUserAgent extends ITextUserAgent
{
    public ResourceLoaderUserAgent(ITextOutputDevice outputDevice) {
        super(outputDevice);
    }
    protected InputStream resolveAndOpenStream(String uri) {
        InputStream is = super.resolveAndOpenStream(uri);
        System.out.println("IN resolveAndOpenStream() " + uri);
        return is;
    }
}
In case someone is facing the same problem, here is how I solved it. Since I wasn't able to cache the *.css files inside my custom user agent, I had to find another way. My solution uses Squid as an HTTP proxy to cache all frequently used resources.
Inside my custom user agent I only need to route each request through this proxy by passing a Proxy object when opening the connection.
public class ResourceLoaderUserAgent extends ITextUserAgent {
    public ResourceLoaderUserAgent(ITextOutputDevice outputDevice) {
        super(outputDevice);
    }
    protected InputStream resolveAndOpenStream(String uri) {
        HttpURLConnection connection = null;
        URL proxyUrl = null;
        try {
            // Squid listens on localhost:3128 and serves repeated requests from its cache.
            Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("localhost", 3128));
            proxyUrl = new URL(uri);
            connection = (HttpURLConnection) proxyUrl.openConnection(proxy);
            connection.connect();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        java.io.InputStream is = null;
        try {
            is = connection.getInputStream();
        } catch (java.net.MalformedURLException e) {
            XRLog.exception("bad URL given: " + uri, e);
        } catch (java.io.FileNotFoundException e) {
            XRLog.exception("item at URI " + uri + " not found");
        } catch (java.io.IOException e) {
            XRLog.exception("IO problem for " + uri, e);
        }
        // May be null if the resource could not be opened; the error was logged above.
        return is;
    }
}
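For readers who cannot run a separate proxy, the same idea can be sketched inside the JVM: keep the raw bytes of each resource in a map keyed by URI and hand out a fresh stream on every request. This is only a sketch and was not tested against Flying Saucer. ResourceByteCache and StreamOpener are hypothetical names, and it assumes the opener would delegate to something like super.resolveAndOpenStream(uri) or the proxy lookup above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: caches the raw bytes behind each URI in memory. */
class ResourceByteCache {

    /** Hypothetical callback that opens the uncached stream for a URI. */
    interface StreamOpener {
        InputStream open(String uri) throws IOException;
    }

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final StreamOpener opener;

    ResourceByteCache(StreamOpener opener) {
        this.opener = opener;
    }

    /** Returns a fresh stream over the cached bytes, or null if the resource could not be read. */
    InputStream open(String uri) {
        byte[] bytes = cache.get(uri);
        if (bytes == null) {
            bytes = readAll(uri);
            if (bytes == null) {
                return null; // failures are not cached, so a later call retries
            }
            cache.put(uri, bytes);
        }
        return new ByteArrayInputStream(bytes);
    }

    /** Reads the whole uncached stream into a byte array; returns null on any failure. */
    private byte[] readAll(String uri) {
        try (InputStream in = opener.open(uri)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            return null;
        }
    }
}
```

The map is unbounded, which is acceptable only when the set of shared stylesheets and images is small and fixed, as in the scenario described here. A long-running service would need some form of eviction or expiry.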
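Timings with and without the proxy cache:
cached:
resolving css took 74 ms
resolving images took 225 ms
uncached:
resolving css took 15466 ms
resolving images took 11236 ms
As you can see, the difference between cached and uncached resources is significant: about 15.5 s versus 74 ms for CSS, and about 11.2 s versus 225 ms for images.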
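The post does not show how these numbers were collected. One way such per-category totals might be gathered is a small accumulator wrapped around each lookup, sketched below. ResolveTimer is a hypothetical name, and treating every URI that does not end in ".css" as an image is an assumption made only for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;

/** Hypothetical helper: sums the elapsed time of resource lookups per category. */
class ResolveTimer {
    private final Map<String, Long> nanosByCategory = new TreeMap<>();

    /** Guesses a category from the URI suffix; the ".css" rule is an assumption. */
    static String categoryOf(String uri) {
        return uri.toLowerCase().endsWith(".css") ? "css" : "images";
    }

    /** Runs the lookup and adds its duration to the running total of its category. */
    synchronized <T> T time(String uri, Callable<T> lookup) throws Exception {
        long start = System.nanoTime();
        try {
            return lookup.call();
        } finally {
            nanosByCategory.merge(categoryOf(uri), System.nanoTime() - start, Long::sum);
        }
    }

    /** Total time spent in the given category, in whole milliseconds. */
    synchronized long totalMillis(String category) {
        return nanosByCategory.getOrDefault(category, 0L) / 1_000_000;
    }

    /** One line per category, in the same shape as the timings quoted above. */
    synchronized String report() {
        StringBuilder sb = new StringBuilder();
        for (String category : nanosByCategory.keySet()) {
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append("resolving ").append(category)
              .append(" took ").append(totalMillis(category)).append(" ms");
        }
        return sb.toString();
    }
}
```

Inside resolveAndOpenStream, the call that opens the stream would be passed to time(uri, ...), and report() would be printed once after the loop over all documents has finished.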