Tags: java, xhtml, pdf-generation, itext, flying-saucer

XHTML to PDF using Flying Saucer: how to cache CSS?


In my production pipeline I need to generate a few hundred PDFs from HTML. To do this, I first convert the HTML into XHTML. Then I pass the 'cleaned' XHTML and the URI to the renderer.

Since the *.css and image files are the same for all the XHTML files, I don't need to resolve them every time I process a file. I'm successfully using the following code for caching images. How can I cache the .css files as well? I want to avoid putting all the files on my classpath.

ITextRenderer renderer = new ITextRenderer();

ResourceLoaderUserAgent callback = new ResourceLoaderUserAgent(renderer.getOutputDevice());
callback.setSharedContext(renderer.getSharedContext());

for (MyObject myObject : myObjectList) {

    final DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    documentBuilderFactory.setValidating(false);
    DocumentBuilder builder = documentBuilderFactory.newDocumentBuilder();
    org.w3c.dom.Document document = builder.parse(myObject.getLocalPath()); // full path to .xhtml

    renderer.getSharedContext().setUserAgentCallback(callback);

    // try-with-resources closes the stream even if rendering fails
    try (OutputStream os = new FileOutputStream(tempFile)) {
        renderer.setDocument(document, myObject.getUri());
        renderer.layout();
        renderer.createPDF(os);
    }
}
    ...


private static class ResourceLoaderUserAgent extends ITextUserAgent
{
    public ResourceLoaderUserAgent(ITextOutputDevice outputDevice) {
        super(outputDevice);
    }

    protected InputStream resolveAndOpenStream(String uri) {
        InputStream is = super.resolveAndOpenStream(uri);
        System.out.println("IN resolveAndOpenStream() " + uri);

        return is;
    }
}

Solution

  • In case someone is facing the same problem, here is how I solved it. Since I wasn't able to cache the *.css files inside my CustomUserAgent, I had to find another way. My solution uses Squid as an HTTP proxy to cache all frequently used resources.

    Inside my CustomUserAgent I only need to route requests through this proxy by passing a Proxy object when opening the connection.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.InetSocketAddress;
    import java.net.Proxy;
    import java.net.URL;

    import org.xhtmlrenderer.pdf.ITextOutputDevice;
    import org.xhtmlrenderer.pdf.ITextUserAgent;
    import org.xhtmlrenderer.util.XRLog;

    public class ResourceLoaderUserAgent extends ITextUserAgent {

        // Squid running on the local machine, port 3128
        private static final Proxy PROXY =
                new Proxy(Proxy.Type.HTTP, new InetSocketAddress("localhost", 3128));

        public ResourceLoaderUserAgent(ITextOutputDevice outputDevice) {
            super(outputDevice);
        }

        @Override
        protected InputStream resolveAndOpenStream(String uri) {
            HttpURLConnection connection;
            try {
                // Route every resource request (CSS, images) through the caching proxy
                connection = (HttpURLConnection) new URL(uri).openConnection(PROXY);
                connection.connect();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }

            InputStream is = null;
            try {
                is = connection.getInputStream();
            } catch (java.net.MalformedURLException e) {
                XRLog.exception("bad URL given: " + uri, e);
            } catch (java.io.FileNotFoundException e) {
                XRLog.exception("item at URI " + uri + " not found");
            } catch (java.io.IOException e) {
                XRLog.exception("IO problem for " + uri, e);
            }

            // null signals that the resource could not be loaded
            return is;
        }
    }
    

    cached:

    resolving css took 74 ms
    resolving images took 225 ms
    

    uncached:

    resolving css took 15466 ms
    resolving images took 11236 ms
    

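    As you can see, the difference between cached and uncached resources is significant: with the cache, resolving the CSS is roughly 200 times faster and resolving the images roughly 50 times faster.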
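
Where running a separate proxy is not an option, a similar effect might be achievable in-process by keeping the raw bytes of each resource in memory, keyed by URI. The following is only a minimal sketch of that idea, not the approach used in the answer above and not part of Flying Saucer: the class name CachingResourceLoader, its open method, and the delegate function are all hypothetical and introduced here for illustration. It assumes the resources are small enough to hold in memory and do not change during the run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper (not a Flying Saucer class): caches resource bytes by URI. */
class CachingResourceLoader {

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    // The real loader, e.g. something that opens the URI over HTTP (assumption).
    private final Function<String, InputStream> delegate;

    CachingResourceLoader(Function<String, InputStream> delegate) {
        this.delegate = delegate;
    }

    /** Returns a fresh stream over the cached bytes, or null if the resource could not be loaded. */
    InputStream open(String uri) {
        byte[] bytes = cache.computeIfAbsent(uri, this::readFully);
        return bytes == null ? null : new ByteArrayInputStream(bytes);
    }

    private byte[] readFully(String uri) {
        try (InputStream in = delegate.apply(uri)) {
            if (in == null) {
                return null; // nothing is cached, so the next call tries again
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A custom user agent could hold one such loader for the whole batch and return loader.open(uri) from resolveAndOpenStream, with the delegate calling the original loading logic. Whether this removes the full cost seen in the uncached timings depends on where the time is actually spent, so it would need to be measured.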