PDF 'Itext User Agent' cache size and how to clear it

My code uses the Flying Saucer library with its iText user agent to generate PDF/PPT files from HTML templates.

The problem is that iText uses a cache to load images and other resources instead of calling the external URL each time. I want to know how to clear this cache, or how long it takes to refresh. I'm stuck, since the tool lacks good documentation and I can't work out how it behaves.

1- Can you please explain the role of the ReplacedElementFactory?

2- While investigating the library, I found this method:

    public ImageResource getImageResource(String uri) {
        ImageResource resource = null;
        uri = this.resolveURI(uri);
        // Cache lookup: the resolved URI is the key
        resource = (ImageResource)this._imageCache.get(uri);
        if (resource == null) {
            // Cache miss: fetch the resource and store it
            InputStream is = this.resolveAndOpenStream(uri);
            if (is != null) {
                try {
                    URL url = new URL(uri);
                    if (url.getPath() != null && url.getPath().toLowerCase().endsWith(".pdf")) {
                        PdfReader reader = this._outputDevice.getReader(url);
                        PDFAsImage image = new PDFAsImage(url);
                        Rectangle rect = reader.getPageSizeWithRotation(1);
                        image.setInitialWidth(rect.getWidth() * this._outputDevice.getDotsPerPoint());
                        image.setInitialHeight(rect.getHeight() * this._outputDevice.getDotsPerPoint());
                        resource = new ImageResource(uri, image);
                    } else {
                        Image image = Image.getInstance(this.readStream(is));
                        resource = new ImageResource(uri, new ITextFSImage(image));
                    }
                    this._imageCache.put(uri, resource);
                } catch (Exception var16) {
                    XRLog.exception("Can't read image file; unexpected problem for URI '" + uri + "'", var16);
                } finally {
                    try {
                        is.close(); // close the stream opened above
                    } catch (IOException var15) {
                        // ignored
                    }
                }
            }
        }
        // ... (the rest of the method was cut off in the original post)
    }

It contains this line:

    resource = (ImageResource)this._imageCache.get(uri);

I assume that's where it gets the image from the cache instead of looking for a newer version of the picture.

3- How often does iText refresh its cache, what is its size, how do I specify a path for it, and how is it stored?

Thank you for your help.


  • Summary: The solution the OP probably used is the one at the bottom, no. 3: disabling the cache via a command-line parameter or config file.

    This code is not from iText but from Flying Saucer itself. Since you only copied and pasted one method, it is really difficult for people to answer.

    As you can see at the top of the class, the default cache capacity is 32: private static final int IMAGE_CACHE_CAPACITY = 32;

    As you can also see in the code, the cache key is the URI: resource = (ImageResource) _imageCache.get(uriStr); and _imageCache.put(uriStr, resource);
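
    To illustrate why a cache keyed by URI returns stale images, here is a minimal sketch of a capacity-bound, in-memory cache. It is not Flying Saucer's code: the class UriKeyedCache and its methods are hypothetical names, and the least-recently-used eviction via LinkedHashMap is an assumption about how such a cache typically works.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

/** Hypothetical stand-in for an in-memory image cache keyed by URI. */
class UriKeyedCache<V> {
    private final int capacity;
    // accessOrder = true: iteration runs from least to most recently used
    private final LinkedHashMap<String, V> entries = new LinkedHashMap<>(16, 0.75f, true);

    UriKeyedCache(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the cached value for this URI, or null on a miss. */
    V get(String uri) {
        return entries.get(uri);
    }

    /** Stores a value under its URI, then evicts entries if over capacity. */
    void put(String uri, V value) {
        entries.put(uri, value);
        shrink();
    }

    /** Drops least recently used entries until the cache fits its capacity. */
    void shrink() {
        int over = entries.size() - capacity;
        Iterator<String> it = entries.keySet().iterator();
        while (it.hasNext() && over-- > 0) {
            it.next();
            it.remove();
        }
    }

    /** Empties the cache entirely. */
    void clear() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        UriKeyedCache<String> cache = new UriKeyedCache<>(32);
        cache.put("http://example.com/a.png", "image bytes, version 1");
        // The remote file may change now, but the key is unchanged,
        // so the lookup still returns the first version.
        System.out.println(cache.get("http://example.com/a.png"));
    }
}
```

    Nothing in this sketch ever expires on its own: an entry only leaves when it is evicted for space or the cache is cleared. With a capacity of 0, every entry is evicted right after it is stored, which is one way a capacity setting can act as an off switch.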

    So if the images at the remote location change but the URI stays the same, you'll get the old images. You have several options:

    1. Disable the cache
    2. Add an invalidation mechanism. This can be time-based: e.g. if you know the images on your server change every 6 hours, set the invalidation time accordingly.
    3. Add a hash to verify whether the image has changed...
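
    The time-based invalidation in option 2 could look roughly like the sketch below. Everything here is hypothetical: ExpiringCache, its loader function, and the injected clock are illustration names, not part of Flying Saucer, and wiring something like this into a custom user agent is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical cache whose entries are reloaded once they are older than a fixed TTL. */
class ExpiringCache<V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;          // injected so the expiry logic is testable
    private final Function<String, V> loader;  // fetches the resource for a URI

    ExpiringCache(long ttlMillis, LongSupplier clock, Function<String, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if it is missing or older than the TTL. */
    V get(String uri) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(uri);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry<>(loader.apply(uri), now);
            entries.put(uri, entry);
        }
        return entry.value;
    }

    public static void main(String[] args) {
        long sixHours = 6L * 60 * 60 * 1000;
        ExpiringCache<String> cache = new ExpiringCache<>(
                sixHours, System::currentTimeMillis, uri -> "fetched " + uri);
        System.out.println(cache.get("http://example.com/a.png"));
    }
}
```

    A hash check (option 3) would replace the age comparison with a comparison of content digests, at the cost of fetching the remote resource or its checksum on every lookup.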

    Update: It is still not completely clear to me what you want. Do you want to disable caching without a code change?

    1. You can change the image URI (e.g. append a random number) each time the image changes, making it unique. The advantage is that unchanged images are still served from the cache, which is faster.
    2. You can call clearImageCache(), which empties the cache, or shrinkImageCache(), which drops the oldest images (if there are more than 32).
    3. Or you can disable the cache through the Flying Saucer configuration (e.g. set the capacity to 0). The key you are looking for is xr.image.cache-capacity. You can set it in a config file (local.xhtmlrenderer.conf) or pass it as a JVM parameter: java -Dxr.image.cache-capacity=0.
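
    For option 1, a small helper can append a version token to the image URI before it goes into the HTML template. This is only a sketch: CacheBustingUri and withVersion are hypothetical names, the parameter name v is arbitrary, and the version string is assumed to be URL-safe already (e.g. a timestamp or file hash in hex).

```java
/** Hypothetical helper that makes an image URI unique per image version. */
class CacheBustingUri {

    /**
     * Appends the version as a query parameter so that a changed image
     * gets a new cache key. Assumes the version needs no URL encoding.
     */
    static String withVersion(String uri, String version) {
        // Keep any #fragment at the end of the result
        int hashIndex = uri.indexOf('#');
        String fragment = hashIndex >= 0 ? uri.substring(hashIndex) : "";
        String base = hashIndex >= 0 ? uri.substring(0, hashIndex) : uri;
        // Use '&' if the URI already has a query string
        char separator = base.indexOf('?') >= 0 ? '&' : '?';
        return base + separator + "v=" + version + fragment;
    }

    public static void main(String[] args) {
        // prints "https://example.com/logo.png?v=20240101"
        System.out.println(withVersion("https://example.com/logo.png", "20240101"));
    }
}
```

    Deriving the version from the image itself (last-modified time or a content hash) rather than from a random number keeps the URI stable while the image is unchanged, so the cache is still used for it.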