I'm hoping to get a better understanding of PDFBox here. We currently use LosslessFactory, which needs a BufferedImage; a BufferedImage, however, is held entirely in memory, uncompressed. We're dealing with PNG files upwards of 25 MB, so this ends up being very memory hungry. I've scanned the docs and can't find a solution for PNG files that doesn't rely on BufferedImage. It appears JPG can use streams directly to avoid this? Am I missing something here, or is there a genuine technical reason for PNG files requiring BufferedImage?
Yes, it is possible if you use PDImageXObject.createFromByteArray(). That method uses the non-public class PNGConverter, which reuses the compression already present in the PNG file. The method has existed since 2.0.18 and was developed by Emmeran Seehuber in PDFBOX-4341.
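For illustration, here is a minimal sketch of how that call might be used. The class name PngToPdf, the helper embedPng, and the command-line arguments are made up for this example; only the PDFBox calls themselves come from the library. Note that the compressed file is still read into a byte array in full, so what you save is the uncompressed raster, not the file size itself. Also, as far as I can tell, PDFBox falls back to the BufferedImage route for PNG variants the converter cannot handle, so treat the memory savings as likely rather than guaranteed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

public class PngToPdf {

    // Hypothetical helper: reads the (still compressed) PNG bytes and lets
    // PDFBox build the image object from them, without us creating a BufferedImage.
    static PDImageXObject embedPng(PDDocument doc, Path png) throws IOException {
        byte[] bytes = Files.readAllBytes(png);
        // The third argument is only a name used for file type detection and error messages.
        return PDImageXObject.createFromByteArray(doc, bytes, png.getFileName().toString());
    }

    // Usage (assumed): java PngToPdf input.png output.pdf
    public static void main(String[] args) throws IOException {
        try (PDDocument doc = new PDDocument()) {
            PDImageXObject image = embedPng(doc, Paths.get(args[0]));

            // One page sized to the image, at one PDF unit per pixel.
            PDPage page = new PDPage(new PDRectangle(image.getWidth(), image.getHeight()));
            doc.addPage(page);

            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.drawImage(image, 0, 0);
            }
            doc.save(args[1]);
        }
    }
}
```

If you already have the PNG as a byte array (for example from an upload), you can skip the file read and pass it to createFromByteArray() directly.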