Tags: java, jpeg, bufferedimage, tiff, javax.imageio

Combining multiple TIFF and JPEG files into a single TIFF file produces a huge file


I am trying to combine multiple TIFF and JPEG files into a single TIFF file.

When only TIFF files are combined into a single TIFF file, the result is nearly the same size as the originals combined (10 MB of TIFF files ----> 10 MB single TIFF file). This is perfect.

However, when TIFF files are combined with a large number of JPEG files into a single TIFF file, the result is huge compared to the originals (10 MB of TIFF and JPEG files produced a 200 MB TIFF file).

Is there a way to prevent the large file size when JPEG files are included?

Code used:

List<BufferedImage> bufferedImageList = new ArrayList<>();
for (String page : pages) {
    // Decode each input file (TIFF or JPEG) into an uncompressed in-memory image
    BufferedImage bufferedImage = ImageIO.read(new File(page));
    bufferedImageList.add(bufferedImage);
}

String filename = "D:\\home\\example.tif";
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIF").next();

try (ImageOutputStream output = ImageIO.createImageOutputStream(new File(filename))) {
    writer.setOutput(output);

    ImageWriteParam params = writer.getDefaultWriteParam();
    params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);

    params.setCompressionType("LZW");
    params.setCompressionQuality(1.0f);

    writer.prepareWriteSequence(null);

    for (BufferedImage image : bufferedImageList) {
        writer.writeToSequence(new IIOImage(image, null, null), params);
    }

    writer.endWriteSequence();
}

writer.dispose();


Solution

  • As @Henry wrote in the comments, using LZW (or any other lossless) compression will not achieve the same compression as lossy JPEG (for "natural" images). Going this route will in most cases result in larger files, but with the highest quality possible. In many cases, this will be acceptable.

    An alternative is to use JPEG compression in the TIFF file as well, by specifying compressionType "JPEG". You probably also need to set the compressionQuality to something lower than 1.0f (I believe 0.75f is the default for the JPEG writer in ImageIO) to get more reasonable file sizes.

    However, using JPEG compression to recompress images that were already JPEG compressed will introduce "generational loss", as JPEGs typically cannot be perfectly reconstructed. It's technically possible to keep this quality loss to a minimum by recompressing with the same tables as the original, but this is hard to achieve in practice due to minor rounding errors in encoders/decoders (i.e. your best bet is using the same encoder that wrote the original, with the exact same parameters).

    A third option is to use or create a special-purpose TIFF utility that can store the JPEG streams as-is inside the new TIFF container. These files may not be super-efficient TIFFs, as they won't support strips/tiles etc., and some non-standard inputs may still require rewriting to produce valid TIFFs. This will be a bit more work and requires some in-depth knowledge of the TIFF format, but it is certainly doable.
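
For the second option (JPEG compression inside the TIFF), a minimal sketch could look like the following. It assumes the TIFF plugin bundled with ImageIO in Java 9 or later; the class name JpegTiffSketch, the method writeJpegCompressedTiff and the quality value passed in are made up for illustration and are not part of the question's code.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.List;

// Hypothetical helper class, for illustration only.
class JpegTiffSketch {

    // Writes all pages into one multi-page TIFF, using lossy JPEG compression
    // for every page. A lower quality gives smaller files and more artifacts.
    static void writeJpegCompressedTiff(List<BufferedImage> pages, File target, float quality)
            throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("tiff").next();
        try (ImageOutputStream output = ImageIO.createImageOutputStream(target)) {
            writer.setOutput(output);

            ImageWriteParam params = writer.getDefaultWriteParam();
            params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            params.setCompressionType("JPEG");      // instead of "LZW"
            params.setCompressionQuality(quality);  // e.g. 0.7f

            writer.prepareWriteSequence(null);
            for (BufferedImage page : pages) {
                writer.writeToSequence(new IIOImage(page, null, null), params);
            }
            writer.endWriteSequence();
        } finally {
            // Release writer resources even if writing fails
            writer.dispose();
        }
    }
}
```

Calling JpegTiffSketch.writeJpegCompressedTiff(bufferedImageList, new File(filename), 0.7f) would replace the try block in the question. Note that JPEG compression is intended for 8-bit grayscale or RGB pages; bilevel or palette-based TIFF pages may be rejected by the writer or compress poorly, so those pages may need a separate ImageWriteParam with a lossless compression type such as "LZW".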