Maximum possible image compression using setCompressionQuality


Before I start: no, this is not a duplicate of "Theoretical Limit to Compression". I just need a way to compress an image by a few hundred bytes more, using Java.

I've been trying to compress a 5 kB image. The maximum compression reduces it to 980 bytes, which is quite effective, but I need it compressed down to 500 bytes or less.

Here's my code snippet:

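  import java.io.File;
  import java.io.FileOutputStream;
  import java.io.OutputStream;
  import java.util.Iterator;
  import javax.imageio.IIOImage;
  import javax.imageio.ImageIO;
  import javax.imageio.ImageWriteParam;
  import javax.imageio.ImageWriter;
  import javax.imageio.stream.ImageOutputStream;

  File compressedImageFile = new File("D:\\compress.jpg");
  OutputStream os = new FileOutputStream(compressedImageFile);

  Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpg");
  ImageWriter writer = writers.next();

  ImageOutputStream ios = ImageIO.createImageOutputStream(os);
  writer.setOutput(ios);

  ImageWriteParam param = writer.getDefaultWriteParam();
  param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
  param.setCompressionQuality(0.01f);
  writer.write(null, new IIOImage(image, null, null), param);

  // Close the ImageOutputStream before the underlying stream so cached data can still be flushed
  ios.close();
  os.close();
  writer.dispose();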

Here, image is the BufferedImage I got from the input image file. In the param.setCompressionQuality(0.01f) line, no matter how far below 0.01f I set the value, it doesn't make a difference. Is 0.01f the lower limit of compression?

If so, is there any way I could compress it further?


Solution

  • The value that is passed to the setCompressionQuality method is a float value that is intended to be between 0.0f and 1.0f. However, the compression is not "continuous" or "linear" in that sense. You cannot assume that a file with 100 bytes will have 100 bytes for a quality of 1.0, 50 bytes for a quality of 0.5 and 0 bytes for a quality of 0.0. Similarly, you cannot expect a difference in the compression between a quality of 0.000001 and 0.000002.

    The value that is passed to this method is used for internal computations, in particular for setting up the JPEG quantization tables. You may want to have a look at the classes from the javax.imageio.plugins.jpeg package, but expect that they are hard to follow without solid background knowledge. In any case, the value between 0.0 and 1.0 has to be discretized in some form. Roughly speaking: it may be converted to an int value between 0 and 255, so there may be no difference in the compression for 0.01 and 0.00, because both values will be converted to the same int value, namely 0.
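
One way to see this plateau is to encode the same image in memory at several quality values and compare the byte counts. The following is a minimal sketch: the class name JpegQualityProbe, the helper names and the synthetic test image are assumptions made for illustration, and the sizes it prints depend on the image and on the JPEG plugin shipped with your JRE.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper class for illustration, not part of the question's code.
final class JpegQualityProbe {

    /** Encodes the image as JPEG in memory and returns the number of bytes produced. */
    static int encodedSize(BufferedImage image, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the ImageOutputStream flushes all pending data into 'bytes'.
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(ios);
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.size();
    }

    /** Synthetic test image (an assumption): two gradients plus a fine pattern. */
    static BufferedImage sampleImage(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int r = x * 255 / width;
                int g = y * 255 / height;
                int b = (x * y) & 0xFF;
                img.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage img = sampleImage(64, 64);
        // Print the encoded size for a range of quality values, down to 0.0.
        for (float q : new float[] {1.0f, 0.5f, 0.1f, 0.01f, 0.001f, 0.0f}) {
            System.out.println("quality " + q + " -> " + encodedSize(img, q) + " bytes");
        }
    }
}
```

If the sizes printed for the smallest quality values are identical, the writer has mapped them to the same internal setting, and lowering the value further cannot help.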

    This may explain why there is no difference in the compression for "small" differences in the values. The reason why an image file can hardly be compressed to be arbitrarily small is that there is a limit to the compression, implied by the algorithm that is used. Of course you could create your own JPEG-like compression, where the highest compression turns the image into a large rectangle with a single color that could theoretically (!) be stored as a single byte (or 3 bytes, maybe). But this is simply not what JPEG compression is intended for.

    In https://stackoverflow.com/a/22016608/3182664 I posted a small utility application that lets you adjust the desired image size and the desired compression. For example, you may choose a file size of 10kB, and the program will compute the compression level that is necessary to achieve this file size. However, this will not allow you to achieve a higher compression than by manually setting a quality of 0.0.
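
The idea of computing the quality for a given file size can be sketched as a binary search over the quality value. This is not the code from the linked answer: the class JpegTargetSize and its methods are hypothetical names, and the search assumes that the encoded size never shrinks when the quality goes up, which usually holds but is not guaranteed.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper class for illustration only.
final class JpegTargetSize {

    /** Encodes the image as JPEG with the given quality and returns the bytes. */
    static byte[] encode(BufferedImage image, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(ios);
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /**
     * Returns a quality whose output fits into maxBytes, preferring higher
     * qualities, or -1 if even quality 0.0 produces a larger file.
     * Assumes the encoded size does not decrease as the quality increases.
     */
    static float findQuality(BufferedImage image, int maxBytes) throws IOException {
        if (encode(image, 0.0f).length > maxBytes) {
            return -1f; // the target is below what this writer can reach
        }
        if (encode(image, 1.0f).length <= maxBytes) {
            return 1.0f;
        }
        float lo = 0.0f; // invariant: quality 'lo' fits into maxBytes
        float hi = 1.0f; // invariant: quality 'hi' does not fit
        for (int i = 0; i < 10; i++) {
            float mid = (lo + hi) / 2;
            if (encode(image, mid).length <= maxBytes) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Synthetic test image (an assumption): two gradients plus a fine pattern. */
    static BufferedImage sampleImage(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                img.setRGB(x, y, ((x * 255 / width) << 16) | ((y * 255 / height) << 8) | ((x * y) & 0xFF));
            }
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage img = sampleImage(64, 64);
        System.out.println("quality for a 1500-byte limit: " + findQuality(img, 1500));
    }
}
```

A return value of -1 corresponds to the situation in the question: the requested size is smaller than what the writer produces at quality 0.0, so no quality setting can reach it.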


    EDIT: Concerning the comments and other answers referring to information theory: This is not really applicable here. An arbitrarily large and complex image could theoretically be compressed to only 3 bytes, when accepting that all information of the original image is lost. JPEG is a lossy compression, and some information is lost for every quality setting except the highest one. So it's not related to information theory, but simply to the question of how much loss of information one is willing to accept, and what is supported by the JPEG standard. (For example, each JPEG file may have to contain some information like the quantization table, which may always take a few bytes regardless of the actual image content.)
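
To get a feeling for this fixed cost, one can encode a 1x1 image at quality 0.0 and count how many bytes come before the actual scan data. A minimal sketch follows, with the hypothetical names JpegOverhead and headerBytes; the marker walk assumes a simple stream in which every segment before the scan carries a two-byte length and there are no fill bytes between segments.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper class for illustration only.
final class JpegOverhead {

    /** Encodes the image as JPEG with the given quality and returns the bytes. */
    static byte[] encode(BufferedImage image, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(ios);
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /**
     * Returns the number of bytes up to and including the start-of-scan (SOS)
     * header, i.e. everything that is stored before the compressed pixel data.
     */
    static int headerBytes(byte[] jpeg) {
        int pos = 2; // skip the start-of-image marker (0xFF 0xD8)
        while (pos + 4 <= jpeg.length) {
            if ((jpeg[pos] & 0xFF) != 0xFF) {
                break; // not a marker: stop rather than guess
            }
            int marker = jpeg[pos + 1] & 0xFF;
            // The segment length is big-endian and includes its own two bytes.
            int length = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            pos += 2 + length;
            if (marker == 0xDA) {
                break; // SOS: entropy-coded data follows
            }
        }
        return pos;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage tiny = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB);
        byte[] jpeg = encode(tiny, 0.0f);
        System.out.println("total size:         " + jpeg.length + " bytes");
        System.out.println("headers and tables: " + headerBytes(jpeg) + " bytes");
    }
}
```

Whatever the first number turns out to be on a given JRE, it is a practical lower bound for files written this way, because the same headers and tables are presumably written for larger images as well.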