Tags: java, compression, tar, lossless-compression, xz

How to get uniform compression while using xz compression in Java?


I am trying xz compression in Java, using the xz 1.5 library, Commons IO 2.4, and Commons Compress 1.8.1. The code below gives me very inconsistent results: savings of over 70% for text, but under 0.1% for audio and video files (computed as (1 - compressed/original) * 100). I make a tarball before compressing each time. Is this supposed to work only for text files?

package makexz;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.compress.compressors.CompressorOutputStream;
import org.apache.commons.compress.compressors.CompressorStreamFactory;
import org.apache.commons.compress.utils.IOUtils;

public class MakeXZ {

    public static void main(String[] args) throws Exception {
        String input = "C://TARDUMP//newvid.tar";
        String output = "C://TARDUMP//XZ//newvid.tar.xz";

        // try-with-resources closes all three streams, even if the copy fails
        try (InputStream in = new BufferedInputStream(new FileInputStream(input));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(output));
             CompressorOutputStream cos = new CompressorStreamFactory()
                     .createCompressorOutputStream(CompressorStreamFactory.XZ, out)) {
            // Stream the tarball through the XZ compressor into the output file
            IOUtils.copy(in, cos);
        }
    }

}

Solution

  • What you are seeing is entirely expected. Data can be compressed only if it has redundancy that the compressor can detect and exploit. Most audio and video formats are already compressed, so there is little or no redundancy left in them for xz to exploit, whereas text files contain plenty. Putting the files in a tarball first does not change this, since tar only bundles files and does not compress them.
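
To see the effect for yourself, here is a minimal sketch that compresses two in-memory buffers and reports the savings using the same formula as in the question. Note the assumptions: it uses the JDK's built-in Deflate (java.util.zip) rather than xz so that it runs without extra libraries, the pseudo-random bytes are only a stand-in for already-compressed audio or video, and the class and method names (RedundancyDemo, savingsPercent, and so on) are made up for illustration. The exact percentages will differ with xz, but the pattern should be the same.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.DeflaterOutputStream;

public class RedundancyDemo {

    // Compresses the bytes in memory with Deflate and returns the compressed size.
    public static int compressedSize(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(data);
        }
        return sink.size();
    }

    // Space savings in percent: (1 - compressed/original) * 100.
    public static double savingsPercent(byte[] data) throws IOException {
        return (1.0 - (double) compressedSize(data) / data.length) * 100.0;
    }

    // Highly redundant input: the same sentence repeated many times.
    public static byte[] repetitiveText(int repeats) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++) {
            sb.append("The quick brown fox jumps over the lazy dog.\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Pseudo-random bytes: a stand-in (assumption) for already-compressed media.
    public static byte[] randomBytes(int length) {
        byte[] data = new byte[length];
        new Random(42).nextBytes(data);
        return data;
    }

    public static void main(String[] args) throws IOException {
        System.out.printf("text:   %.1f%% saved%n", savingsPercent(repetitiveText(5000)));
        System.out.printf("random: %.1f%% saved%n", savingsPercent(randomBytes(200_000)));
    }
}
```

The repetitive text should shrink to a small fraction of its size, while the random buffer should show savings near zero (possibly slightly negative, because of container overhead). That is the same split you observed between text and media files.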