java compression gzip gzipinputstream

Extract GZ file with Java


I'm trying to extract a CSV file from a GZ file.

So far, I've tried the following approaches:

Archiver archiver = ArchiverFactory.createArchiver(null, CompressionType.GZIP);
archiver.extract(archiveFile, destFile);

Or

GzipCompressorInputStream archive = new GzipCompressorInputStream(new BufferedInputStream(new FileInputStream(archiveFile)));
OutputStream out = new FileOutputStream(destFile);
IOUtils.copy(archive, out);
out.close();
archive.close();

Or

GZIPInputStream archive = new GZIPInputStream(new FileInputStream(archiveFile));
OutputStream out = new FileOutputStream(destFile);
IOUtils.copy(archive, out);
out.close();
archive.close();

I've also tried Snappy, a compression/decompression library on GitHub.

In every case, I get the following error:

java.io.IOException: Gzip-compressed data is corrupt

I've checked the GZ file's validity with the following console command, which reports that everything is fine.

gzip -v -t MyFileToUncompress.csv.gz
MyFileToUncompress.csv.gz: OK

The GZ files were compressed from the console, by Java itself, or on Windows. The result is the same in every case.

Am I doing something wrong, or is this an issue with my Java installation? (JDK 1.7 and 1.8 both produce the same exception.)


Solution

  • Here is the code I use for gunzip, though it probably won't produce a different outcome, since it is essentially the same as your third example:

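    // fin and fout are java.nio.file.Path values: the .gz input and the extracted output.
    // InputStream.transferTo requires Java 9 or later.
    try (final OutputStream out = Files.newOutputStream(fout);
         final InputStream in = new GZIPInputStream(Files.newInputStream(fin))) {
        in.transferTo(out);
    }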
    

    However, it is worth checking whether the result changes with the latest JDK, and also checking that gzip -d MyFileToUncompress.csv.gz produces the expected file.
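
    Since the question mentions JDK 1.7 and 1.8, where InputStream.transferTo is not available, here is a minimal sketch of the same idea with a manual copy loop. The class name GunzipSketch and the helper hasGzipMagic are illustrative names, not part of any library. The magic-byte check is only an assumed sanity test: it confirms that the file Java actually opens starts with the gzip header bytes 0x1f 0x8b, which may help rule out the wrong file being read or the file having been altered on the way.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Illustrative sketch; the class and method names are hypothetical.
final class GunzipSketch {

    /** Returns true if the file starts with the two gzip magic bytes 0x1f 0x8b. */
    static boolean hasGzipMagic(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int first = in.read();   // -1 if the file is empty
            int second = in.read();
            return first == 0x1f && second == 0x8b;
        }
    }

    /**
     * Decompresses source into target with a manual copy loop,
     * which works on Java 7 and 8. Returns the number of bytes written.
     */
    static long gunzip(Path source, Path target) throws IOException {
        long total = 0;
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source));
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: GunzipSketch <input.gz> <output>");
            return;
        }
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[1]);
        if (!hasGzipMagic(source)) {
            System.err.println(source + " does not start with the gzip header");
            return;
        }
        System.out.println("Wrote " + gunzip(source, target) + " bytes to " + target);
    }
}
```

    If hasGzipMagic returns false for a file that gzip -t accepts, the program is probably not reading the file you think it is.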