I would like to preserve the timestamp of the file that is extracted from a gzip file in Java.
Here's the code:
import java.io.*;
import java.util.zip.GZIPInputStream;

public void gunzipFile(String zipFile, String newFile) {
    System.out.println("zipFile: " + zipFile);
    final int bufferSize = 1024;
    // try-with-resources closes every stream, even if an exception is thrown
    try (FileInputStream fis = new FileInputStream(zipFile);
         GZIPInputStream gis = new GZIPInputStream(new BufferedInputStream(fis));
         FileOutputStream fos = new FileOutputStream(newFile)) {
        final byte[] buffer = new byte[bufferSize];
        int len;
        // copy the inflated bytes until the end of the stream
        while ((len = gis.read(buffer)) != -1) {
            fos.write(buffer, 0, len);
        }
    } catch (IOException e) {
        System.out.println("exception caught: " + e.getMessage());
    }
}
This is a hacky solution because the GZIPInputStream class cannot give you the timestamp.
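byte[] header = new byte[10];
// readFully guarantees all 10 header bytes are read; a bare read() may return fewer
try (DataInputStream in = new DataInputStream(new FileInputStream(zipFile))) {
    in.readFully(header);
}
// the timestamp is stored least significant byte first at offsets 4 to 7
int timestamp = (header[4] & 0xFF)
        | (header[5] & 0xFF) << 8
        | (header[6] & 0xFF) << 16
        | (header[7] & 0xFF) << 24;
// or more simply, use
// int timestamp = ByteBuffer.wrap(header, 4, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
// mask to treat the value as unsigned before converting seconds to milliseconds
System.out.println(new Date((timestamp & 0xFFFFFFFFL) * 1000)); // this will give you the date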
The GZIP format starts with a 10-byte header holding some metadata. Bytes 5 through 8 (offsets 4 through 7) hold the Unix timestamp in seconds. If you convert those into an int
and multiply by 1000 to get milliseconds, you get the modification date of the file within, if one was recorded. A value of 0 means no timestamp is available.
Multi-byte numbers are stored with the least significant byte first (from RFC 1952):
        0        1
    +--------+--------+
    |00001000|00000010|
    +--------+--------+
     ^        ^
     |        |
     |        + more significant byte = 2 x 256
     + less significant byte = 8
In other words, the first byte holds the least significant 8 bits of the int. That's where LITTLE_ENDIAN comes in.
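To make the byte order concrete, here is a minimal sketch that decodes the two bytes from the RFC diagram (padded to four bytes) the same way the timestamp is decoded. The class and method names (LittleEndianDemo, readUInt32LE) are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianDemo {
    // Hypothetical helper: decodes 4 bytes starting at offset as an
    // unsigned 32-bit little-endian value (least significant byte first).
    static long readUInt32LE(byte[] bytes, int offset) {
        int value = ByteBuffer.wrap(bytes, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
        // Mask so values above Integer.MAX_VALUE do not turn negative.
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // The bytes from the RFC 1952 diagram, padded with two zero bytes.
        byte[] rfcExample = {0x08, 0x02, 0x00, 0x00};
        System.out.println(readUInt32LE(rfcExample, 0)); // 8 + 2 * 256 = 520
    }
}
```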
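I would recommend being careful with the InputStream here, because reading the header consumes bytes that GZIPInputStream also needs. Either wrap it in a BufferedInputStream, call mark() before reading the header and reset() afterwards to return to position 0, or just open a second InputStream. Use one to get the timestamp and the other to inflate the gzip content.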
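Putting it together, here is a rough sketch of the mark()/reset() variant that also applies the timestamp to the extracted file, which is what the question asks for. The class and method names (GunzipWithTimestamp, gunzipPreservingTimestamp) are my own, and using Files.setLastModifiedTime to set the timestamp is one possible choice, not the only one.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;
import java.util.concurrent.TimeUnit;
import java.util.zip.GZIPInputStream;

class GunzipWithTimestamp {
    private static final int HEADER_SIZE = 10;

    // Hypothetical helper: inflates source into target and returns the
    // header timestamp in seconds since the epoch (0 if none was recorded).
    static long gunzipPreservingTimestamp(Path source, Path target) throws IOException {
        long mtimeSeconds;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(source))) {
            // Remember position 0 so the header can be re-read by GZIPInputStream.
            in.mark(HEADER_SIZE);
            byte[] header = new byte[HEADER_SIZE];
            int off = 0;
            while (off < HEADER_SIZE) { // read() may return fewer bytes than requested
                int n = in.read(header, off, HEADER_SIZE - off);
                if (n < 0) {
                    throw new EOFException("Truncated gzip header");
                }
                off += n;
            }
            // Offsets 4 to 7 hold the timestamp, least significant byte first.
            mtimeSeconds = ByteBuffer.wrap(header, 4, 4)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .getInt() & 0xFFFFFFFFL;
            in.reset(); // back to position 0
            try (GZIPInputStream gis = new GZIPInputStream(in)) {
                Files.copy(gis, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        if (mtimeSeconds != 0) { // 0 means the header carries no timestamp
            Files.setLastModifiedTime(target, FileTime.from(mtimeSeconds, TimeUnit.SECONDS));
        }
        return mtimeSeconds;
    }
}
```

You would call it as gunzipPreservingTimestamp(Paths.get(zipFile), Paths.get(newFile)). Note that some tools write a timestamp of 0 (Java's own GZIPOutputStream does), in which case the extracted file simply keeps the time of extraction.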