go io metadata gzip

How to remove the gzip header metadata from a decompressed file


I have a gzip file that I am trying to decompress, saving the result as follows:

bytesReader := bytes.NewReader(gzipData)
gzipReader, err := gzip.NewReader(bytesReader)
if err == nil {
    // Defer Close only after the error check: gzipReader is nil on failure.
    defer gzipReader.Close()
    u1 := uuid.NewV4()
    filename := u1.String() + ".json"
    file, err := os.Create(filename)
    if err != nil {
        log.Println(err.Error())
    } else {
        defer file.Close()
        fileWriter := bufio.NewWriter(file)
        io.Copy(fileWriter, gzipReader)
        fileWriter.Flush()
    }
} else {
    log.Println(err.Error())
}

When I check the resulting json file, I see that it starts with some metadata as follows:

$ head -n 1 caf12e7b-e5e5-4453-ac0f-4d1d02770632.json
data.json000644 000765 000024 00001562330 12614372206 013272 0ustar00elsoufystaff000000 000000 {... json content ...}

I'm getting this header whether the original file was created with gzip data.json or tar -czf data.tar.gz data.json. How can I keep these first few bytes from being written to the output file?


Solution

  • You generated your compressed file as an archive. The difference between compressing a file and creating a compressed archive is that an archive is a file format that can hold more than one file, or a complex structure (such as a directory tree). The header in your output (file name, mode, owner, and the ustar marker) is a tar header, not gzip metadata: gzip.NewReader already consumes the gzip header, so what you decompressed was a tar archive.

    tar -cz <input files> creates an archive and compresses it with gzip, so a compressed tar archive can contain more than one file.

    To compress a file in a typical UNIX/Linux environment, use the gzip command:

    $ gzip foo.json
    

    This creates foo.json.gz, replacing foo.json by default. To read its contents, use zcat or gunzip:

    $ zcat foo.json.gz
    <contents of foo.json>
    
    $ gunzip foo.json.gz
    $ cat foo.json