
How to decompress the contents of a var to another var?


I'm trying to decompress the contents of a variable, either in place or into another variable. I created a .gz file using gzip -k filename, which generated filename.gz while keeping the original, and uploaded that file to a server. The following line downloads the .gz file into a variable. (Yes, I could use package require http instead.)

set testvar [exec wget -q -O - url-to-gz-file]; string length $testvar

string length $testvar returns a number. Now I need to decompress (inflate) the contents of $testvar, either into the same variable or into another one such as testvar2, so that it holds the contents of the original file as they were before compression. The original file is just a text file.

I probably need to tell Tcl that testvar holds binary data.

Is this possible? I do not want to download the .gz file first to the hard drive and process it after. It all needs to be done "in memory".

Thanks.


Solution

  • The simplest way is to just pipe wget into gunzip:

    set result [exec wget -q -O - url-to-gz-file | gunzip -c]
    

    This requires the uncompressed contents to be encoded in your system's default character encoding; if they are binary data or text in some other encoding, you'll have problems. (Your testvar variable probably isn't holding valid gzip data either, because exec applies the same character encoding translation to the bytes it captures.)

    In those cases, Tcl (8.6 and later) comes with the zlib command for working with zlib and gzip compression. The zlib gunzip STRING subcommand, in particular, is useful here. You have to use open rather than exec to run wget, so that you get untranslated binary data that can be passed to zlib gunzip. It would look something like:

    # The b in the mode opens the channel in binary mode, so no fconfigure call is needed
    set wget [open "| wget -q -O - url-to-gz-file" rb]
    # Read every byte; -nonewline would drop a final 0x0a byte from the compressed data
    set binData [zlib gunzip [read $wget]]
    close $wget
    set result [encoding convertfrom utf-8 $binData] ;# Or whatever encoding the file uses
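
    To check the decompression step without touching the network, a minimal round-trip sketch like the one below may help. The sample text and the variable names (original, gz, magic, text) are made up for illustration, and UTF-8 is only an assumed encoding. It also shows one way to test whether a variable really holds gzip data: a gzip stream starts with the two bytes 1f 8b.

    ```tcl
    # Hypothetical sample standing in for the original text file's contents.
    set original "h\u00e9llo, w\u00f6rld\nsecond line\n"

    # zlib works on bytes, so encode the text before compressing it.
    set gz [zlib gzip [encoding convertto utf-8 $original]]

    # Read the first two bytes as four hex digits; gzip data starts with 1f8b.
    binary scan $gz H4 magic
    puts $magic ;# prints 1f8b

    # Decompress in memory, then decode the bytes back into text.
    set text [encoding convertfrom utf-8 [zlib gunzip $gz]]
    puts [expr {$text eq $original}] ;# prints 1
    ```

    If the same binary scan check on downloaded data prints something other than 1f8b, the data was probably altered on the way into the variable.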
    

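    Alternatively, you can use zlib push to stack a decompressing layer onto the wget channel, so that read returns the decompressed data directly. Here read -nonewline drops the final newline of the decompressed text, much as exec does with command output.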

    set wget [open "| wget -q -O - url-to-gz-file" rb]
    zlib push gunzip $wget
    set result [encoding convertfrom utf-8 [read -nonewline $wget]]
    close $wget
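
    Since the question mentions package require http, here is a rough sketch of the same idea without wget. The helper names gunzipText and fetchGzText are hypothetical, UTF-8 is an assumed default encoding, and the sketch covers plain HTTP only; HTTPS needs extra setup, such as registering the tls package with http::register. If the server marks the response with a Content-Encoding header, the http package may already hand back decompressed data, so treat this as a starting point rather than a finished solution.

    ```tcl
    package require http

    # Hypothetical helper: decompress gzip bytes and decode them as text.
    proc gunzipText {bytes {enc utf-8}} {
        return [encoding convertfrom $enc [zlib gunzip $bytes]]
    }

    # Hypothetical helper: fetch a .gz file over plain HTTP, return its text.
    proc fetchGzText {url {enc utf-8}} {
        # -binary 1 keeps the body as raw bytes, with no encoding translation.
        set tok [http::geturl $url -binary 1]
        try {
            if {[http::status $tok] ne "ok" || [http::ncode $tok] != 200} {
                error "download failed: [http::code $tok]"
            }
            return [gunzipText [http::data $tok] $enc]
        } finally {
            # Always release the token, even when an error is raised.
            http::cleanup $tok
        }
    }
    ```

    Usage would be along the lines of set result [fetchGzText url-to-gz-file], with everything staying in memory.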