
PHP gzcompress encoding issue


I used PHP's gzcompress function to compress a Google page fetched with PHP curl. I stored it as an HTML file, but after gzuncompress some characters are not decoded correctly. The page was from the Lithuanian Google site.

Code:

// Encoding used in the curl request
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');

// Compressing
$data = gzcompress($res['page'], 9);

// Uncompressing
$page = gzuncompress($data);

Please let me know if I am missing anything.


Solution

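  • Given how little information we have, I assume the problem is in how the compressed data is stored: gzcompress returns binary data, which can contain newline bytes and the null character "\0", and these are easily altered or cut off when the data is written or read as text.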

    What you can do is the following:

    Compress the page and encode to base64:

    base64_encode(gzcompress($res['page'],9));
    

    To decompress, decode from base64:

    gzuncompress(base64_decode($data));
    

    This ensures that everything is written and read back as-is, at the cost of around 33% size overhead, since base64 encodes every 3 bytes as 4 characters.

    There are other solutions for this problem, but this is the easiest.
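
Putting the two steps together, a minimal round-trip sketch could look like the following. The helper names pack_page and unpack_page, the sample string, and the temporary file are assumptions made for illustration; they are not from the question. The sketch only shows that a page containing non-ASCII characters and newlines survives being compressed, base64-encoded, written to a file, and read back.

```php
<?php
// Hypothetical helpers (names are illustrative, not from the original post).

/** Compress a page and return it as a text-safe base64 string. */
function pack_page(string $html): string
{
    $compressed = gzcompress($html, 9);
    if ($compressed === false) {
        throw new RuntimeException('gzcompress failed');
    }
    return base64_encode($compressed);
}

/** Reverse pack_page(): decode from base64, then uncompress. */
function unpack_page(string $stored): string
{
    // Strict mode: return false if the input contains non-base64 characters.
    $compressed = base64_decode($stored, true);
    if ($compressed === false) {
        throw new RuntimeException('stored data is not valid base64');
    }
    $html = gzuncompress($compressed);
    if ($html === false) {
        throw new RuntimeException('gzuncompress failed');
    }
    return $html;
}

// Sample page content with Lithuanian characters and a newline (made up for the demo).
$page = "<p>Lietuvių kalba: ąčęėįšųūž</p>\n";

// Store the packed page in a temporary file, then read it back.
$file = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($file, pack_page($page));
$restored = unpack_page(file_get_contents($file));
unlink($file);

var_dump($restored === $page);
```

If the data goes into a database instead of a file, the same packed string can be stored in an ordinary text column, which is the main reason to accept the base64 overhead.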