php curl character-encoding fopen file-get-contents

Gibberish is returned when running fopen/file_get_contents with one server but not with another


I'm building a script to check whether sites are up or not by reading a given page's content and looking for a predefined string in it (if the site is down the string won't be found).

I'm reading the page content with file_get_contents. The problem is that in some rare cases the content received is simply gibberish. I tried the same with fopen and even with cURL, and got gibberish with all of them. At first I thought it was an encoding issue (the pages are UTF-8), and I tried playing with all the parameters, but that doesn't seem to be it.

Things got much weirder when I decided to test the code on another server. It worked perfectly! The same pages that return gibberish on my dev station return readable text when run on my other web server.

Both stations have the latest WAMP installed as a dev environment. Do you have any suggestions as to what could cause this?


Solution

  • As I said, it could be gzipped output. Use this function and pass that "gibberish" through it. If that's not the issue, let me know and I'll remove this answer.

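    $site = file_get_contents('http://example.com');
    echo gzdecoder($site);
    
    // Skips the gzip header (RFC 1952) and inflates the body.
    // Returns the input unchanged if it is not gzip data or cannot be inflated.
    function gzdecoder($d){
        // Gzip data always starts with the magic bytes 0x1f 0x8b.
        if(substr($d,0,2)!=="\x1f\x8b"){
            return $d;
        }
        $f=ord(substr($d,3,1)); // FLG byte
        $h=10;                  // length of the fixed header
        if($f&4){ // FEXTRA: 2-byte little-endian length, then the extra field
            $e=unpack('v',substr($d,10,2));
            $h+=2+$e[1];
        }
        if($f&8){ // FNAME: zero-terminated original file name
            $h=strpos($d,chr(0),$h)+1;
        }
        if($f&16){ // FCOMMENT: zero-terminated comment
            $h=strpos($d,chr(0),$h)+1;
        }
        if($f&2){ // FHCRC: 2-byte header CRC
            $h+=2;
        }
        $u=gzinflate(substr($d,$h));
        if($u===FALSE){
            $u=$d; // inflate failed, fall back to the raw input
        }
        return $u;
    }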
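
On PHP 5.4 and later, the zlib extension also provides a built-in gzdecode() that does the header parsing itself, so the hand-rolled function above may not be needed. A minimal sketch, assuming the zlib extension is loaded; maybe_gunzip is a hypothetical helper name used only for illustration:

```php
<?php
// Hypothetical helper: decode $body only if it looks like gzip data.
function maybe_gunzip($body)
{
    // Gzip streams start with the magic bytes 0x1f 0x8b (RFC 1952).
    if (!is_string($body) || strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body;
    }
    // gzdecode() returns false (and raises a warning) on corrupt input,
    // so suppress the warning and fall back to the raw body.
    $decoded = @gzdecode($body);
    return $decoded === false ? $body : $decoded;
}

$plain = '<html>site is up</html>';
echo maybe_gunzip(gzencode($plain)), "\n"; // compressed input is decoded
echo maybe_gunzip($plain), "\n";           // plain input passes through
```

Checking the first two bytes is also a quick way to confirm whether the "gibberish" really is gzip before looking for other causes.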
    

    EDIT:

    Not sure why; odd server settings, probably. I've run into this problem before with some sites.

    Putting SetEnv no-gzip dont-vary in a .htaccess file turns gzip off.
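
If the remote server's configuration is out of your hands, one more thing that might be worth trying is to ask for an uncompressed response from the client side. This is only a sketch: it assumes the server honours the Accept-Encoding request header, which the sites mentioned in the EDIT may not, and build_identity_context is a hypothetical helper name:

```php
<?php
// Hypothetical helper: stream context that asks for an uncompressed body.
function build_identity_context()
{
    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => "Accept-Encoding: identity\r\n",
            'timeout' => 10, // seconds
        ],
    ]);
}

$context = build_identity_context();
// Example call (performs a network request, so it is left commented out):
// $site = file_get_contents('http://example.com', false, $context);
```

If the body still starts with the gzip magic bytes after this, decoding it on the client is the more reliable route.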