How to get headers with PHP before loading entire content/page/file?


Background info:

  • I'm collecting URLs dynamically from various online sources.
  • I want to fetch a URL's content only if it is an HTML page or an image.
  • I do not want to load large files (such as a ZIP download or a PDF) only to find out that the target is of no interest to me.

Is there a way to check the response type/format with PHP before actually fetching the content (to avoid wasting my own and the target server's resources and bandwidth)?

(I found get_headers() in the PHP docs, but it is unclear to me whether the function fetches the entire content and then returns the headers, or gets only the headers from the server without downloading the content first. I also found solutions that get headers with cURL and fsockopen, but the question remains whether I can do it without loading the actual content.)


Solution

  • There is a PHP function for that, get_headers():

    $headers=get_headers("http://www.amazingjokes.com/img/2014/530c9613d29bd_CountvonCount.jpg");
    print_r($headers);
    

    This prints the following:

    Array
    (
        [0] => HTTP/1.1 200 OK
        [1] => Date: Tue, 11 Mar 2014 22:44:38 GMT
        [2] => Server: Apache
        [3] => Last-Modified: Tue, 25 Feb 2014 14:08:40 GMT
        [4] => ETag: "54e35e8-8873-4f33ba00673f4"
        [5] => Accept-Ranges: bytes
        [6] => Content-Length: 34931
        [7] => Connection: close
        [8] => Content-Type: image/jpeg
    )
    

    From here it is easy to pick out the Content-Type header (index 8 in this output; the position varies by server).

    More reading in the PHP manual (php.net).
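
On the question of whether the content gets loaded: get_headers() sends a GET request by default, so the server may start sending the body. Passing a stream context with the method set to HEAD asks for the headers only. What follows is a minimal sketch, not a definitive implementation. It assumes PHP 7.1 or later (for the context parameter), and the function names fetchHeadersOnly(), contentTypeOf() and isWanted(), as well as the 5-second timeout, are made up for this example.

```php
<?php
// Hypothetical helper: request only the headers of $url via HTTP HEAD.
// Returns an empty array if the request fails.
function fetchHeadersOnly(string $url): array
{
    // get_headers() uses GET by default; the context switches it to HEAD.
    $context = stream_context_create([
        'http' => [
            'method'  => 'HEAD',
            'timeout' => 5, // seconds; arbitrary value chosen for this sketch
        ],
    ]);

    // Second argument true: associative array keyed by header name.
    // Third argument (context) is available since PHP 7.1.
    $headers = @get_headers($url, true, $context);

    return $headers === false ? [] : $headers;
}

// Hypothetical helper: extract the bare MIME type, e.g. "image/jpeg".
function contentTypeOf(array $headers): ?string
{
    // Header names are case-insensitive, so normalise the keys first.
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['content-type'])) {
        return null;
    }

    $value = $headers['content-type'];
    // After redirects the same header can occur several times; keep the last.
    if (is_array($value)) {
        $value = end($value);
    }

    // Drop parameters such as "; charset=UTF-8".
    return strtolower(trim(explode(';', (string) $value)[0]));
}

// Hypothetical filter: true for HTML pages and images only.
function isWanted(array $headers): bool
{
    $type = contentTypeOf($headers);

    return $type !== null
        && ($type === 'text/html' || strpos($type, 'image/') === 0);
}
```

With these helpers, isWanted(fetchHeadersOnly($url)) can act as a gate before the actual download. Some servers reject HEAD requests or report an inaccurate Content-Type, so a false result is a hint rather than proof. The Content-Length header, where present, can be checked the same way to skip large files.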