I'm sending a HEAD request with cURL using the following code:
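function getContentType($u)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $u);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    // CURLOPT_NOBODY switches the request method to HEAD
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20100101 Firefox/7.0.12011-10-16 20:23:00");
    // split() was removed in PHP 7; explode() does the same job for a literal delimiter
    $results = explode("\n", trim(curl_exec($ch)));
    print_r($results); // debug output
    foreach ($results as $line) {
        // Header names are case-insensitive, so compare without regard to case
        if (strcasecmp(strtok($line, ':'), 'Content-Type') == 0) {
            // Limit to 2 parts so a value containing ':' is not truncated
            $parts = explode(":", $line, 2);
            return trim($parts[1]);
        }
    }
    return null; // no Content-Type header found
}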
For most websites it returns the correct value, but for some servers it returns a 404 error even though the page is actually available. I'm assuming this is because those servers have been configured to reject HEAD requests.
I'm looking for a way to bypass this rejection, or a way to tell that the HEAD request was rejected and the page is not in fact a 404.
Setting CURLOPT_NOBODY to "true" with curl_setopt sets the request
method to HEAD for HTTP(s) requests, and furthermore, cURL does not read
any content even if a Content-Length header is found in the headers.
However, setting CURLOPT_NOBODY back to "false" does *not* reset the
request method back to GET. But because it is now "false", cURL will
wait for content if the response contains a content-length header.
Because your code sets CURLOPT_NOBODY, it sends a HEAD request instead of GET, and my guess is that some servers are rejecting it for that reason.
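One way to work around this is to try HEAD first and repeat the request as a GET when the status looks suspicious. Below is a minimal sketch of that idea, not a definitive implementation. The function names (headLooksRejected, probe, getContentTypeWithFallback) and the list of "suspicious" status codes are my own assumptions. It also reads the content type through curl_getinfo() with CURLINFO_CONTENT_TYPE instead of parsing the raw headers, which is a different technique from the one in the question. For the GET, CURLOPT_HTTPGET explicitly resets the method, and a write callback that returns 0 aborts the transfer as soon as body data arrives, so the full page is not downloaded.

```php
<?php
// Status codes that may mean "HEAD not allowed" rather than "page missing".
// This list is an assumption; adjust it for the servers you deal with.
function headLooksRejected($status)
{
    return in_array($status, array(0, 403, 404, 405, 501), true);
}

// Sends one request and returns the final status code and content type.
function probe($url, $useHead)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    if ($useHead) {
        curl_setopt($ch, CURLOPT_NOBODY, true);   // HEAD request
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true);  // force the method to GET
        // Returning a byte count different from strlen($data) aborts the
        // transfer, so only the headers and the first chunk are fetched.
        curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $data) {
            return 0;
        });
    }
    curl_exec($ch); // may return false for the aborted GET; that is expected
    return array(
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'type'   => curl_getinfo($ch, CURLINFO_CONTENT_TYPE),
    );
}

// Tries HEAD, then falls back to GET if the HEAD result looks like a rejection.
function getContentTypeWithFallback($url)
{
    $info = probe($url, true);
    if (headLooksRejected($info['status'])) {
        $info = probe($url, false);
    }
    $ok = $info['status'] >= 200 && $info['status'] < 300;
    return $ok ? $info['type'] : null;
}
```

With this, getContentTypeWithFallback($url) returns null only when the GET also fails, so a 404 that survives the fallback is much more likely to be a real one. You may also want to set the same CURLOPT_USERAGENT as in your original code, since some servers treat requests without one differently.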