I'm attempting to use cURL to download an external image file. When used from the command line, cURL correctly states the response headers with content-type=image/png
. When I attempt to use cURL in PHP however, it returns content-type=text/html
.
When attempting to save the file using cURL in PHP, with the CURLOPT_BINARYTRANSFER
option set to 1, in conjunction with fopen/fwrite/
, the result is a corrupt file.
The only cURL flags I'm using in are -A
to send a user agent with the request, which I've also done in PHP by calling curl_setopt($ch, CURLOPT_USERAGENT, ...)
.
The only thing I can think of that would cause this is perhaps some background request headers sent by cURL which aren't accounted for using the standard PHP functions?
For reference;
CLI
curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -I http://find.icaew.com/data/imgs/736c476534ddf7b249d806d9aa7b9ee8.png
PHP
private function curl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1);
$response = array(
'html' => curl_exec($ch),
'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
'contentLength' => curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD),
'contentType' => curl_getinfo($ch, CURLINFO_CONTENT_TYPE)
);
curl_close($ch);
return $response;
}
public function parseImage() {
$imageSrc = pq('img.firm-logo')->attr('src');
if (!empty($imageSrc)) {
$newFile = '/Users/firstlast/Desktop/Hashery/test01/imgdump/' . $this->currentListingId . '.png';
$curl = $this->curl('http://find.icaew.com' . $imgSrc);
if ($curl['http_code'] == 200) {
if (file_exists($newFile)) unlink($newFile);
$fp = fopen($newFile,'x');
fwrite($fp, $curl['html']);
fclose($fp);
return $this->currentListingId;
} else {
return 0;
}
} else {
return 0;
}
}
When I mentioned content-type=text/html
The call to $this->curl()
results in the contentLength
and contentType
properties of the returned $response
variable having the values -1
and text/html
respectively.
I can imagine this is quite an obscure question, so I've attempted to provide as much context as to what is going on/what I'm trying to achieve. Any help in understanding why this is the case, and what I can do to resolve/achieve my goal would be greatly appreciated
If you know exactly what you are getting then get_file_contents()
is much simpler.
A URL can be used as a filename with this function
http://php.net/manual/en/function.file-get-contents.php
Also, it is helpful to go through the user comments on php.net as they have written many examples and potential issues or tricks to using the function.