I am working on a scraper in PHP that uses cURL to fetch pages. The script can be run both from the CLI and in a browser. This is my first time working with PHP on the CLI, and I am trying to make the screen output tidy, with a nice representation of data such as scrape statistics.
I am able to generate the output almost the way I want it. But with every cURL request the script makes, it also prints extra header information like this:
* About to connect() to imbd.com port 80 (#0)
* Trying 123.111.222.333... * connected
> GET /categories/something.html HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20130401 Firefox/21.0
Host: imdb.com
Accept: */*
< HTTP/1.1 200 OK
< Server: nginx/1.4.1
< Date: Wed, 25 Dec 2013 02:17:06 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: keep-alive
< Vary: Accept-Encoding
< X-Powered-By: PHP/5.3.17
< Set-Cookie: mobileType=0%something; expires=Wed, 01-Jan-2014 02:17:06 GMT; path=/; domain=.imdb.com
<
* Connection #0 to host imdb.com left intact
* Closing connection #0
...
Statistics
...
Function that uses cURL
public function getHTML($url)
{
    $user_agent = "Mozilla/5.0 (Windows NT 5.1; U; zh-cn; rv:1.9.1.6) ...";
    $options = Array(
        CURLOPT_RETURNTRANSFER => TRUE,
        CURLOPT_FOLLOWLOCATION => TRUE,
        CURLOPT_AUTOREFERER => TRUE,
        CURLOPT_CONNECTTIMEOUT => 120,
        CURLOPT_TIMEOUT => 120,
        CURLOPT_MAXREDIRS => 10,
        CURLOPT_USERAGENT => $user_agent,
        CURLOPT_URL => $url,
        CURLOPT_VERBOSE => true,
        CURLOPT_SSL_VERIFYPEER => false,
    );
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
Now all I want to do is hide this information from the CLI as it does in the browser.
Had it been command-line curl, I would use -s to shut it up. But I am unable to find a PHP alternative for this. Also, CURLOPT_MUTE is deprecated. All Google gave me was to set CURLOPT_RETURNTRANSFER to true, which I already have.
I would also like to know how I can avoid storing any cookies, to avoid tracking.
If it helps in any way I am using
Remove this line from the options array:
CURLOPT_VERBOSE => true,
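According to the PHP manual, CURLOPT_VERBOSE should be set to:
"TRUE to output verbose information. Writes output to STDERR, or the file specified using CURLOPT_STDERR."
On the CLI, STDERR is normally the terminal, which is why the trace shows up there and not in the browser.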
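To make the fix concrete, here is a minimal sketch of the same request with verbose output off by default. The helper name buildCurlOptions() and the $logStream parameter are hypothetical and not part of the original code. The user agent and CURLOPT_SSL_VERIFYPEER options are left out for brevity. If the trace is still wanted for debugging, the sketch sends it to a stream through CURLOPT_STDERR instead of the terminal.

```php
<?php
// Hypothetical helper (not from the original post): builds the option array.
// CURLOPT_VERBOSE is enabled only when a log stream is passed in.
function buildCurlOptions(string $url, $logStream = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER    => true,
        CURLOPT_CONNECTTIMEOUT => 120,
        CURLOPT_TIMEOUT        => 120,
        CURLOPT_MAXREDIRS      => 10,
        // No CURLOPT_COOKIEFILE / CURLOPT_COOKIEJAR here: as far as I know,
        // libcurl's cookie engine stays off unless one of them is set.
    ];

    if (is_resource($logStream)) {
        // Keep the trace, but write it to the given stream rather than STDERR.
        $options[CURLOPT_STDERR]  = $logStream;
        $options[CURLOPT_VERBOSE] = true;
    }

    return $options;
}

// Same shape as the original getHTML(), written as a plain function.
function getHTML(string $url, $logStream = null)
{
    $ch = curl_init();
    curl_setopt_array($ch, buildCurlOptions($url, $logStream));
    $data = curl_exec($ch);   // string on success, false on failure
    curl_close($ch);
    return $data;
}
```

Calling getHTML($url) alone should print nothing extra on the CLI. Passing a writable file handle, for example fopen('curl_debug.log', 'a') (a hypothetical file name), keeps the trace available in that file.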